path: root/tools
2020-10-08  selftests/seccomp: powerpc: Set syscall return during ptrace syscall exit  (Kees Cook)
Some archs (like powerpc) only support changing the return code during syscall exit when ptrace is used. Test the entry vs exit phases, covering which portions of the syscall number and return value need to be set at each phase. For non-powerpc, all changes are made during ptrace syscall entry, as before. For powerpc, the syscall number is changed at ptrace syscall entry and the syscall return value is changed on ptrace syscall exit. Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Suggested-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Link: https://lore.kernel.org/linux-kselftest/20200911181012.171027-1-cascardo@canonical.com/ Fixes: 58d0a862f573 ("seccomp: add tests for ptrace hole") Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/lkml/20200921075300.7iylzof2w5vrutah@wittgenstein/ Signed-off-by: Kees Cook <keescook@chromium.org>
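[Editor's note: a minimal userspace sketch of the regset access such a tracer-side test relies on. The x86-64 register name is illustrative; on powerpc the return value lives in regs.gpr[3] and, per the commit above, may only be changed at ptrace syscall exit.]

  #include <elf.h>
  #include <sys/ptrace.h>
  #include <sys/types.h>
  #include <sys/uio.h>
  #include <sys/user.h>

  /* Overwrite the tracee's syscall return value while it is stopped at a
   * ptrace syscall stop (entry or exit, depending on the architecture). */
  static void set_syscall_return(pid_t tracee, long ret)
  {
          struct user_regs_struct regs;
          struct iovec iov = { .iov_base = &regs, .iov_len = sizeof(regs) };

          ptrace(PTRACE_GETREGSET, tracee, NT_PRSTATUS, &iov);
          regs.rax = ret;                 /* x86-64: syscall return register */
          ptrace(PTRACE_SETREGSET, tracee, NT_PRSTATUS, &iov);
  }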
2020-10-08  selftests/seccomp: Allow syscall nr and ret value to be set separately  (Kees Cook)
In preparation for setting syscall nr and ret values separately, refactor the helpers to take a pointer to a value, so that a NULL can indicate "do not change this respective value". This is done to keep the regset read/write happening once and in one code path. Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/lkml/20200921075031.j4gruygeugkp2zwd@wittgenstein/ Signed-off-by: Kees Cook <keescook@chromium.org>
2020-10-08  selftests/seccomp: Record syscall during ptrace entry  (Kees Cook)
In preparation for performing actions during ptrace syscall exit, save the syscall number during ptrace syscall entry. Some architectures do not have the syscall number available during ptrace syscall exit. Suggested-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Link: https://lore.kernel.org/linux-kselftest/20200911181012.171027-1-cascardo@canonical.com/ Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Link: https://lore.kernel.org/lkml/20200921074354.6shkt2e5yhzhj3sn@wittgenstein/ Signed-off-by: Kees Cook <keescook@chromium.org>
2020-10-08  selftests/ftrace: Add test case for synthetic event dynamic strings  (Tom Zanussi)
Add a selftest that defines and traces a synthetic event that uses a dynamic string event field. Link: https://lkml.kernel.org/r/74445afb005046d76d59fb06696a2ceaa164dec9.1601848695.git.zanussi@kernel.org Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2020-10-08  ACPICA: Tree-wide: fix various typos and spelling mistakes  (Colin Ian King)
ACPICA commit 6648a6ac8410813bcfedb5c8345259dd155ea851 Fix spelling issues found using the codespell checker Link: https://github.com/acpica/acpica/commit/6648a6ac Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Erik Kaneda <erik.kaneda@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-10-07  selftests/bpf: Validate libbpf's auto-sizing of LD/ST/STX instructions  (Andrii Nakryiko)
Add selftests validating libbpf's auto-resizing of load/store instructions when used with CO-RE relocations. An explicit and manual approach using bpf_core_read() is also demonstrated and tested. A separate BPF program is expected to fail because it uses signed integers whose sizes differ from the kernel's. To reliably simulate 32-bit BTF (i.e., one with sizeof(long) == sizeof(void *) == 4), the selftest generates its own custom BTF and passes it as a replacement for the real kernel BTF. This allows testing the 32/64-bit mix on all architectures. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201008001025.292064-5-andrii@kernel.org
2020-10-07  libbpf: Allow specifying both ELF and raw BTF for CO-RE BTF override  (Andrii Nakryiko)
Use generalized BTF parsing logic, making it possible to parse BTF both from an ELF file and from a raw BTF dump. This makes it easier to write custom tests with manually generated BTFs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201008001025.292064-4-andrii@kernel.org
2020-10-07  libbpf: Support safe subset of load/store instruction resizing with CO-RE  (Andrii Nakryiko)
Add support for patching instructions of the following forms:

  - rX = *(T *)(rY + <off>);
  - *(T *)(rX + <off>) = rY;
  - *(T *)(rX + <off>) = <imm>,

where T is one of {u8, u16, u32, u64}. For such instructions, if the actual kernel field recorded in the CO-RE relocation has a different size than the one recorded locally (e.g., from vmlinux.h), then libbpf will adjust T to an appropriate 1-, 2-, 4-, or 8-byte load. In general, such a transformation is not always correct and could lead to an invalid final value being loaded or stored. But two classes of cases are always safe:

  - if both local and target (kernel) types are unsigned integers, but of different sizes, then it's OK to adjust the load/store instruction according to the necessary memory size. The zero-extending nature of such instructions and their unsignedness make sure that the final value is always correct;
  - a pointer size mismatch between the BPF target architecture (which is always 64-bit) and a 32-bit host kernel architecture can be similarly resolved automatically, because a pointer is essentially an unsigned integer. Loading a 32-bit pointer into a 64-bit BPF register with zero extension leaves a correct pointer in the register.

Both cases are necessary to support CO-RE on 32-bit kernels, as `unsigned long` in vmlinux.h generated from a 32-bit kernel is 32-bit, but when a BPF program is compiled for the BPF target the compiler treats it as a 64-bit integer. Similarly, pointers in vmlinux.h are 32-bit for the kernel, but treated as 64-bit values by the compiler for the BPF target. Both problems are now resolved by libbpf for direct memory reads. Similar transformations are also useful in general when kernel fields are "resized" from, e.g., unsigned int to unsigned long (or vice versa).

However, similar transformations for signed integers are not safe to perform, as they would result in incorrect sign extension of the value. If such a situation is detected, libbpf will emit a helpful message and will poison the instruction. Not failing immediately means that it's possible to guard the instruction based on kernel version (or other conditions) and make sure it's not reachable. If there is a need to read signed integers that change sizes between different kernels, the BPF_CORE_READ_BITFIELD() macro can be used; it works both with bitfields and non-bitfield integers of any signedness and handles sign extension properly. Also, bpf_core_read() with the proper size and/or use of the bpf_core_field_size() relocation allows dealing with such complicated situations explicitly, if not as conveniently as direct memory reads.

Selftests added in a separate patch in progs/test_core_autosize.c demonstrate both direct memory and probed use cases. BPF_CORE_READ() is not changed and won't deal with such situations as automatically as direct memory reads, due to signed integer limitations that are much harder to detect and control with compiler macro magic. So it's encouraged to utilize direct memory reads as much as possible. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201008001025.292064-3-andrii@kernel.org
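[Editor's note: a rough BPF-side sketch of the kind of direct read this enables; the struct flavor, field name, and tracepoint are purely illustrative and not taken from the patch.]

  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>
  #include <bpf/bpf_tracing.h>

  /* Local CO-RE "flavor" of a kernel struct; the field is declared with the
   * (possibly narrower) size seen in a 32-bit kernel's vmlinux.h. */
  struct task_struct___x {
          unsigned int nvcsw;             /* kernel may really use a wider type */
  } __attribute__((preserve_access_index));

  SEC("tp_btf/sched_switch")
  int BPF_PROG(on_switch, bool preempt, struct task_struct *prev,
               struct task_struct *next)
  {
          struct task_struct___x *t = (void *)next;

          /* Direct memory read: if the kernel field is unsigned but a
           * different size, libbpf rewrites the load width; a signed
           * mismatch would be poisoned instead. */
          __u64 switches = t->nvcsw;

          bpf_printk("nvcsw=%llu", switches);
          return 0;
  }

  char LICENSE[] SEC("license") = "GPL";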
2020-10-07  libbpf: Skip CO-RE relocations for not loaded BPF programs  (Andrii Nakryiko)
Bypass the CO-RE relocation step for BPF programs that are not going to be loaded. This allows BPF programs to be compiled in and disabled dynamically if the kernel is not expected to provide enough relocation information. In such a case, there won't be unnecessary warnings about failed relocations. Fixes: d929758101fc ("libbpf: Support disabling auto-loading BPF programs") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201008001025.292064-2-andrii@kernel.org
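[Editor's note: a small hedged sketch of the disable-at-runtime pattern this quiets down; the program name and the `obj`/`err` variables are assumed from a typical libbpf loader.]

  /* Userspace loader side: mark an optional program as not-to-be-loaded;
   * with this change libbpf also skips (and stops warning about) its
   * CO-RE relocations. */
  struct bpf_program *prog =
          bpf_object__find_program_by_name(obj, "optional_probe");

  if (prog)
          bpf_program__set_autoload(prog, false);

  err = bpf_object__load(obj);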
2020-10-07  tools/power/x86/intel-speed-select: Update version for v5.10  (Srinivas Pandruvada)
Update version for changes released with v5.10 kernel release. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2020-10-07  tools/power/x86/intel-speed-select: Fix missing base-freq core IDs  (Jonathan Doman)
The reported base-freq high-priority-cpu-list was potentially omitting some cpus, due to incorrectly using a logical core count to constrain the size of a physical punit core ID mask. We may need to read both high and low PBF CORE_MASK values regardless of the logical core count. Signed-off-by: Jonathan Doman <jonathan.doman@intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2020-10-07  libbpf: Fix compatibility problem in xsk_socket__create  (Magnus Karlsson)
Fix a compatibility problem when the old XDP_SHARED_UMEM mode is used together with the xsk_socket__create() call. In the old XDP_SHARED_UMEM mode, only sharing of the same device and queue id was allowed, and in this mode, the fill ring and completion ring were shared between the AF_XDP sockets. Therefore, it was perfectly fine to call the xsk_socket__create() API for each socket and not use the new xsk_socket__create_shared() API. This behavior was ruined by the commit introducing XDP_SHARED_UMEM support between different devices and/or queue ids. This patch restores the ability to use xsk_socket__create in these circumstances so that backward compatibility is not broken. Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/1602070946-11154-1-git-send-email-magnus.karlsson@gmail.com
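[Editor's note: a hedged sketch of the legacy flow this restores: two AF_XDP sockets bound to the same device and queue, each created with xsk_socket__create() and sharing the umem's single fill/completion ring pair. The interface name, queue id, and the pre-existing `umem` are placeholders.]

  #include <bpf/xsk.h>

  struct xsk_socket *xsk1, *xsk2;
  struct xsk_ring_cons rx1, rx2;
  struct xsk_ring_prod tx1, tx2;
  struct xsk_socket_config cfg = {
          .rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS,
          .tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS,
  };

  /* Same ifname/queue for both sockets: old-style XDP_SHARED_UMEM. */
  xsk_socket__create(&xsk1, "eth0", 0, umem, &rx1, &tx1, &cfg);
  xsk_socket__create(&xsk2, "eth0", 0, umem, &rx2, &tx2, &cfg);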
2020-10-07  bpf: Fix typo in uapi/linux/bpf.h  (Jakub Wilk)
Reported-by: Samanta Navarro <ferivoz@riseup.net> Signed-off-by: Jakub Wilk <jwilk@jwilk.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20201007055717.7319-1-jwilk@jwilk.net
2020-10-07  selftests/run_kselftest.sh: Make each test individually selectable  (Kees Cook)
Currently with run_kselftest.sh there is no way to choose which tests to run; all the tests listed in kselftest-list.txt are run every time. This patch enhances run_kselftest.sh to make the test collections (or individual tests) selectable, e.g.:

  $ ./run_kselftest.sh -c seccomp -t timers:posix_timers -t timers:nanosleep

It additionally adds a way to list all known tests with "-l", show usage with "-h", and perform a dry run without running tests with "-n". Co-developed-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-10-07  selftests: Extract run_kselftest.sh and generate stand-alone test list  (Kees Cook)
Instead of building a script on the fly (which just repeats the same thing for each test collection), move the script out of the Makefile and into run_kselftest.sh, which reads kselftest-list.txt. Adjust the emit_tests target to report each test on a separate line so that test running tools (e.g. LAVA) can easily remove individual tests (for example, as seen in [1]). [1] https://github.com/Linaro/test-definitions/pull/208/commits/2e7b62155e4998e54ac0587704932484d4ff84c8 Signed-off-by: Kees Cook <keescook@chromium.org> Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2020-10-07  Merge branch 'for-next/late-arrivals' into for-next/core  (Will Deacon)
Late patches for 5.10: MTE selftests, minor KCSAN preparation and removal of some unused prototypes. (Amit Daniel Kachhap and others)

* for-next/late-arrivals:
  arm64: random: Remove no longer needed prototypes
  arm64: initialize per-cpu offsets earlier
  kselftest/arm64: Check mte tagged user address in kernel
  kselftest/arm64: Verify KSM page merge for MTE pages
  kselftest/arm64: Verify all different mmap MTE options
  kselftest/arm64: Check forked child mte memory accessibility
  kselftest/arm64: Verify mte tag inclusion via prctl
  kselftest/arm64: Add utilities and a test to validate mte memory
2020-10-07  ida: Free allocated bitmap in error path  (Matthew Wilcox (Oracle))
If a bitmap needs to be allocated, and by the time the thread is scheduled to run again all the indices which would satisfy the allocation have already been allocated, then we would leak the allocation. Almost impossible to hit in practice, but a trivial fix. Found by Coverity. Fixes: f32f004cddf8 ("ida: Convert to XArray") Reported-by: coverity-bot <keescook+coverity-bot@chromium.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2020-10-07  radix tree test suite: Fix compilation  (Matthew Wilcox (Oracle))
Introducing local_lock broke compilation; fix it all up. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2020-10-07  perf stat: Fix out of bounds CPU map access when handling armv8_pmu events  (Namhyung Kim)
It was reported that 'perf stat' crashed when used with armv8_pmu (CPU) events in task mode. As 'perf stat' uses an empty cpu map for task mode but armv8_pmu has its own cpu mask, it got confused about which map it should use when accessing file descriptors, and this causes segfaults:

  (gdb) bt
  #0  0x0000000000603fc8 in perf_evsel__close_fd_cpu (evsel=<optimized out>, cpu=<optimized out>) at evsel.c:122
  #1  perf_evsel__close_cpu (evsel=evsel@entry=0x716e950, cpu=7) at evsel.c:156
  #2  0x00000000004d4718 in evlist__close (evlist=0x70a7cb0) at util/evlist.c:1242
  #3  0x0000000000453404 in __run_perf_stat (argc=3, argc@entry=1, argv=0x30, argv@entry=0xfffffaea2f90, run_idx=119, run_idx@entry=1701998435) at builtin-stat.c:929
  #4  0x0000000000455058 in run_perf_stat (run_idx=1701998435, argv=0xfffffaea2f90, argc=1) at builtin-stat.c:947
  #5  cmd_stat (argc=1, argv=0xfffffaea2f90) at builtin-stat.c:2357
  #6  0x00000000004bb888 in run_builtin (p=p@entry=0x9764b8 <commands+288>, argc=argc@entry=4, argv=argv@entry=0xfffffaea2f90) at perf.c:312
  #7  0x00000000004bbb54 in handle_internal_command (argc=argc@entry=4, argv=argv@entry=0xfffffaea2f90) at perf.c:364
  #8  0x0000000000435378 in run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:408
  #9  main (argc=4, argv=0xfffffaea2f90) at perf.c:538

To fix this, I simply used the given cpu map unless the evsel actually is not a system-wide event (like uncore events). Fixes: 7736627b865d ("perf stat: Use affinity for closing file descriptors") Reported-by: Wei Li <liwei391@huawei.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Barry Song <song.bao.hua@hisilicon.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20201007081311.1831003-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-10-06  selftests/bpf: Fix test_verifier after introducing resolve_pseudo_ldimm64  (Hao Luo)
Commit 4976b718c355 ("bpf: Introduce pseudo_btf_id") switched the order of check_subprogs() and resolve_pseudo_ldimm() in the verifier. Now an empty prog, or the prog of a single invalid ldimm, expects to see the error "last insn is not an exit or jmp" instead, because the check for subprogs comes first. It's now pointless to validate that half of ldimm64 won't be the last instruction.

Tested:
  # ./test_verifier
  Summary: 1129 PASSED, 537 SKIPPED, 0 FAILED
and the full set of bpf selftests.

Fixes: 4976b718c355 ("bpf: Introduce pseudo_btf_id") Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20201007022857.2791884-1-haoluo@google.com
2020-10-06  bpf, libbpf: Use valid btf in bpf_program__set_attach_target  (Luigi Rizzo)
bpf_program__set_attach_target(prog, fd, ...) will always fail when fd = 0 (attach to a kernel symbol) because obj->btf_vmlinux is NULL and there is no way to set it (at the moment btf_vmlinux is meant to be temporary storage for use in bpf_object__load_xattr()). Fix this by using libbpf_find_vmlinux_btf_id(). At some point we may want to opportunistically cache btf_vmlinux so it can be reused with multiple programs. Signed-off-by: Luigi Rizzo <lrizzo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Petar Penkov <ppenkov@google.com> Link: https://lore.kernel.org/bpf/20201005224528.389097-1-lrizzo@google.com
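[Editor's note: a small sketch of the call pattern being fixed; the program name, kernel function, and the `obj`/`err` variables are illustrative.]

  struct bpf_program *prog =
          bpf_object__find_program_by_name(obj, "handle_connect");

  /* attach_prog_fd == 0: attach to a kernel symbol rather than another
   * BPF program; previously this failed because btf_vmlinux was NULL. */
  err = bpf_program__set_attach_target(prog, 0, "tcp_v4_connect");
  if (!err)
          err = bpf_object__load(obj);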
2020-10-06  selftest/bpf: Test pinning map with reused map fd  (Hangbin Liu)
This adds a test to make sure that we can still pin maps with a reused map fd. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20201006021345.3817033-4-liuhangbin@gmail.com
2020-10-06  libbpf: Check if pin_path was set even map fd exist  (Hangbin Liu)
Say a user reuses a map fd after creating a map manually and sets the pin_path, then loads the object via libbpf. In libbpf bpf_object__create_maps(), bpf_object__reuse_map() will return 0 if there is no pinned map in map->pin_path. Then, after checking whether the map fd exists, we should also check whether pin_path was set and do bpf_map__pin() instead of continuing the loop. Fix it by creating the map if the fd does not exist and continuing to check pin_path after that. Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20201006021345.3817033-3-liuhangbin@gmail.com
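[Editor's note: roughly, the user flow being fixed looks like this sketch; the map name, pin path, and the pre-opened `obj` are illustrative.]

  /* Create the map manually and hand its fd to libbpf ... */
  int fd = bpf_create_map(BPF_MAP_TYPE_ARRAY, sizeof(__u32), sizeof(__u64),
                          1, 0);
  struct bpf_map *map = bpf_object__find_map_by_name(obj, "my_map");

  bpf_map__reuse_fd(map, fd);
  bpf_map__set_pin_path(map, "/sys/fs/bpf/my_map");

  /* With this fix, load also pins the reused map at the requested path. */
  bpf_object__load(obj);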
2020-10-06  libbpf: Close map fd if init map slots failed  (Hangbin Liu)
Previously we forgot to close the map fd if bpf_map_update_elem() failed during map slot init, which leaks the map fd. Let's move map slot initialization to a new function, init_map_slots(), to simplify the code, and close the map fd if slot init fails. Reported-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20201006021345.3817033-2-liuhangbin@gmail.com
2020-10-06  objtool: Allow nested externs to enable BUILD_BUG()  (Vasily Gorbik)
Currently the BUILD_BUG() macro is expanded to something like the following:

  do {
          extern void __compiletime_assert_0(void)
                  __attribute__((error("BUILD_BUG failed")));
          if (!(!(1)))
                  __compiletime_assert_0();
  } while (0);

If used in a function body this obviously produces build errors with -Wnested-externs and -Werror. Build objtool with -Wno-nested-externs to enable BUILD_BUG() usage. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
2020-10-06  selftests/powerpc: Add a rtas_filter selftest  (Andrew Donnellan)
Add a selftest to test the basic functionality of CONFIG_RTAS_FILTER. Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com> [mpe: Change rmo_start/end to 32-bit to avoid build errors on ppc64] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200820044512.7543-2-ajd@linux.ibm.com
2020-10-06  x86/copy_mc: Introduce copy_mc_enhanced_fast_string()  (Dan Williams)
The motivations to go rework memcpy_mcsafe() are that the benefit of doing slow and careful copies is obviated on newer CPUs, and that the current opt-in list of CPUs to instrument recovery is broken relative to those CPUs. There is no need to keep an opt-in list up to date on an ongoing basis if pmem/dax operations are instrumented for recovery by default. With recovery enabled by default the old "mcsafe_key" opt-in to careful copying can be made a "fragile" opt-out. Where the "fragile" list takes steps to not consume poison across cachelines. The discussion with Linus made clear that the current "_mcsafe" suffix was imprecise to a fault. The operations that are needed by pmem/dax are to copy from a source address that might throw #MC to a destination that may write-fault, if it is a user page. So copy_to_user_mcsafe() becomes copy_mc_to_user() to indicate the separate precautions taken on source and destination. copy_mc_to_kernel() is introduced as a non-SMAP version that does not expect write-faults on the destination, but is still prepared to abort with an error code upon taking #MC. The original copy_mc_fragile() implementation had negative performance implications since it did not use the fast-string instruction sequence to perform copies. For this reason copy_mc_to_kernel() fell back to plain memcpy() to preserve performance on platforms that did not indicate the capability to recover from machine check exceptions. However, that capability detection was not architectural and now that some platforms can recover from fast-string consumption of memory errors the memcpy() fallback now causes these more capable platforms to fail. Introduce copy_mc_enhanced_fast_string() as the fast default implementation of copy_mc_to_kernel() and finalize the transition of copy_mc_fragile() to be a platform quirk to indicate 'copy-carefully'. With this in place, copy_mc_to_kernel() is fast and recovery-ready by default regardless of hardware capability. Thanks to Vivek for identifying that copy_user_generic() is not suitable as the copy_mc_to_user() backend since the #MC handler explicitly checks ex_has_fault_handler(). Thanks to the 0day robot for catching a performance bug in the x86/copy_mc_to_user implementation. [ bp: Add the "why" for this change from the 0/2th message, massage. ] Fixes: 92b0729c34ca ("x86/mm, x86/mce: Add memcpy_mcsafe()") Reported-by: Erwin Tsaur <erwin.tsaur@intel.com> Reported-by: 0day robot <lkp@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Tested-by: Erwin Tsaur <erwin.tsaur@intel.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/160195562556.2163339.18063423034951948973.stgit@dwillia2-desk3.amr.corp.intel.com
2020-10-06  x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()  (Dan Williams)
In reaction to a proposal to introduce a memcpy_mcsafe_fast() implementation Linus points out that memcpy_mcsafe() is poorly named relative to communicating the scope of the interface. Specifically what addresses are valid to pass as source, destination, and what faults / exceptions are handled. Of particular concern is that even though x86 might be able to handle the semantics of copy_mc_to_user() with its common copy_user_generic() implementation other archs likely need / want an explicit path for this case: On Fri, May 1, 2020 at 11:28 AM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Thu, Apr 30, 2020 at 6:21 PM Dan Williams <dan.j.williams@intel.com> wrote: > > > > However now I see that copy_user_generic() works for the wrong reason. > > It works because the exception on the source address due to poison > > looks no different than a write fault on the user address to the > > caller, it's still just a short copy. So it makes copy_to_user() work > > for the wrong reason relative to the name. > > Right. > > And it won't work that way on other architectures. On x86, we have a > generic function that can take faults on either side, and we use it > for both cases (and for the "in_user" case too), but that's an > artifact of the architecture oddity. > > In fact, it's probably wrong even on x86 - because it can hide bugs - > but writing those things is painful enough that everybody prefers > having just one function. Replace a single top-level memcpy_mcsafe() with either copy_mc_to_user(), or copy_mc_to_kernel(). Introduce an x86 copy_mc_fragile() name as the rename for the low-level x86 implementation formerly named memcpy_mcsafe(). It is used as the slow / careful backend that is supplanted by a fast copy_mc_generic() in a follow-on patch. One side-effect of this reorganization is that separating copy_mc_64.S to its own file means that perf no longer needs to track dependencies for its memcpy_64.S benchmarks. [ bp: Massage a bit. ] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: <stable@vger.kernel.org> Link: http://lore.kernel.org/r/CAHk-=wjSqtXAqfUJxFtWNwmguFASTgB0dz1dT3V-78Quiezqbg@mail.gmail.com Link: https://lkml.kernel.org/r/160195561680.2163339.11574962055305783722.stgit@dwillia2-desk3.amr.corp.intel.com
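[Editor's note: a hedged illustration of the renamed interface; the bytes-not-copied return convention is carried over from memcpy_mcsafe() and is an assumption here, as is the helper name.]

  #include <linux/errno.h>
  #include <linux/uaccess.h>

  /* Kernel-side sketch: copy from pmem that may raise #MC into a kernel
   * buffer; any remainder means the copy hit poison. */
  static int read_pmem(void *dst, const void *pmem_src, unsigned int len)
  {
          unsigned long rem = copy_mc_to_kernel(dst, pmem_src, len);

          return rem ? -EIO : 0;
  }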
2020-10-05  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  (David S. Miller)
Rejecting non-native endian BTF overlapped with the addition of support for it. The rest were more simple overlapping changes, except the renesas ravb binding update, which had to follow a file move as well as a YAML conversion. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-05  lib/scatterlist: Add support in dynamic allocation of SG table from pages  (Maor Gottlieb)
Extend __sg_alloc_table_from_pages to support dynamic allocation of an SG table from pages. It should be used by drivers that can't supply all the pages at one time. This function returns the last populated SGE in the table. Users should pass it as an argument to the function from the second call onward. As before, nents will be equal to the number of populated SGEs (chunks). With this new extension, drivers can benefit from the optimization of merging contiguous pages without needing to allocate all pages in advance and hold them in a large buffer. E.g. with the Infiniband driver, which allocates a single page for holding the pages: for 1TB memory registration, the temporary buffer would consume only 4KB, instead of 2GB. Link: https://lore.kernel.org/r/20201004154340.1080481-2-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
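[Editor's note: a hedged fragment of the intended chunk-by-chunk usage; the exact parameter list is assumed from the description above, and the `chunk_pages`/`n_in_chunk`/`pages_left_after_chunk` names are placeholders; consult the patch for the authoritative signature.]

  struct sg_table sgt = {};
  struct scatterlist *last = NULL;

  /* Append each chunk of pages; pass the previously returned SGE back in
   * and tell the helper how many pages are still to come. */
  last = __sg_alloc_table_from_pages(&sgt, chunk_pages, n_in_chunk, 0,
                                     n_in_chunk * PAGE_SIZE, UINT_MAX,
                                     last, pages_left_after_chunk,
                                     GFP_KERNEL);
  if (IS_ERR(last))
          return PTR_ERR(last);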
2020-10-05  tools/testing/scatterlist: Show errors in human readable form  (Tvrtko Ursulin)
Instead of just asserting, dump some more useful info about what the test saw versus what it expected to see. Link: https://lore.kernel.org/r/20201004154340.1080481-4-leon@kernel.org Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-05  tools/testing/scatterlist: Rejuvenate bit-rotten test  (Tvrtko Ursulin)
A couple small tweaks are needed to make the test build and run on current kernels. Link: https://lore.kernel.org/r/20201004154340.1080481-3-leon@kernel.org Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-05  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  (Linus Torvalds)
Pull networking fixes from David Miller:

 1) Make sure SKB control block is in the proper state during IPSEC ESP-in-TCP encapsulation. From Sabrina Dubroca.
 2) Various kinds of attributes were not being cloned properly when we build new xfrm_state objects from existing ones. Fix from Antony Antony.
 3) Make sure to keep BTF sections, from Tony Ambardar.
 4) TX DMA channels need proper locking in lantiq driver, from Hauke Mehrtens.
 5) Honour route MTU during forwarding, always. From Maciej Żenczykowski.
 6) Fix races in kTLS which can result in crashes, from Rohit Maheshwari.
 7) Skip TCP DSACKs with ridiculous sequence ranges, from Priyaranjan Jha.
 8) Use correct address family in xfrm state lookups, from Herbert Xu.
 9) A bridge FDB flush should not clear out user managed fdb entries with the ext_learn flag set, from Nikolay Aleksandrov.
10) Fix nested locking of netdev address lists, from Taehee Yoo.
11) Fix handling of 32-bit DATA_FIN values in mptcp, from Mat Martineau.
12) Fix r8169 data corruptions on RTL8402 chips, from Heiner Kallweit.
13) Don't free command entries in mlx5 while comp handler could still be running, from Eran Ben Elisha.
14) Error flow of request_irq() in mlx5 is busted: due to an off-by-one we try to free an IRQ that was never allocated. From Maor Gottlieb.
15) Fix leak when dumping netlink policies, from Johannes Berg.
16) Sendpage cannot be performed when a page is a slab page, or the page count is < 1. Some subsystems such as nvme were doing so. Create a "sendpage_ok()" helper and use it as needed, from Coly Li.
17) Don't leak request socket when using syncookies with mptcp, from Paolo Abeni.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (111 commits)
  net/core: check length before updating Ethertype in skb_mpls_{push,pop}
  net: mvneta: fix double free of txq->buf
  net_sched: check error pointer in tcf_dump_walker()
  net: team: fix memory leak in __team_options_register
  net: typhoon: Fix a typo Typoon --> Typhoon
  net: hinic: fix DEVLINK build errors
  net: stmmac: Modify configuration method of EEE timers
  tcp: fix syn cookied MPTCP request socket leak
  libceph: use sendpage_ok() in ceph_tcp_sendpage()
  scsi: libiscsi: use sendpage_ok() in iscsi_tcp_segment_map()
  drbd: code cleanup by using sendpage_ok() to check page for kernel_sendpage()
  tcp: use sendpage_ok() to detect misused .sendpage
  nvme-tcp: check page by sendpage_ok() before calling kernel_sendpage()
  net: add WARN_ONCE in kernel_sendpage() for improper zero-copy send
  net: introduce helper sendpage_ok() in include/linux/net.h
  net: usb: pegasus: Proper error handing when setting pegasus' MAC address
  net: core: document two new elements of struct net_device
  netlink: fix policy dump leak
  net/mlx5e: Fix race condition on nhe->n pointer in neigh update
  net/mlx5e: Fix VLAN create flow
  ...
2020-10-05  kselftest/arm64: Check mte tagged user address in kernel  (Amit Daniel Kachhap)
Add a testcase to check that a user address with a valid/invalid mte tag works in kernel mode. This test verifies that the kernel APIs __arch_copy_from_user/__arch_copy_to_user work by considering whether the user pointer has valid/invalid allocation tags. In MTE sync mode, file memory read/write and other similar interfaces fail if user memory with an invalid tag is accessed in the kernel. In async mode no such failure occurs. Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201002115630.24683-7-amit.kachhap@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-10-05  kselftest/arm64: Verify KSM page merge for MTE pages  (Amit Daniel Kachhap)
Add a testcase to check that KSM should not merge pages containing the same data with same/different MTE tag values. This testcase has one positive test and passes if page merging happens according to the above rule. It also saves and restores any modified ksm sysfs entries. Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201002115630.24683-6-amit.kachhap@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-10-05  kselftest/arm64: Verify all different mmap MTE options  (Amit Daniel Kachhap)
This testcase checks the different unsupported/supported options for mmap when used with the PROT_MTE memory protection flag. The checks are:

  * Enabling pstate.tco or using the prctl PR_MTE_TCF_NONE option should not cause any tag mismatch faults.
  * Different combinations of anonymous/file memory mmap, mprotect, sync/async error mode and private/shared mappings should work.
  * mprotect should not be able to clear the PROT_MTE page property.

Co-developed-by: Gabor Kertesz <gabor.kertesz@arm.com> Signed-off-by: Gabor Kertesz <gabor.kertesz@arm.com> Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201002115630.24683-5-amit.kachhap@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-10-05  kselftest/arm64: Check forked child mte memory accessibility  (Amit Daniel Kachhap)
This test covers the mte memory behaviour of the forked process with different mapping properties and flags. It checks that all bytes of forked child memory are accessible with the same tag as that of the parent, and that memory accessed outside the tag range causes a fault to occur. Co-developed-by: Gabor Kertesz <gabor.kertesz@arm.com> Signed-off-by: Gabor Kertesz <gabor.kertesz@arm.com> Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201002115630.24683-4-amit.kachhap@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-10-05  kselftest/arm64: Verify mte tag inclusion via prctl  (Amit Daniel Kachhap)
This testcase verifies that the tag generated with the "irg" instruction contains only included tags. This is done via a prctl call. The test covers 4 scenarios:

  * At least one included tag.
  * More than one included tag.
  * All included.
  * None included.

Co-developed-by: Gabor Kertesz <gabor.kertesz@arm.com> Signed-off-by: Gabor Kertesz <gabor.kertesz@arm.com> Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201002115630.24683-3-amit.kachhap@arm.com Signed-off-by: Will Deacon <will@kernel.org>
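[Editor's note: for orientation, a hedged userspace sketch of the prctl in question; the constants come from linux/prctl.h and the chosen tag mask is arbitrary.]

  #include <sys/prctl.h>
  #include <linux/prctl.h>

  /* Enable tagged addressing, request synchronous tag-check faults and
   * include only tags 1 and 2 in "irg" tag generation. */
  unsigned long incl_mask = (1UL << 1) | (1UL << 2);

  prctl(PR_SET_TAGGED_ADDR_CTRL,
        PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC |
        (incl_mask << PR_MTE_TAG_SHIFT),
        0, 0, 0);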
2020-10-05  kselftest/arm64: Add utilities and a test to validate mte memory  (Amit Daniel Kachhap)
This test checks that the memory tag is present after mte allocation and that the memory is accessible with those tags. This testcase verifies all sync, async and none mte error reporting modes. The allocated mte buffers are verified for the Allocated range (no error expected while accessing the buffer), the Underflow range, and the Overflow range. The different test scenarios covered here are:

  * Verify that mte memory is accessible at byte/block level.
  * Force underflow and overflow to occur and check the data consistency.
  * Check to/from copies between tagged and untagged memory.
  * Check that initially allocated memory has a 0 tag.

This change also creates the necessary infrastructure to add mte test cases. MTE kselftests can use the several utility functions provided here to add a wide variety of mte test scenarios. The GCC compiler needs the '-march=armv8.5-a+memtag' flag, so that flag is verified before compilation. The mte testcases can be launched with the kselftest framework as:

  make TARGETS=arm64 ARM64_SUBTARGETS=mte kselftest

or compiled as:

  make -C tools/testing/selftests TARGETS=arm64 ARM64_SUBTARGETS=mte CC='compiler'

Co-developed-by: Gabor Kertesz <gabor.kertesz@arm.com> Signed-off-by: Gabor Kertesz <gabor.kertesz@arm.com> Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201002115630.24683-2-amit.kachhap@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-10-05  test_firmware: Test partial read support  (Scott Branden)
Add additional hooks to test_firmware to pass in support for partial file reads using request_firmware_into_buf():

  buf_size:    size of buffer to request firmware into
  partial:     indicates that a partial file request is being made
  file_offset: offset into the file to request

Also update the firmware selftests to use the new partial read test API. Signed-off-by: Scott Branden <scott.branden@broadcom.com> Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20201002173828.2099543-17-keescook@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
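[Editor's note: for context, a hedged fragment of the driver-facing calls these hooks exercise; the device, firmware name, buffer variables and the partial-read helper name are assumptions based on this series, not quoted from it.]

  #include <linux/firmware.h>

  const struct firmware *fw;

  /* Whole-file read into a caller-supplied buffer. */
  ret = request_firmware_into_buf(&fw, "example.bin", dev, buf, buf_size);

  /* Partial read: only 'buf_size' bytes starting at 'file_offset'. */
  ret = request_partial_firmware_into_buf(&fw, "example.bin", dev, buf,
                                          buf_size, file_offset);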
2020-10-05  Merge 5.9-rc8 into staging-next  (Greg Kroah-Hartman)
We need the IIO fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-03  mm: remove compat_process_vm_{readv,writev}  (Christoph Hellwig)
Now that import_iovec handles compat iovecs, the native syscalls can be used for the compat case as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-10-03  fs: remove compat_sys_vmsplice  (Christoph Hellwig)
Now that import_iovec handles compat iovecs, the native vmsplice syscall can be used for the compat case as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-10-03  fs: remove the compat readv/writev syscalls  (Christoph Hellwig)
Now that import_iovec handles compat iovecs, the native readv and writev syscalls can be used for the compat case as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-10-02  selftests: ocelot: add some example VCAP IS1, IS2 and ES0 tc offloads  (Vladimir Oltean)
Provide an example script which can be used as a skeleton for offloading TCAM rules in the Ocelot switches. Not all actions are demoed, mostly because of the difficulty of automating this from a single board. For example, policing: we can set up an iperf3 UDP server and client and measure throughput at the destination. But at least with DSA setups, network namespacing the individual ports is not possible because all switch ports are handled by the same DSA master. And we cannot assume that the target platform (an embedded board) has 2 other non-switch generator ports, so we need to work with the generator ports as switch ports (this is the reason why mausezahn is used, and not IP traffic like ping). When somebody has an idea how to test policing, it can be added to this test. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-02  bpf, sockmap: Update selftests to use skb_adjust_room  (John Fastabend)
Instead of working around TLS headers in sockmap selftests use the new skb_adjust_room helper. This allows us to avoid special casing the receive side to skip headers. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/160160100932.7052.3646935243867660528.stgit@john-Precision-5820-Tower
2020-10-02  bpf/selftests: Test for bpf_per_cpu_ptr() and bpf_this_cpu_ptr()  (Hao Luo)
Test bpf_per_cpu_ptr() and bpf_this_cpu_ptr(). Test two paths in the kernel. If the base pointer points to a struct, the returned reg is of type PTR_TO_BTF_ID. Direct pointer dereference can be applied on the returned variable. If the base pointer isn't a struct, the returned reg is of type PTR_TO_MEM, which also supports direct pointer dereference. Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200929235049.2533242-7-haoluo@google.com
2020-10-02  bpf: Introducte bpf_this_cpu_ptr()  (Hao Luo)
Add bpf_this_cpu_ptr() to help access percpu var on this cpu. This helper always returns a valid pointer, therefore no need to check returned value for NULL. Also note that all programs run with preemption disabled, which means that the returned pointer is stable during all the execution of the program. Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200929235049.2533242-6-haoluo@google.com
2020-10-02  bpf: Introduce bpf_per_cpu_ptr()  (Hao Luo)
Add bpf_per_cpu_ptr() to help bpf programs access percpu vars. bpf_per_cpu_ptr() has the same semantic as per_cpu_ptr() in the kernel except that it may return NULL. This happens when the cpu parameter is out of range. So the caller must check the returned value. Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200929235049.2533242-5-haoluo@google.com
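[Editor's note: a hedged BPF-side sketch showing both helpers against a typed ksym; the percpu symbol and section are illustrative choices, not quoted from the patch.]

  #include "vmlinux.h"
  #include <bpf/bpf_helpers.h>

  extern const struct rq runqueues __ksym;      /* per-CPU kernel variable */

  SEC("raw_tp/sys_enter")
  int probe(void *ctx)
  {
          /* May return NULL if the cpu argument is out of range. */
          struct rq *rq0 = bpf_per_cpu_ptr(&runqueues, 0);
          if (!rq0)
                  return 0;

          /* Never NULL: always the copy belonging to the current CPU. */
          struct rq *this_rq = bpf_this_cpu_ptr(&runqueues);

          bpf_printk("cpu0=%d this=%d", rq0->cpu, this_rq->cpu);
          return 0;
  }

  char LICENSE[] SEC("license") = "GPL";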
2020-10-02  selftests/bpf: Ksyms_btf to test typed ksyms  (Hao Luo)
Selftests for typed ksyms. Tests two types of ksyms: one is a struct, the other is a plain int. This tests two paths in the kernel. Struct ksyms will be converted into PTR_TO_BTF_ID by the verifier while int typed ksyms will be converted into PTR_TO_MEM. Signed-off-by: Hao Luo <haoluo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200929235049.2533242-4-haoluo@google.com