summaryrefslogtreecommitdiff
path: root/include/linux/uaccess.h
AgeCommit message (Collapse)Author
2025-12-06Merge tag 'mm-nonmm-stable-2025-12-06-11-14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - "panic: sys_info: Refactor and fix a potential issue" (Andy Shevchenko) fixes a build issue and does some cleanup in ib/sys_info.c - "Implement mul_u64_u64_div_u64_roundup()" (David Laight) enhances the 64-bit math code on behalf of a PWM driver and beefs up the test module for these library functions - "scripts/gdb/symbols: make BPF debug info available to GDB" (Ilya Leoshkevich) makes BPF symbol names, sizes, and line numbers available to the GDB debugger - "Enable hung_task and lockup cases to dump system info on demand" (Feng Tang) adds a sysctl which can be used to cause additional info dumping when the hung-task and lockup detectors fire - "lib/base64: add generic encoder/decoder, migrate users" (Kuan-Wei Chiu) adds a general base64 encoder/decoder to lib/ and migrates several users away from their private implementations - "rbree: inline rb_first() and rb_last()" (Eric Dumazet) makes TCP a little faster - "liveupdate: Rework KHO for in-kernel users" (Pasha Tatashin) reworks the KEXEC Handover interfaces in preparation for Live Update Orchestrator (LUO), and possibly for other future clients - "kho: simplify state machine and enable dynamic updates" (Pasha Tatashin) increases the flexibility of KEXEC Handover. Also preparation for LUO - "Live Update Orchestrator" (Pasha Tatashin) is a major new feature targeted at cloud environments. Quoting the cover letter: This series introduces the Live Update Orchestrator, a kernel subsystem designed to facilitate live kernel updates using a kexec-based reboot. This capability is critical for cloud environments, allowing hypervisors to be updated with minimal downtime for running virtual machines. LUO achieves this by preserving the state of selected resources, such as memory, devices and their dependencies, across the kernel transition. As a key feature, this series includes support for preserving memfd file descriptors, which allows critical in-memory data, such as guest RAM or any other large memory region, to be maintained in RAM across the kexec reboot. Mike Rappaport merits a mention here, for his extensive review and testing work. - "kexec: reorganize kexec and kdump sysfs" (Sourabh Jain) moves the kexec and kdump sysfs entries from /sys/kernel/ to /sys/kernel/kexec/ and adds back-compatibility symlinks which can hopefully be removed one day - "kho: fixes for vmalloc restoration" (Mike Rapoport) fixes a BUG which was being hit during KHO restoration of vmalloc() regions * tag 'mm-nonmm-stable-2025-12-06-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (139 commits) calibrate: update header inclusion Reinstate "resource: avoid unnecessary lookups in find_next_iomem_res()" vmcoreinfo: track and log recoverable hardware errors kho: fix restoring of contiguous ranges of order-0 pages kho: kho_restore_vmalloc: fix initialization of pages array MAINTAINERS: TPM DEVICE DRIVER: update the W-tag init: replace simple_strtoul with kstrtoul to improve lpj_setup KHO: fix boot failure due to kmemleak access to non-PRESENT pages Documentation/ABI: new kexec and kdump sysfs interface Documentation/ABI: mark old kexec sysfs deprecated kexec: move sysfs entries to /sys/kernel/kexec test_kho: always print restore status kho: free chunks using free_page() instead of kfree() selftests/liveupdate: add kexec test for multiple and empty sessions selftests/liveupdate: add simple kexec-based selftest for LUO selftests/liveupdate: add userspace API selftests docs: add documentation for memfd preservation via LUO mm: memfd_luo: allow preserving memfd liveupdate: luo_file: add private argument to store runtime state mm: shmem: export some functions to internal.h ...
2025-11-20uaccess: gate _copy_[to|from]_user on !INLINE_COPY_FROM_USERAlice Ryhl
These methods only exist when INLINE_COPY_FROM_USER is disabled, so update the header file to reflect that. This fixes the following error on builds that enable both RUST and INLINE_COPY_FROM_USER. ERROR: modpost: "_copy_from_user" [samples/rust/rust_misc_device.ko] undefined! ERROR: modpost: "_copy_to_user" [samples/rust/rust_misc_device.ko] undefined! This error is triggered because when a method is available both as a rust_helper_* and normal method, Rust will call the normal method. [akpm@linux-foundation.org: s/INLINE_COPY_FROM_USER/INLINE_COPY_TO_USER/, per Alice] Link: https://lkml.kernel.org/r/20251118173250.2821388-1-aliceryhl@google.com Fixes: d99dc586ca7c ("uaccess: decouple INLINE_COPY_FROM_USER and CONFIG_RUST") Signed-off-by: Alice Ryhl <aliceryhl@google.com> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Gary Guo <gary@garyguo.net> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Trevor Gross <tmgross@umich.edu> Cc: Yury Norov (NVIDIA) <yury.norov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-12uaccess: decouple INLINE_COPY_FROM_USER and CONFIG_RUSTYury Norov (NVIDIA)
Commit 1f9a8286bc0c ("uaccess: always export _copy_[from|to]_user with CONFIG_RUST") exports _copy_{from,to}_user() unconditionally, if RUST is enabled. This pollutes exported symbols namespace, and spreads RUST ifdefery in core files. It's better to declare a corresponding helper under the rust/helpers, similarly to how non-underscored copy_{from,to}_user() is handled. [yury.norov@gmail.com: drop rust part of comment for _copy_from_user(), per Alice] Link: https://lkml.kernel.org/r/20251024154754.99768-1-yury.norov@gmail.com Link: https://lkml.kernel.org/r/20251023171607.1171534-1-yury.norov@gmail.com Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Miguel Ojeda <ojeda@kernel.org> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Tested-by: Alice Ryhl <aliceryhl@google.com> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Gary Guo <gary@garyguo.net> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Trevor Gross <tmgross@umich.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-04uaccess: Provide put/get_user_inline()Thomas Gleixner
Provide convenience wrappers around scoped user access similar to put/get_user(), which reduce the usage sites to: if (!get_user_inline(val, ptr)) return -EFAULT; Should only be used if there is a demonstrable performance benefit. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/20251027083745.609031602@linutronix.de
2025-11-04uaccess: Provide scoped user access regionsThomas Gleixner
User space access regions are tedious and require similar code patterns all over the place: if (!user_read_access_begin(from, sizeof(*from))) return -EFAULT; unsafe_get_user(val, from, Efault); user_read_access_end(); return 0; Efault: user_read_access_end(); return -EFAULT; This got worse with the recent addition of masked user access, which optimizes the speculation prevention: if (can_do_masked_user_access()) from = masked_user_read_access_begin((from)); else if (!user_read_access_begin(from, sizeof(*from))) return -EFAULT; unsafe_get_user(val, from, Efault); user_read_access_end(); return 0; Efault: user_read_access_end(); return -EFAULT; There have been issues with using the wrong user_*_access_end() variant in the error path and other typical Copy&Pasta problems, e.g. using the wrong fault label in the user accessor which ends up using the wrong accesss end variant. These patterns beg for scopes with automatic cleanup. The resulting outcome is: scoped_user_read_access(from, Efault) unsafe_get_user(val, from, Efault); return 0; Efault: return -EFAULT; The scope guarantees the proper cleanup for the access mode is invoked both in the success and the failure (fault) path. The scoped_user_$MODE_access() macros are implemented as self terminating nested for() loops. Thanks to Andrew Cooper for pointing me at them. The scope can therefore be left with 'break', 'goto' and 'return'. Even 'continue' "works" due to the self termination mechanism. Both GCC and clang optimize all the convoluted macro maze out and the above results with clang in: b80: f3 0f 1e fa endbr64 b84: 48 b8 ef cd ab 89 67 45 23 01 movabs $0x123456789abcdef,%rax b8e: 48 39 c7 cmp %rax,%rdi b91: 48 0f 47 f8 cmova %rax,%rdi b95: 90 nop b96: 90 nop b97: 90 nop b98: 31 c9 xor %ecx,%ecx b9a: 8b 07 mov (%rdi),%eax b9c: 89 06 mov %eax,(%rsi) b9e: 85 c9 test %ecx,%ecx ba0: 0f 94 c0 sete %al ba3: 90 nop ba4: 90 nop ba5: 90 nop ba6: c3 ret Which looks as compact as it gets. The NOPs are placeholder for STAC/CLAC. GCC emits the fault path seperately: bf0: f3 0f 1e fa endbr64 bf4: 48 b8 ef cd ab 89 67 45 23 01 movabs $0x123456789abcdef,%rax bfe: 48 39 c7 cmp %rax,%rdi c01: 48 0f 47 f8 cmova %rax,%rdi c05: 90 nop c06: 90 nop c07: 90 nop c08: 31 d2 xor %edx,%edx c0a: 8b 07 mov (%rdi),%eax c0c: 89 06 mov %eax,(%rsi) c0e: 85 d2 test %edx,%edx c10: 75 09 jne c1b <afoo+0x2b> c12: 90 nop c13: 90 nop c14: 90 nop c15: b8 01 00 00 00 mov $0x1,%eax c1a: c3 ret c1b: 90 nop c1c: 90 nop c1d: 90 nop c1e: 31 c0 xor %eax,%eax c20: c3 ret The fault labels for the scoped*() macros and the fault labels for the actual user space accessors can be shared and must be placed outside of the scope. If masked user access is enabled on an architecture, then the pointer handed in to scoped_user_$MODE_access() can be modified to point to a guaranteed faulting user address. This modification is only scope local as the pointer is aliased inside the scope. When the scope is left the alias is not longer in effect. IOW the original pointer value is preserved so it can be used e.g. for fixup or diagnostic purposes in the fault path. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/20251027083745.546420421@linutronix.de
2025-11-03uaccess: Provide ASM GOTO safe wrappers for unsafe_*_user()Thomas Gleixner
ASM GOTO is miscompiled by GCC when it is used inside a auto cleanup scope: bool foo(u32 __user *p, u32 val) { scoped_guard(pagefault) unsafe_put_user(val, p, efault); return true; efault: return false; } e80: e8 00 00 00 00 call e85 <foo+0x5> e85: 65 48 8b 05 00 00 00 00 mov %gs:0x0(%rip),%rax e8d: 83 80 04 14 00 00 01 addl $0x1,0x1404(%rax) // pf_disable++ e94: 89 37 mov %esi,(%rdi) e96: 83 a8 04 14 00 00 01 subl $0x1,0x1404(%rax) // pf_disable-- e9d: b8 01 00 00 00 mov $0x1,%eax // success ea2: e9 00 00 00 00 jmp ea7 <foo+0x27> // ret ea7: 31 c0 xor %eax,%eax // fail ea9: e9 00 00 00 00 jmp eae <foo+0x2e> // ret which is broken as it leaks the pagefault disable counter on failure. Clang at least fails the build. Linus suggested to add a local label into the macro scope and let that jump to the actual caller supplied error label. __label__ local_label; \ arch_unsafe_get_user(x, ptr, local_label); \ if (0) { \ local_label: \ goto label; \ That works for both GCC and clang. clang: c80: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) c85: 65 48 8b 0c 25 00 00 00 00 mov %gs:0x0,%rcx c8e: ff 81 04 14 00 00 incl 0x1404(%rcx) // pf_disable++ c94: 31 c0 xor %eax,%eax // set retval to false c96: 89 37 mov %esi,(%rdi) // write c98: b0 01 mov $0x1,%al // set retval to true c9a: ff 89 04 14 00 00 decl 0x1404(%rcx) // pf_disable-- ca0: 2e e9 00 00 00 00 cs jmp ca6 <foo+0x26> // ret The exception table entry points correctly to c9a GCC: f70: e8 00 00 00 00 call f75 <baz+0x5> f75: 65 48 8b 05 00 00 00 00 mov %gs:0x0(%rip),%rax f7d: 83 80 04 14 00 00 01 addl $0x1,0x1404(%rax) // pf_disable++ f84: 8b 17 mov (%rdi),%edx f86: 89 16 mov %edx,(%rsi) f88: 83 a8 04 14 00 00 01 subl $0x1,0x1404(%rax) // pf_disable-- f8f: b8 01 00 00 00 mov $0x1,%eax // success f94: e9 00 00 00 00 jmp f99 <baz+0x29> // ret f99: 83 a8 04 14 00 00 01 subl $0x1,0x1404(%rax) // pf_disable-- fa0: 31 c0 xor %eax,%eax // fail fa2: e9 00 00 00 00 jmp fa7 <baz+0x37> // ret The exception table entry points correctly to f99 So both compilers optimize out the extra goto and emit correct and efficient code. Provide a generic wrapper to do that to avoid modifying all the affected architecture specific implementation with that workaround. The only change required for architectures is to rename unsafe_*_user() to arch_unsafe_*_user(). That's done in subsequent changes. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://patch.msgid.link/877bweujtn.ffs@tglx
2025-06-26uaccess: Define pagefault lock guardViktor Malik
Define a pagefault lock guard which allows to simplify functions that need to disable page faults. Signed-off-by: Viktor Malik <vmalik@redhat.com> Link: https://lore.kernel.org/r/8a01beb0b671923976f08297d81242bb2129881d.1750917800.git.vmalik@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-02-28uaccess: Introduce ucopysize.hKees Cook
The object size sanity checking macros that uaccess.h and uio.h use have been living in thread_info.h for historical reasons. Needing to use jump labels for these checks, however, introduces a header include loop under certain conditions. The dependencies for the object checking macros are very limited, but they are used by separate header files, so introduce a new header that can be used directly by uaccess.h and uio.h. As a result, this also means thread_info.h (which is rather large) and be removed from those headers. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202502281153.TG2XK5SI-lkp@intel.com/ Signed-off-by: Kees Cook <kees@kernel.org>
2024-11-18Merge tag 'vfs-6.13.usercopy' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull copy_struct_to_user helper from Christian Brauner: "This adds a copy_struct_to_user() helper which is a companion helper to the already widely used copy_struct_from_user(). It copies a struct from kernel space to userspace, in a way that guarantees backwards-compatibility for struct syscall arguments as long as future struct extensions are made such that all new fields are appended to the old struct, and zeroed-out new fields have the same meaning as the old struct. The first user is sched_getattr() system call but the new extensible pidfs ioctl will be ported to it as well" * tag 'vfs-6.13.usercopy' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: sched_getattr: port to copy_struct_to_user uaccess: add copy_struct_to_user helper
2024-10-30x86/uaccess: Avoid barrier_nospec() in 64-bit copy_from_user()Linus Torvalds
The barrier_nospec() in 64-bit copy_from_user() is slow. Instead use pointer masking to force the user pointer to all 1's for an invalid address. The kernel test robot reports a 2.6% improvement in the per_thread_ops benchmark [1]. This is a variation on a patch originally by Josh Poimboeuf [2]. Link: https://lore.kernel.org/202410281344.d02c72a2-oliver.sang@intel.com [1] Link: https://lore.kernel.org/5b887fe4c580214900e21f6c61095adf9a142735.1730166635.git.jpoimboe@kernel.org [2] Tested-and-reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-10-21uaccess: add copy_struct_to_user helperAleksa Sarai
This is based on copy_struct_from_user(), but there is one additional case to consider when creating a syscall that returns an extensible-struct to userspace -- how should data in the struct that cannot fit into the userspace struct be handled (ksize > usize)? There are three possibilies: 1. The interface is like sched_getattr(2), where new information will be silently not provided to userspace. This is probably what most interfaces will want to do, as it provides the most possible backwards-compatibility. 2. The interface is like lsm_list_modules(2), where you want to return an error like -EMSGSIZE if not providing information could result in the userspace program making a serious mistake (such as one that could lead to a security problem) or if you want to provide some flag to userspace so they know that they are missing some information. 3. The interface is like statx(2), where there some kind of a request mask that indicates what data userspace would like. One could imagine that statx2(2) (using extensible structs) would want to return -EMSGSIZE if the user explicitly requested a field that their structure is too small to fit, but not return an error if the field was not explicitly requested. This is kind of a mix between (1) and (2) based on the requested mask. The copy_struct_to_user() helper includes a an extra argument that is used to return a boolean flag indicating whether there was a non-zero byte in the trailing bytes that were not copied to userspace. This can be used in the following ways to handle all three cases, respectively: 1. Just pass NULL, as you don't care about this case. 2. Return an error (say -EMSGSIZE) if the argument was set to true by copy_struct_to_user(). 3. If the argument was set to true by copy_struct_to_user(), check if there is a flag that implies a field larger than usize. This is the only case where callers of copy_struct_to_user() should check usize themselves. This will probably require scanning an array that specifies what flags were added for each version of the flags struct and returning an error if the request mask matches any of the flags that were added in versions of the struct that are larger than usize. At the moment we don't have any users of (3), so this patch doesn't include any helpers to make the necessary scanning easier, but it should be fairly easy to add some if necessary. Signed-off-by: Aleksa Sarai <cyphar@cyphar.com> Link: https://lore.kernel.org/r/20241010-extensible-structs-check_fields-v3-1-d2833dfe6edd@cyphar.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-09-22Merge branch 'address-masking'Linus Torvalds
Merge user access fast validation using address masking. This allows architectures to optionally use a data dependent address masking model instead of a conditional branch for validating user accesses. That avoids the Spectre-v1 speculation barriers. Right now only x86-64 takes advantage of this, and not all architectures will be able to do it. It requires a guard region between the user and kernel address spaces (so that you can't overflow from one to the other), and an easy way to generate a guaranteed-to-fault address for invalid user pointers. Also note that this currently assumes that there is no difference between user read and write accesses. If extended to architectures like powerpc, we'll also need to separate out the user read-vs-write cases. * address-masking: x86: make the masked_user_access_begin() macro use its argument only once x86: do the user address masking outside the user access area x86: support user address masking instead of non-speculative conditional
2024-08-19x86: support user address masking instead of non-speculative conditionalLinus Torvalds
The Spectre-v1 mitigations made "access_ok()" much more expensive, since it has to serialize execution with the test for a valid user address. All the normal user copy routines avoid this by just masking the user address with a data-dependent mask instead, but the fast "unsafe_user_read()" kind of patterms that were supposed to be a fast case got slowed down. This introduces a notion of using src = masked_user_access_begin(src); to do the user address sanity using a data-dependent mask instead of the more traditional conditional if (user_read_access_begin(src, len)) { model. This model only works for dense accesses that start at 'src' and on architectures that have a guard region that is guaranteed to fault in between the user space and the kernel space area. With this, the user access doesn't need to be manually checked, because a bad address is guaranteed to fault (by some architecture masking trick: on x86-64 this involves just turning an invalid user address into all ones, since we don't map the top of address space). This only converts a couple of examples for now. Example x86-64 code generation for loading two words from user space: stac mov %rax,%rcx sar $0x3f,%rcx or %rax,%rcx mov (%rcx),%r13 mov 0x8(%rcx),%r14 clac where all the error handling and -EFAULT is now purely handled out of line by the exception path. Of course, if the micro-architecture does badly at 'clac' and 'stac', the above is still pitifully slow. But at least we did as well as we could. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-07-08uaccess: always export _copy_[from|to]_user with CONFIG_RUSTArnd Bergmann
Rust code needs to be able to access _copy_from_user and _copy_to_user so that it can skip the check_copy_size check in cases where the length is known at compile-time, mirroring the logic for when C code will skip check_copy_size. To do this, we ensure that exported versions of these methods are available when CONFIG_RUST is enabled. Alice has verified that this patch passes the CONFIG_TEST_USER_COPY test on x86 using the Android cuttlefish emulator. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Alice Ryhl <aliceryhl@google.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/r/20240528-alice-mm-v7-2-78222c31b8f4@google.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2023-03-16mm: Introduce untagged_addr_remote()Kirill A. Shutemov
untagged_addr() removes tags/metadata from the address and brings it to the canonical form. The helper is implemented on arm64 and sparc. Both of them do untagging based on global rules. However, Linear Address Masking (LAM) on x86 introduces per-process settings for untagging. As a result, untagged_addr() is now only suitable for untagging addresses for the current proccess. The new helper untagged_addr_remote() has to be used when the address targets remote process. It requires the mmap lock for target mm to be taken. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Alexander Potapenko <glider@google.com> Link: https://lore.kernel.org/all/20230312112612.31869-6-kirill.shutemov%40linux.intel.com
2023-02-08uaccess: Add minimum bounds check on kernel buffer sizeKees Cook
While there is logic about the difference between ksize and usize, copy_struct_from_user() didn't check the size of the destination buffer (when it was known) against ksize. Add this check so there is an upper bounds check on the possible memset() call, otherwise lower bounds checks made by callers will trigger bounds warnings under -Warray-bounds. Seen under GCC 13: In function 'copy_struct_from_user', inlined from 'iommufd_fops_ioctl' at ../drivers/iommu/iommufd/main.c:333:8: ../include/linux/fortify-string.h:59:33: warning: '__builtin_memset' offset [57, 4294967294] is out of the bounds [0, 56] of object 'buf' with type 'union ucmd_buffer' [-Warray-bounds=] 59 | #define __underlying_memset __builtin_memset | ^ ../include/linux/fortify-string.h:453:9: note: in expansion of macro '__underlying_memset' 453 | __underlying_memset(p, c, __fortify_size); \ | ^~~~~~~~~~~~~~~~~~~ ../include/linux/fortify-string.h:461:25: note: in expansion of macro '__fortify_memset_chk' 461 | #define memset(p, c, s) __fortify_memset_chk(p, c, s, \ | ^~~~~~~~~~~~~~~~~~~~ ../include/linux/uaccess.h:334:17: note: in expansion of macro 'memset' 334 | memset(dst + size, 0, rest); | ^~~~~~ ../drivers/iommu/iommufd/main.c: In function 'iommufd_fops_ioctl': ../drivers/iommu/iommufd/main.c:311:27: note: 'buf' declared here 311 | union ucmd_buffer buf; | ^~~ Cc: Christian Brauner <brauner@kernel.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Alexander Potapenko <glider@google.com> Acked-by: Aleksa Sarai <cyphar@cyphar.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/lkml/20230203193523.never.667-kees@kernel.org/
2022-10-03instrumented.h: allow instrumenting both sides of copy_from_user()Alexander Potapenko
Introduce instrument_copy_from_user_before() and instrument_copy_from_user_after() hooks to be invoked before and after the call to copy_from_user(). KASAN and KCSAN will be only using instrument_copy_from_user_before(), but for KMSAN we'll need to insert code after copy_from_user(). Link: https://lkml.kernel.org/r/20220915150417.722975-4-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-07No need of likely/unlikely on calls of check_copy_size()Al Viro
it's inline and unlikely() inside of it (including the implicit one in WARN_ON_ONCE()) suffice to convince the compiler that getting false from check_copy_size() is unlikely. Spotted-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-04-25mm: Add fault_in_subpage_writeable() to probe at sub-page granularityCatalin Marinas
On hardware with features like arm64 MTE or SPARC ADI, an access fault can be triggered at sub-page granularity. Depending on how the fault_in_writeable() function is used, the caller can get into a live-lock by continuously retrying the fault-in on an address different from the one where the uaccess failed. In the majority of cases progress is ensured by the following conditions: 1. copy_to_user_nofault() guarantees at least one byte access if the user address is not faulting. 2. The fault_in_writeable() loop is resumed from the first address that could not be accessed by copy_to_user_nofault(). If the loop iteration is restarted from an earlier (initial) point, the loop is repeated with the same conditions and it would live-lock. Introduce an arch-specific probe_subpage_writeable() and call it from the newly added fault_in_subpage_writeable() function. The arch code with sub-page faults will have to implement the specific probing functionality. Note that no other fault_in_subpage_*() functions are added since they have no callers currently susceptible to a live-lock. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/r/20220423100751.1870771-2-catalin.marinas@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-03-23Merge tag 'asm-generic-5.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic updates from Arnd Bergmann: "There are three sets of updates for 5.18 in the asm-generic tree: - The set_fs()/get_fs() infrastructure gets removed for good. This was already gone from all major architectures, but now we can finally remove it everywhere, which loses some particularly tricky and error-prone code. There is a small merge conflict against a parisc cleanup, the solution is to use their new version. - The nds32 architecture ends its tenure in the Linux kernel. The hardware is still used and the code is in reasonable shape, but the mainline port is not actively maintained any more, as all remaining users are thought to run vendor kernels that would never be updated to a future release. - A series from Masahiro Yamada cleans up some of the uapi header files to pass the compile-time checks" * tag 'asm-generic-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (27 commits) nds32: Remove the architecture uaccess: remove CONFIG_SET_FS ia64: remove CONFIG_SET_FS support sh: remove CONFIG_SET_FS support sparc64: remove CONFIG_SET_FS support lib/test_lockup: fix kernel pointer check for separate address spaces uaccess: generalize access_ok() uaccess: fix type mismatch warnings from access_ok() arm64: simplify access_ok() m68k: fix access_ok for coldfire MIPS: use simpler access_ok() MIPS: Handle address errors for accesses above CPU max virtual user address uaccess: add generic __{get,put}_kernel_nofault nios2: drop access_ok() check from __put_user() x86: use more conventional access_ok() definition x86: remove __range_not_ok() sparc64: add __{get,put}_kernel_nofault() nds32: fix access_ok() checks in get/put_user uaccess: fix nios2 and microblaze get_user_8() sparc64: fix building assembly files ...
2022-03-22mm: remove usercopy_warn()Christophe Leroy
Users of usercopy_warn() were removed by commit 53944f171a89 ("mm: remove HARDENED_USERCOPY_FALLBACK") Remove it. Link: https://lkml.kernel.org/r/5f26643fc70b05f8455b60b99c30c17d635fa640.1644231910.git.christophe.leroy@csgroup.eu Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Stephen Kitt <steve@sk2.org> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-25uaccess: remove CONFIG_SET_FSArnd Bergmann
There are no remaining callers of set_fs(), so CONFIG_SET_FS can be removed globally, along with the thread_info field and any references to it. This turns access_ok() into a cheaper check against TASK_SIZE_MAX. As CONFIG_SET_FS is now gone, drop all remaining references to set_fs()/get_fs(), mm_segment_t, user_addr_max() and uaccess_kernel(). Acked-by: Sam Ravnborg <sam@ravnborg.org> # for sparc32 changes Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Tested-by: Sergey Matyukevich <sergey.matyukevich@synopsys.com> # for arc changes Acked-by: Stafford Horne <shorne@gmail.com> # [openrisc, asm-generic] Acked-by: Dinh Nguyen <dinguyen@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-25uaccess: generalize access_ok()Arnd Bergmann
There are many different ways that access_ok() is defined across architectures, but in the end, they all just compare against the user_addr_max() value or they accept anything. Provide one definition that works for most architectures, checking against TASK_SIZE_MAX for user processes or skipping the check inside of uaccess_kernel() sections. For architectures without CONFIG_SET_FS(), this should be the fastest check, as it comes down to a single comparison of a pointer against a compile-time constant, while the architecture specific versions tend to do something more complex for historic reasons or get something wrong. Type checking for __user annotations is handled inconsistently across architectures, but this is easily simplified as well by using an inline function that takes a 'const void __user *' argument. A handful of callers need an extra __user annotation for this. Some architectures had trick to use 33-bit or 65-bit arithmetic on the addresses to calculate the overflow, however this simpler version uses fewer registers, which means it can produce better object code in the end despite needing a second (statically predicted) branch. Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Mark Rutland <mark.rutland@arm.com> [arm64, asm-generic] Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Stafford Horne <shorne@gmail.com> Acked-by: Dinh Nguyen <dinguyen@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-25uaccess: add generic __{get,put}_kernel_nofaultArnd Bergmann
Nine architectures are still missing __{get,put}_kernel_nofault: alpha, ia64, microblaze, nds32, nios2, openrisc, sh, sparc32, xtensa. Add a generic version that lets everything use the normal copy_{from,to}_kernel_nofault() code based on these, removing the last use of get_fs()/set_fs() from architecture-independent code. Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2021-09-08arch: remove compat_alloc_user_spaceArnd Bergmann
All users of compat_alloc_user_space() and copy_in_user() have been removed from the kernel, only a few functions in sparc remain that can be changed to calling arch_copy_in_user() instead. Link: https://lkml.kernel.org/r/20210727144859.4150043-7-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Feng Tang <feng.tang@intel.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-04-03signal: Add unsafe_get_compat_sigset()Christophe Leroy
In the same way as commit 14026b94ccfe ("signal: Add unsafe_put_compat_sigset()"), this time add unsafe_get_compat_sigset() macro which is the 'unsafe' version of get_compat_sigset() For the bigendian, use unsafe_get_user() directly to avoid intermediate copy through the stack. For the littleendian, use a straight unsafe_copy_from_user(). This commit adds the generic fallback for unsafe_copy_from_user(). Architectures wanting to use unsafe_get_compat_sigset() have to make sure they have their own unsafe_copy_from_user(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b05bf434ee13c76bc9df5f02653a10db5e7b54e5.1616151715.git.christophe.leroy@csgroup.eu
2020-10-24Merge tag 'riscv-for-linus-5.10-mw1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull more RISC-V updates from Palmer Dabbelt: "Just a single patch set: the remainder of Christoph's work to remove set_fs, including the RISC-V portion" * tag 'riscv-for-linus-5.10-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: remove address space overrides using set_fs() riscv: implement __get_kernel_nofault and __put_user_nofault riscv: refactor __get_user and __put_user riscv: use memcpy based uaccess for nommu again asm-generic: make the set_fs implementation optional asm-generic: add nommu implementations of __{get,put}_kernel_nofault asm-generic: improve the nommu {get,put}_user handling uaccess: provide a generic TASK_SIZE_MAX definition
2020-10-22Merge branch 'work.set_fs' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull initial set_fs() removal from Al Viro: "Christoph's set_fs base series + fixups" * 'work.set_fs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: Allow a NULL pos pointer to __kernel_read fs: Allow a NULL pos pointer to __kernel_write powerpc: remove address space overrides using set_fs() powerpc: use non-set_fs based maccess routines x86: remove address space overrides using set_fs() x86: make TASK_SIZE_MAX usable from assembly code x86: move PAGE_OFFSET, TASK_SIZE & friends to page_{32,64}_types.h lkdtm: remove set_fs-based tests test_bitmap: remove user bitmap tests uaccess: add infrastructure for kernel builds with set_fs() fs: don't allow splice read/write without explicit ops fs: don't allow kernel reads and writes without iter ops sysctl: Convert to iter interfaces proc: add a read_iter method to proc proc_ops proc: cleanup the compat vs no compat file ops proc: remove a level of indentation in proc_get_inode
2020-10-16lib, uaccess: add failure injection to usercopy functionsAlbert van der Linde
To test fault-tolerance of user memory access functions, introduce fault injection to usercopy functions. If a failure is expected return either -EFAULT or the total amount of bytes that were not copied. Signed-off-by: Albert van der Linde <alinde@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Marco Elver <elver@google.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20200831171733.955393-3-alinde@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-16kernel.h: split out min()/max() et al. helpersAndy Shevchenko
kernel.h is being used as a dump for all kinds of stuff for a long time. Here is the attempt to start cleaning it up by splitting out min()/max() et al. helpers. At the same time convert users in header and lib folder to use new header. Though for time being include new header back to kernel.h to avoid twisted indirected includes for other existing users. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Joe Perches <joe@perches.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lkml.kernel.org/r/20200910164152.GA1891694@smile.fi.intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-06x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()Dan Williams
In reaction to a proposal to introduce a memcpy_mcsafe_fast() implementation Linus points out that memcpy_mcsafe() is poorly named relative to communicating the scope of the interface. Specifically what addresses are valid to pass as source, destination, and what faults / exceptions are handled. Of particular concern is that even though x86 might be able to handle the semantics of copy_mc_to_user() with its common copy_user_generic() implementation other archs likely need / want an explicit path for this case: On Fri, May 1, 2020 at 11:28 AM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Thu, Apr 30, 2020 at 6:21 PM Dan Williams <dan.j.williams@intel.com> wrote: > > > > However now I see that copy_user_generic() works for the wrong reason. > > It works because the exception on the source address due to poison > > looks no different than a write fault on the user address to the > > caller, it's still just a short copy. So it makes copy_to_user() work > > for the wrong reason relative to the name. > > Right. > > And it won't work that way on other architectures. On x86, we have a > generic function that can take faults on either side, and we use it > for both cases (and for the "in_user" case too), but that's an > artifact of the architecture oddity. > > In fact, it's probably wrong even on x86 - because it can hide bugs - > but writing those things is painful enough that everybody prefers > having just one function. Replace a single top-level memcpy_mcsafe() with either copy_mc_to_user(), or copy_mc_to_kernel(). Introduce an x86 copy_mc_fragile() name as the rename for the low-level x86 implementation formerly named memcpy_mcsafe(). It is used as the slow / careful backend that is supplanted by a fast copy_mc_generic() in a follow-on patch. One side-effect of this reorganization is that separating copy_mc_64.S to its own file means that perf no longer needs to track dependencies for its memcpy_64.S benchmarks. [ bp: Massage a bit. ] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: <stable@vger.kernel.org> Link: http://lore.kernel.org/r/CAHk-=wjSqtXAqfUJxFtWNwmguFASTgB0dz1dT3V-78Quiezqbg@mail.gmail.com Link: https://lkml.kernel.org/r/160195561680.2163339.11574962055305783722.stgit@dwillia2-desk3.amr.corp.intel.com
2020-10-04uaccess: provide a generic TASK_SIZE_MAX definitionChristoph Hellwig
Define TASK_SIZE_MAX as TASK_SIZE if not otherwise defined. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2020-09-08uaccess: add infrastructure for kernel builds with set_fs()Christoph Hellwig
Add a CONFIG_SET_FS option that is selected by architecturess that implement set_fs, which is all of them initially. If the option is not set stubs for routines related to overriding the address space are provided so that architectures can start to opt out of providing set_fs. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-08-12uaccess: add force_uaccess_{begin,end} helpersChristoph Hellwig
Add helpers to wrap the get_fs/set_fs magic for undoing any damange done by set_fs(KERNEL_DS). There is no real functional benefit, but this documents the intent of these calls better, and will allow stubbing the functions out easily for kernels builds that do not allow address space overrides in the future. [hch@lst.de: drop two incorrect hunks, fix a commit log typo] Link: http://lkml.kernel.org/r/20200714105505.935079-6-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Greentime Hu <green.hu@gmail.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Nick Hu <nickhu@andestech.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Link: http://lkml.kernel.org/r/20200710135706.537715-6-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12uaccess: remove segment_eqChristoph Hellwig
segment_eq is only used to implement uaccess_kernel. Just open code uaccess_kernel in the arch uaccess headers and remove one layer of indirection. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Greentime Hu <green.hu@gmail.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Nick Hu <nickhu@andestech.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Link: http://lkml.kernel.org/r/20200710135706.537715-5-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-18maccess: make get_kernel_nofault() check for minimal type compatibilityLinus Torvalds
Now that we've renamed probe_kernel_address() to get_kernel_nofault() and made it look and behave more in line with get_user(), some of the subtle type behavior differences end up being more obvious and possibly dangerous. When you do get_user(val, user_ptr); the type of the access comes from the "user_ptr" part, and the above basically acts as val = *user_ptr; by design (except, of course, for the fact that the actual dereference is done with a user access). Note how in the above case, the type of the end result comes from the pointer argument, and then the value is cast to the type of 'val' as part of the assignment. So the type of the pointer is ultimately the more important type both for the access itself. But 'get_kernel_nofault()' may now _look_ similar, but it behaves very differently. When you do get_kernel_nofault(val, kernel_ptr); it behaves like val = *(typeof(val) *)kernel_ptr; except, of course, for the fact that the actual dereference is done with exception handling so that a faulting access is suppressed and returned as the error code. But note how different the casting behavior of the two superficially similar accesses are: one does the actual access in the size of the type the pointer points to, while the other does the access in the size of the target, and ignores the pointer type entirely. Actually changing get_kernel_nofault() to act like get_user() is almost certainly the right thing to do eventually, but in the meantime this patch adds logit to at least verify that the pointer type is compatible with the type of the result. In many cases, this involves just casting the pointer to 'void *' to make it obvious that the type of the pointer is not the important part. It's not how 'get_user()' acts, but at least the behavioral difference is now obvious and explicit. Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-18maccess: rename probe_kernel_address to get_kernel_nofaultChristoph Hellwig
Better describe what this helper does, and match the naming of copy_from_kernel_nofault. Also switch the argument order around, so that it acts and looks like get_user(). Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-17maccess: rename probe_user_{read,write} to copy_{from,to}_user_nofaultChristoph Hellwig
Better describe what these functions do. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-17maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofaultChristoph Hellwig
Better describe what these functions do. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-11Rebase locking/kcsan to locking/urgentThomas Gleixner
Merge the state of the locking kcsan branch before the read/write_once() and the atomics modifications got merged. Squash the fallout of the rebase on top of the read/write once and atomic fallback work into the merge. The history of the original branch is preserved in tag locking-kcsan-2020-06-02. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2020-06-09maccess: always use strict semantics for probe_kernel_readChristoph Hellwig
Except for historical confusion in the kprobes/uprobes and bpf tracers, which has been fixed now, there is no good reason to ever allow user memory accesses from probe_kernel_read. Switch probe_kernel_read to only read from kernel memory. [akpm@linux-foundation.org: update it for "mm, dump_page(): do not crash with invalid mapping pointer"] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200521152301.2587579-17-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09maccess: remove strncpy_from_unsafeChristoph Hellwig
All users are gone now. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200521152301.2587579-16-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09maccess: unify the probe kernel arch hooksChristoph Hellwig
Currently architectures have to override every routine that probes kernel memory, which includes a pure read and strcpy, both in strict and not strict variants. Just provide a single arch hooks instead to make sure all architectures cover all the cases. [akpm@linux-foundation.org: fix !CONFIG_X86_64 build] Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200521152301.2587579-11-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09maccess: rename strnlen_unsafe_user to strnlen_user_nofaultChristoph Hellwig
This matches the naming of strnlen_user, and also makes it more clear what the function is supposed to do. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200521152301.2587579-9-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09maccess: rename strncpy_from_unsafe_strict to strncpy_from_kernel_nofaultChristoph Hellwig
This matches the naming of strncpy_from_user_nofault, and also makes it more clear what the function is supposed to do. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200521152301.2587579-8-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09maccess: rename strncpy_from_unsafe_user to strncpy_from_user_nofaultChristoph Hellwig
This matches the naming of strncpy_from_user, and also makes it more clear what the function is supposed to do. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200521152301.2587579-7-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09maccess: remove duplicate kerneldoc commentsChristoph Hellwig
Many of the maccess routines have a copy of the kerneldoc comment in the header. Remove it as it is not useful and will get out of sync sooner or later. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200521152301.2587579-4-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09maccess: remove various unused weak aliasesChristoph Hellwig
maccess tends to define lots of underscore prefixed symbols that then have other weak aliases. But except for two cases they are never actually used, so remove them. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200521152301.2587579-3-hch@lst.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-05-01uaccess: Add user_read_access_begin/end and user_write_access_begin/endChristophe Leroy
Some architectures like powerpc64 have the capability to separate read access and write access protection. For get_user() and copy_from_user(), powerpc64 only open read access. For put_user() and copy_to_user(), powerpc64 only open write access. But when using unsafe_get_user() or unsafe_put_user(), user_access_begin open both read and write. Other architectures like powerpc book3s 32 bits only allow write access protection. And on this architecture protection is an heavy operation as it requires locking/unlocking per segment of 256Mbytes. On those architecture it is therefore desirable to do the unlocking only for write access. (Note that book3s/32 ranges from very old powermac from the 90's with powerpc 601 processor, till modern ADSL boxes with PowerQuicc II processors for instance so it is still worth considering.) In order to avoid any risk based of hacking some variable parameters passed to user_access_begin/end that would allow hacking and leaving user access open or opening too much, it is preferable to use dedicated static functions that can't be overridden. Add a user_read_access_begin and user_read_access_end to only open read access. Add a user_write_access_begin and user_write_access_end to only open write access. By default, when undefined, those new access helpers default on the existing user_access_begin and user_access_end. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/36e43241c7f043a24b5069e78c6a7edd11043be5.1585898438.git.christophe.leroy@c-s.fr
2020-03-21copy_to_user, copy_from_user: Use generic instrumented.hMarco Elver
This replaces the KASAN instrumentation with generic instrumentation, implicitly adding KCSAN instrumentation support. For KASAN no functional change is intended. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>