|
With the introduction of the generic vdso data storage, the VM_SEALED_SYSMAP
VM flag must be moved from the architecture specific
_install_special_mapping() calls [1] [2], which establish the vvar mapping,
to generic code.
[1] https://lkml.kernel.org/r/20250305021711.3867874-4-jeffxu@google.com
[2] https://lkml.kernel.org/r/20250305021711.3867874-5-jeffxu@google.com
Link: https://lkml.kernel.org/r/20250311123326.2686682-2-hca@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Jeff Xu <jeffxu@chromium.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
To support multiple PTP clocks, the VDSO data structure needs to be
reworked. All clock specific data will end up in struct vdso_clock and in
struct vdso_time_data there will be an array of VDSO clocks.
Now that all preparatory changes are in place:
Split the clock related struct members into a separate struct vdso_clock.
Make sure all users are aware that vdso_time_data is no longer initialized
as an array and that vdso_clock is now the array inside vdso_time_data.
Remove the vdso_clock define, which mapped it to vdso_time_data for the
transition.
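An abridged sketch of the resulting layout (member names assumed from the
pre-split vdso_data, so details may differ):

  struct vdso_clock {
          u32                     seq;
          s32                     clock_mode;
          u64                     cycle_last;
          u64                     mask;
          u32                     mult;
          u32                     shift;
          struct vdso_timestamp   basetime[VDSO_BASES];
          /* ... */
  };

  struct vdso_time_data {
          struct vdso_clock       clock_data[CS_BASES];
          s32                     tz_minuteswest;
          s32                     tz_dsttime;
          /* ... */
  };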
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250303-vdso-clock-v1-19-c1b5c69a166f@linutronix.de
|
|
To support multiple PTP clocks, the VDSO data structure needs to be
reworked. All clock specific data will end up in struct vdso_clock and in
struct vdso_time_data there will be an array of VDSO clocks. At the moment,
vdso_clock is simply a define which maps vdso_clock to vdso_time_data.
For time namespaces, vdso_time_data needs to be set up. But only the clock
related part of the vdso_data requires this setup. To reflect the future
struct vdso_clock, rename timens_setup_vdso_data() to
timens_setup_vdso_clock_data().
No functional change.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250303-vdso-clock-v1-13-c1b5c69a166f@linutronix.de
|
|
To support multiple PTP clocks, the VDSO data structure needs to be
reworked. All clock specific data will end up in struct vdso_clock and in
struct vdso_time_data there will be an array of VDSO clocks. At the moment,
vdso_clock is simply a define which maps vdso_clock to vdso_time_data.
To prepare for the rework of the data structures, replace the struct
vdso_time_data pointer argument of the helper functions with a struct
vdso_clock pointer where applicable.
No functional change.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250303-vdso-clock-v1-11-c1b5c69a166f@linutronix.de
|
|
To support multiple PTP clocks, the VDSO data structure needs to be
reworked. All clock specific data will end up in struct vdso_clock and in
struct vdso_time_data there will be an array of VDSO clocks. At the moment,
vdso_clock is simply a define which maps vdso_clock to vdso_time_data.
Prepare for the rework of these structures by adding a struct vdso_clock
pointer argument to do_coarse_time_ns(), and replace the struct
vdso_time_data pointer with the new pointer argument where applicable.
No functional change.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250303-vdso-clock-v1-10-c1b5c69a166f@linutronix.de
|
|
To support multiple PTP clocks, the VDSO data structure needs to be
reworked. All clock specific data will end up in struct vdso_clock and in
struct vdso_time_data there will be an array of VDSO clocks. At the moment,
vdso_clock is simply a define which maps vdso_clock to vdso_time_data.
Prepare for the rework of these structures by adding a struct vdso_clock
pointer argument to do_coarse(), and replace the struct vdso_time_data
pointer with the new pointer argument where applicable.
No functional change.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250303-vdso-clock-v1-9-c1b5c69a166f@linutronix.de
|
|
To support multiple PTP clocks, the VDSO data structure needs to be
reworked. All clock specific data will end up in struct vdso_clock and in
struct vdso_time_data there will be an array of VDSO clocks. At the moment,
vdso_clock is simply a define which maps vdso_clock to vdso_time_data.
Prepare for the rework of these structures by adding a struct vdso_clock
pointer argument to do_hres_timens(), and replace the struct vdso_time_data
pointer with the new pointer argument where applicable.
No functional change.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250303-vdso-clock-v1-8-c1b5c69a166f@linutronix.de
|
|
To support multiple PTP clocks, the VDSO data structure needs to be
reworked. All clock specific data will end up in struct vdso_clock and in
struct vdso_time_data there will be an array of VDSO clocks. At the moment,
vdso_clock is simply a define which maps vdso_clock to vdso_time_data.
Prepare for the rework of these structures by adding a struct vdso_clock
pointer argument to do_hres(), and replace the struct vdso_time_data
pointer with the new pointer argument where applicable.
No functional change.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250303-vdso-clock-v1-7-c1b5c69a166f@linutronix.de
|
|
To support multiple PTP clocks, the VDSO data structure needs to be
reworked. All clock specific data will end up in struct vdso_clock and in
struct vdso_time_data there will be an array of VDSO clocks. At the moment,
vdso_clock is simply a define which maps vdso_clock to vdso_time_data.
Prepare all functions which need the pointer to the vdso_clock array to
work correctly after introducing the new struct. Where applicable, replace
the struct vdso_time_data pointer by a struct vdso_clock pointer.
No functional change.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250303-vdso-clock-v1-6-c1b5c69a166f@linutronix.de
|
|
All users of the time-related parts of the vDSO are now using the generic
storage implementation. Remove the now unnecessary compatibility accessor
functions and symbols.
Co-developed-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-18-13a4669dfc8c@linutronix.de
|
|
Some architectures need to expose architecture-specific data to the vDSO.
Enable the generic vDSO storage mechanism to both store and map this data.
Some architectures, like LoongArch, require more than a single page, so
prepare for that use case, too.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-7-13a4669dfc8c@linutronix.de
|
|
Extend the generic vDSO data storage with a page for the random state data.
The random state data is stored in a dedicated page, as the existing
storage page is only meant for time-related, time-namespace-aware data.
This simplifies the access logic, which no longer needs to handle time
namespaces, and also frees up more space in the time-related page.
If further generic vDSO data storage is required, it can be added to the
random state page.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-6-13a4669dfc8c@linutronix.de
|
|
Historically, each architecture defined its own way to store the vDSO
data page. Add a generic mechanism to provide storage for that page.
Furthermore this generic storage will be extended to also provide
uniform storage for *non*-time-related data, like the random state or
architecture-specific data. These will have their own pages and data
structures, so rename 'vdso_data' to 'vdso_time_data' to make that
split clear from the name.
Also introduce a new consistent naming scheme for the symbols related to
the vDSO, which makes it clear if the symbol is accessible from
userspace or kernel space and the type of data behind the symbol.
The generic fault handler contains an optimization to prefault the vvar
page when the timens page is accessed. This was lifted from s390 and x86.
Co-developed-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-5-13a4669dfc8c@linutronix.de
|
|
As the Makefile is included from other Makefiles, it cannot be used to
define objects to be built from the current source directory.
However the generic datastore will introduce such a local source file.
Rename the included Makefile so it is clear how it is to be used and to
make room for a regular Makefile in lib/vdso/.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250204-vdso-store-rng-v3-4-13a4669dfc8c@linutronix.de
|
|
Depending on the architecture, building a 32-bit vDSO on a 64-bit kernel
is problematic when some system headers are included.
Minimise the number of headers by moving needed items, such as
__{get,put}_unaligned_t, into dedicated common headers and in general
use more specific headers, similar to what was done in commit
8165b57bca21 ("linux/const.h: Extract common header for vDSO") and
commit 8c59ab839f52 ("lib/vdso: Enable common headers").
On some architectures this results in missing PAGE_SIZE, as was
described by commit 8b3843ae3634 ("vdso/datapage: Quick fix - use
asm/page-def.h for ARM64"), so define this if necessary, in the same way
as done prior by commit cffaefd15a8f ("vdso: Use CONFIG_PAGE_SHIFT in
vdso/datapage.h").
Removing linux/time64.h leads to missing 'struct timespec64' in
x86's asm/pvclock.h. Add a forward declaration of that struct in
that file.
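The forward declaration amounts to:

  /* in arch/x86/include/asm/pvclock.h */
  struct timespec64;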
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
With the current implementation, __cvdso_getrandom_data() calls
memset() on certain architectures, which is unexpected in the VDSO.
Rather than providing a memset(), simply rewrite opaque data
initialization to avoid memset().
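As an illustration of the technique (the struct and member names below are
hypothetical, not the kernel's actual opaque state):

  struct opaque_state {
          unsigned char   batch[64];
          unsigned long   pos;
  };

  /*
   * A whole-struct zeroing such as memset(s, 0, sizeof(*s)) may be
   * lowered by the compiler into an out-of-line memset() call, which is
   * not available in the vDSO. A single scalar store avoids that:
   */
  static inline void opaque_invalidate(struct opaque_state *s)
  {
          s->pos = sizeof(s->batch); /* batch marked consumed, nothing zeroed */
  }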
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Same as for the gettimeofday CVDSO implementation, add c-getrandom-y to
ease the inclusion of lib/vdso/getrandom.c in architectures' VDSO
builds.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Performing SMP atomic operations on u64 fails on powerpc32:
CC drivers/char/random.o
In file included from <command-line>:
drivers/char/random.c: In function 'crng_reseed':
././include/linux/compiler_types.h:510:45: error: call to '__compiletime_assert_391' declared with attribute error: Need native word sized stores/loads for atomicity.
510 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^
././include/linux/compiler_types.h:491:25: note: in definition of macro '__compiletime_assert'
491 | prefix ## suffix(); \
| ^~~~~~
././include/linux/compiler_types.h:510:9: note: in expansion of macro '_compiletime_assert'
510 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^~~~~~~~~~~~~~~~~~~
././include/linux/compiler_types.h:513:9: note: in expansion of macro 'compiletime_assert'
513 | compiletime_assert(__native_word(t), \
| ^~~~~~~~~~~~~~~~~~
./arch/powerpc/include/asm/barrier.h:74:9: note: in expansion of macro 'compiletime_assert_atomic_type'
74 | compiletime_assert_atomic_type(*p); \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/asm-generic/barrier.h:172:55: note: in expansion of macro '__smp_store_release'
172 | #define smp_store_release(p, v) do { kcsan_release(); __smp_store_release(p, v); } while (0)
| ^~~~~~~~~~~~~~~~~~~
drivers/char/random.c:286:9: note: in expansion of macro 'smp_store_release'
286 | smp_store_release(&__arch_get_k_vdso_rng_data()->generation, next_gen + 1);
| ^~~~~~~~~~~~~~~~~
The kernel-side generation counter in the random driver is handled as an
unsigned long, not as a u64, in base_crng and struct crng.
But on the vDSO side, it needs to be a u64, not just an unsigned long,
in order to support a 32-bit vDSO atop a 64-bit kernel.
On the kernel side, however, it is an unsigned long, hence a 32-bit value
on 32-bit architectures, so just cast it to unsigned long for the
smp_store_release(). A side effect is that on big endian architectures
the store will be performed in the upper 32 bits. That is not an issue on
its own because the vDSO side doesn't care about the value itself; it only
checks for differences. Just make sure that the vDSO side checks the full
64 bits.
For that, the local current_generation has to be u64 as well.
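A sketch of the two sides, based on the description above (the kernel-side
accessor name is taken from the error message; rng_info is an assumed local
name on the vDSO side):

  /* Kernel side: store the unsigned long value through a cast pointer. */
  smp_store_release((unsigned long *)&__arch_get_k_vdso_rng_data()->generation,
                    next_gen + 1);

  /* vDSO side: compare the full 64 bits. */
  u64 current_generation = READ_ONCE(rng_info->generation);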
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Like the getrandom() syscall, vDSO getrandom() must also reject unknown
flags. [1]
It would be possible to return -EINVAL from the vDSO itself, but in the
possible case that a new flag is added to the getrandom() syscall in the
future, it is easier to get the behavior from the syscall instead of
returning an error until the vDSO is extended to either support the new
flag or explicitly fall back.
[1] Designing the API: Planning for Extension
https://docs.kernel.org/process/adding-syscalls.html#designing-the-api-planning-for-extension
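A sketch of the resulting check, assuming the GRND_* flags currently defined
in the UAPI:

  /*
   * Unknown flags are not rejected here; fall back to the syscall so
   * that future flags automatically get the syscall's behavior.
   */
  if (unlikely(flags & ~(GRND_NONBLOCK | GRND_RANDOM | GRND_INSECURE)))
          goto fallback_syscall;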
Signed-off-by: Yann Droneaud <yann@droneaud.fr>
[Jason: reworded commit message]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
Merge tag 'random-6.11-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator updates from Jason Donenfeld:
"This adds getrandom() support to the vDSO.
First, it adds a new kind of mapping to mmap(2), MAP_DROPPABLE, which
lets the kernel zero out pages anytime under memory pressure, which
enables allocating memory that never gets swapped to disk but also
doesn't count as being mlocked.
Then, the vDSO implementation of getrandom() is introduced in a
generic manner and hooked into random.c.
Next, this is implemented on x86. (Also, though it's not ready for
this pull, somebody has begun an arm64 implementation already)
Finally, two vDSO selftests are added.
There are also two housekeeping cleanup commits"
* tag 'random-6.11-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
MAINTAINERS: add random.h headers to RNG subsection
random: note that RNDGETPOOL was removed in 2.6.9-rc2
selftests/vDSO: add tests for vgetrandom
x86: vdso: Wire up getrandom() vDSO implementation
random: introduce generic vDSO getrandom() implementation
mm: add MAP_DROPPABLE for designating always lazily freeable mappings
|
|
Provide a generic C vDSO getrandom() implementation, which operates on
an opaque state returned by vgetrandom_alloc() and produces random bytes
the same way as getrandom(). This has the following API signature:
ssize_t vgetrandom(void *buffer, size_t len, unsigned int flags,
                   void *opaque_state, size_t opaque_len);
The return value and the first three arguments are the same as ordinary
getrandom(), while the last two arguments are a pointer to the opaque
allocated state and its size. Were all five arguments passed to the
getrandom() syscall, nothing different would happen, and the functions
would have the exact same behavior.
The actual vDSO RNG algorithm implemented is the same one implemented by
drivers/char/random.c, using the same fast-erasure techniques as that.
Should the in-kernel implementation change, so too will the vDSO one.
It requires an implementation of ChaCha20 that does not use any stack,
in order to maintain forward secrecy if a multi-threaded program forks
(though this does not account for a similar issue with SA_SIGINFO
copying registers to the stack), so this is left as an
architecture-specific fill-in. Stack-less ChaCha20 is an easy algorithm
to implement on a variety of architectures, so this shouldn't be too
onerous.
Initially, the state is keyless, and so the first call makes a
getrandom() syscall to generate that key, and then uses it for
subsequent calls. By keeping track of a generation counter, it knows
when its key is invalidated and it should fetch a new one using the
syscall. Later, more than just a generation counter might be used.
Since MADV_WIPEONFORK is set on the opaque state, the key and related
state is wiped during a fork(), so secrets don't roll over into new
processes, and the same state doesn't accidentally generate the same
random stream. The generation counter, as well, is always >0, so that
the 0 counter is a useful indication of a fork() or otherwise
uninitialized state.
If the kernel RNG is not yet initialized, then the vDSO always calls the
syscall, because that behavior cannot be emulated in userspace, but
fortunately that state is short lived and only exists during early boot. If it
has been initialized, then there is no need to inspect the `flags`
argument, because the behavior does not change post-initialization
regardless of the `flags` value.
Since the opaque state passed to it is mutated, vDSO getrandom() is not
reentrant when used with the same opaque state, which libc should be
mindful of.
The function works over an opaque per-thread state of a particular size,
which must be marked VM_WIPEONFORK, VM_DONTDUMP, VM_NORESERVE, and
VM_DROPPABLE for proper operation. Over time, the nuances of these
allocations may change or grow or even differ based on architectural
features.
The opaque state passed to vDSO getrandom() must be allocated using the
mmap_flags and mmap_prot parameters provided by the vgetrandom_opaque_params
struct, which also contains the size of each state. That struct can be
obtained with a call to vgetrandom(NULL, 0, 0, &params, ~0UL). Then,
libc can call mmap(2) and slice up the returned array into a state per
each thread, while ensuring that no single state straddles a page
boundary. Libc is expected to allocate a chunk of these on first use,
and then dole them out to threads as they're created, allocating more
when needed.
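An illustrative allocation sketch along those lines (nthreads, i and the
error handling are placeholders, not actual libc code):

  struct vgetrandom_opaque_params params;

  if (vgetrandom(NULL, 0, 0, &params, ~0UL))
          return -1; /* placeholder error path */

  /* Slice whole pages so that no state straddles a page boundary. */
  size_t per_page = PAGE_SIZE / params.size_of_opaque_state;
  size_t pages = (nthreads + per_page - 1) / per_page;
  void *block = mmap(NULL, pages * PAGE_SIZE, params.mmap_prot,
                     params.mmap_flags, -1, 0);

  /* Opaque state handed to thread i: */
  void *state = block + (i / per_page) * PAGE_SIZE +
                (i % per_page) * params.size_of_opaque_state;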
vDSO getrandom() provides the ability for userspace to generate random
bytes quickly and safely, and is intended to be integrated into libc's
thread management. As an illustrative example, the introduced code in
the vdso_test_getrandom self test later in this series might be used to
do the same outside of libc. In a libc the various pthread-isms are
expected to be elided into libc internals.
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
The two comments state that the following code open codes something, but
they fail to specify what exactly is open coded.
Expand the comments by naming the open coded function.
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lore.kernel.org/r/20240701-vdso-cleanup-v1-1-36eb64e7ece2@linutronix.de
|
|
U64_MAX is not in include/vdso/limits.h, although that isn't noticed on x86
because x86 includes include/linux/limits.h indirectly. However, powerpc is
more selective, resulting in the following build error:
In file included from <command-line>:
lib/vdso/gettimeofday.c: In function 'vdso_calc_ns':
lib/vdso/gettimeofday.c:11:33: error: 'U64_MAX' undeclared
11 | # define VDSO_DELTA_MASK(vd) U64_MAX
| ^~~~~~~
Use ULLONG_MAX instead which will work just as well and is in
include/vdso/limits.h.
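The fix is the one-liner implied by the error above:

  # define VDSO_DELTA_MASK(vd)    ULLONG_MAX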
Fixes: c8e3a8b6f2e6 ("vdso: Consolidate vdso_calc_delta()")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240409062639.3393-1-adrian.hunter@intel.com
Closes: https://lore.kernel.org/all/20240409124905.6816db37@canb.auug.org.au/
|
|
Kernel timekeeping is designed to keep the change in cycles (since the last
timer interrupt) below max_cycles, which prevents multiplication overflow
when converting cycles to nanoseconds. However, if timer interrupts stop,
the calculation will eventually overflow.
Add protection against that, enabled by the config option
CONFIG_GENERIC_VDSO_OVERFLOW_PROTECT: check against max_cycles and fall
back to a slower, higher precision calculation when it is exceeded.
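Conceptually the guarded conversion looks like this (field names assumed
from the description; mul_u64_u32_shr() is the kernel's wide-multiply
helper):

  if (likely(delta < vd->max_cycles))
          ns = (delta * vd->mult) >> vd->shift;             /* fast path */
  else
          ns = mul_u64_u32_shr(delta, vd->mult, vd->shift); /* no overflow */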
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240325064023.2997-8-adrian.hunter@intel.com
|
|
Add CONFIG_GENERIC_VDSO_OVERFLOW_PROTECT in preparation to add
multiplication overflow protection to the VDSO time getter functions.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240325064023.2997-4-adrian.hunter@intel.com
|
|
Consolidate nanoseconds calculation to simplify and reduce code
duplication.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240325064023.2997-3-adrian.hunter@intel.com
|
|
Consolidate vdso_calc_delta(), in preparation for further simplification.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240325064023.2997-2-adrian.hunter@intel.com
|
|
The actual intention is that no dynamic relocation exists in the VDSO. To
enforce this, the VDSO build validates that the resulting .so file does not
have any relocations which are specified via $(ARCH_REL_TYPE_ABS) per
architecture. This is fragile, as e.g. ARM64 lacks an entry for
R_AARCH64_RELATIVE. Aside from that, ARCH_REL_TYPE_ABS is a misnomer as it
checks for relative relocations too.
However, some GNU ld ports produce unneeded R_*_NONE relocation entries. If
a port fails to determine the exact .rel[a].dyn size, the trailing zeros
become R_*_NONE relocations (e.g. ld's powerpc port recently fixed this,
see https://sourceware.org/bugzilla/show_bug.cgi?id=29540). R_*_NONE
relocations are generally a no-op in dynamic loaders, so just ignore them.
Remove the ARCH_REL_TYPE_ABS defines and just validate that the resulting
.so file does not contain any R_* relocation entries except R_*_NONE.
Signed-off-by: Fangrui Song <maskray@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com> # for aarch64
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> # for vDSO, aarch64
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Link: https://lore.kernel.org/r/20230310190750.3323802-1-maskray@google.com
|
|
The latest version of grep claims that egrep is now obsolete, so the build
now contains warnings that look like:
egrep: warning: egrep is obsolescent; using grep -E
Fix this up by moving the vdso Makefile to use "grep -E" instead.
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lore.kernel.org/r/20220920170633.3133829-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
For the same reason as commit e876f0b69dc9 ("lib/vdso: Allow
architectures to provide the vdso data pointer"), powerpc wants to
avoid calculation of relative position to code.
As the timens_vdso_data is the page next to vdso_data, provide the
vdso_data pointer to __arch_get_timens_vdso_data() in order to ease the
calculation on powerpc in the following patches.
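The resulting helper signature is roughly:

  const struct vdso_data *__arch_get_timens_vdso_data(const struct vdso_data *vd);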
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Acked-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/539c4204b1baa77c55f758904a1ea239abbc7a5c.1617209142.git.christophe.leroy@csgroup.eu
|
|
In the same spirit as commit c966533f8c6c ("lib/vdso: Mark do_hres()
and do_coarse() as __always_inline"), mark do_hres_timens() and
do_coarse_timens() __always_inline.
The measurement below is on a non timens process, i.e. on the fastest path.
On powerpc32, without the patch:
clock-gettime-monotonic-raw: vdso: 1155 nsec/call
clock-gettime-monotonic-coarse: vdso: 813 nsec/call
clock-gettime-monotonic: vdso: 1076 nsec/call
With the patch:
clock-gettime-monotonic-raw: vdso: 1100 nsec/call
clock-gettime-monotonic-coarse: vdso: 667 nsec/call
clock-gettime-monotonic: vdso: 1025 nsec/call
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/90dcf45ebadfd5a07f24241551c62f619d1cb930.1617209142.git.christophe.leroy@csgroup.eu
|
|
MIPS already uses and S390 will need the vdso data pointer in
__arch_get_hw_counter().
This works nicely as long as the architecture does not support time
namespaces in the VDSO. With time namespaces enabled the regular
accessor to the vdso data pointer __arch_get_vdso_data() will return the
namespace specific VDSO data page for tasks which are part of a
non-root time namespace. This would cause the architectures which need
the vdso data pointer in __arch_get_hw_counter() to access the wrong
vdso data page.
Add a vdso_data pointer argument to __arch_get_hw_counter() and hand it in
from the call sites in the core code. For architectures which do not need
the data pointer in their counter accessor function the compiler will just
optimize it out.
Fix up all existing architecture implementations and make MIPS utilize the
pointer instead of invoking the accessor function.
No functional change and no change in the resulting object code (except
MIPS).
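The resulting interface and core call site, per the description:

  static inline u64 __arch_get_hw_counter(s32 clock_mode,
                                          const struct vdso_data *vd);

  cycles = __arch_get_hw_counter(vd->clock_mode, vd);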
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/draft-87wo2ekuzn.fsf@nanos.tec.linutronix.de
|
|
Merge tag 'x86-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull more x86 updates from Thomas Gleixner:
"A set of fixes and updates for x86:
- Unbreak paravirt VDSO clocks.
While the VDSO code was moved into lib for sharing a subtle check
for the validity of paravirt clocks got replaced. While the
replacement works perfectly fine for bare metal as the update of
the VDSO clock mode is synchronous, it fails for paravirt clocks
because the hypervisor can invalidate them asynchronously.
Bring it back as an optional function so it does not inflict this
on architectures which are free of PV damage.
- Fix the jiffies to jiffies64 mapping on 64bit so it does not
trigger an ODR violation on newer compilers
- Three fixes for the SSBD and *IB* speculation mitigation maze to
ensure consistency, avoid wrongly disabling some *IB* variants, and
prevent a rogue cross-process shutdown of SSBD. All marked for
stable.
- Add yet more CPU models to the splitlock detection capable list
!@#%$!
- Bring the pr_info() back which tells that TSC deadline timer is
enabled.
- Reboot quirk for MacBook6,1"
* tag 'x86-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vdso: Unbreak paravirt VDSO clocks
lib/vdso: Provide sanity check for cycles (again)
clocksource: Remove obsolete ifdef
x86_64: Fix jiffies ODR violation
x86/speculation: PR_SPEC_FORCE_DISABLE enforcement for indirect branches.
x86/speculation: Prevent rogue cross-process SSBD shutdown
x86/speculation: Avoid force-disabling IBPB based on STIBP and enhanced IBRS.
x86/cpu: Add Sapphire Rapids CPU model number
x86/split_lock: Add Icelake microserver and Tigerlake CPU models
x86/apic: Make TSC deadline timer detection message visible
x86/reboot/quirks: Add MacBook6,1 reboot quirk
|
|
The original x86 VDSO implementation checked for the validity of the clock
source read by testing whether the returned signed cycles value is less
than zero. This check was also used by the vdso read function to signal
that the current selected clocksource is not VDSO capable.
During the rework of the VDSO code the check was removed and replaced with
a check for the clocksource mode being != NONE.
This turned out to be a mistake because the check is necessary for paravirt
and hyperv clock sources. The reason is that these clock sources have their
own internal sequence counter to validate the clocksource at the point of
reading it. This is necessary because the hypervisor can invalidate the
clocksource asynchronously so a check during the VDSO data update is not
sufficient. Having a separate indicator for the validity is slower than
just validating the cycles value. The check for it being negative turned
out to be the fastest implementation and safe as it would require an uptime
of ~73 years with a 4GHz counter frequency to result in a false positive.
Add an optional function to validate the cycles with a default
implementation which allows the compiler to optimize it out for
architectures which do not require it.
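The default for the new optional function compiles to nothing, so only
architectures that override it (e.g. x86 with the negative-cycles check)
pay for it:

  #ifndef vdso_cycles_ok
  static inline bool vdso_cycles_ok(u64 cycles)
  {
          return true;
  }
  #endif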
Fixes: 5d51bee725cc ("clocksource: Add common vdso clock mode storage")
Reported-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20200606221531.963970768@linutronix.de
|
|
When adding gettime64() to a 32-bit architecture (namely powerpc/32)
it was noticed that GCC no longer inlines
__cvdso_clock_gettime_common() because it is called twice
(once by __cvdso_clock_gettime() and once by
__cvdso_clock_gettime32()).
This has the effect of seriously degrading the performance:
Before the implementation of gettime64(), gettime() runs in:
clock-gettime-monotonic-raw: vdso: 1003 nsec/call
clock-gettime-monotonic-coarse: vdso: 592 nsec/call
clock-gettime-monotonic: vdso: 942 nsec/call
When adding a gettime64() entry point, the standard gettime()
performance is degraded by 30% to 50%:
clock-gettime-monotonic-raw: vdso: 1300 nsec/call
clock-gettime-monotonic-coarse: vdso: 900 nsec/call
clock-gettime-monotonic: vdso: 1232 nsec/call
Adding __always_inline() to __cvdso_clock_gettime_common() regains the
original performance.
In terms of code size, the inlining increases the code size by only 176
bytes. This is in the noise for a kernel image.
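The change itself is just the inlining attribute (the argument list here is
assumed from the surrounding series):

  static __always_inline int
  __cvdso_clock_gettime_common(const struct vdso_data *vd, clockid_t clock,
                               struct __kernel_timespec *ts);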
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/1ab6a62c356c3bec35d1623563ef9c636205bcda.1588079622.git.christophe.leroy@c-s.fr
|
|
The vDSO library should only include the necessary headers required for
a userspace library (UAPI and a minimal set of kernel headers). To make
this possible it is necessary to isolate from the kernel headers the
common parts that are strictly necessary to build the library.
Refactor the unified vdso code to use the common headers.
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200320145351.32292-26-vincenzo.frascino@arm.com
|
|
On powerpc, __arch_get_vdso_data() clobbers the link register, requiring
the caller to save it.
As the parent function already has to set a stack frame and saves the link
register before calling the C vdso function, retrieving the vdso data
pointer there is less overhead.
Split out the functional code from the __cvdso.*() interfaces into new
static functions which can either be called from the existing interfaces
with the vdso data pointer supplied via __arch_get_vdso_data() or directly
from ASM code.
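A sketch of the split, taking clock_gettime() as the example: the old entry
point becomes a thin wrapper around the new data variant, which ASM code can
call directly with the pointer in hand:

  static __maybe_unused int
  __cvdso_clock_gettime(clockid_t clock, struct __kernel_timespec *ts)
  {
          return __cvdso_clock_gettime_data(__arch_get_vdso_data(),
                                            clock, ts);
  }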
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lore.kernel.org/r/abf97996602ef07223fec30c005df78e5ed41b2e.1580399657.git.christophe.leroy@c-s.fr
Link: https://lkml.kernel.org/r/20200207124403.965789141@linutronix.de
|
|
On powerpc/32, GCC (8.1) generates pretty bad code for the ns >>= vd->shift
operation, given that the shift is always <= 32 and the upper part of the
result is likely to be zero. GCC makes the reverse assumptions, considering
the shift to be likely >= 32 and the upper part to be likely nonzero.
unsigned long long shift(unsigned long long x, unsigned char s)
{
	return x >> s;
}
results in:
00000018 <shift>:
18: 35 25 ff e0 addic. r9,r5,-32
1c: 41 80 00 10 blt 2c <shift+0x14>
20: 7c 64 4c 30 srw r4,r3,r9
24: 38 60 00 00 li r3,0
28: 4e 80 00 20 blr
2c: 54 69 08 3c rlwinm r9,r3,1,0,30
30: 21 45 00 1f subfic r10,r5,31
34: 7c 84 2c 30 srw r4,r4,r5
38: 7d 29 50 30 slw r9,r9,r10
3c: 7c 63 2c 30 srw r3,r3,r5
40: 7d 24 23 78 or r4,r9,r4
44: 4e 80 00 20 blr
Even when forcing the shift to be smaller than 32 with an &= 31, it still
considers the shift as likely >= 32.
Move the default shift implementation into an inline which can be redefined
in architecture code via a macro.
[ tglx: Made the shift argument u32 and removed the __arch prefix ]
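A sketch of the resulting default, which architecture code can replace by
defining the macro:

  #ifndef vdso_shift_ns
  static __always_inline u64 vdso_shift_ns(u64 ns, u32 shift)
  {
          return ns >> shift;
  }
  #endif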
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lore.kernel.org/r/b3d449de856982ed060a71e6ace8eeca4654e685.1580399657.git.christophe.leroy@c-s.fr
Link: https://lkml.kernel.org/r/20200207124403.857649978@linutronix.de
|
|
Some architectures have a fixed clocksource which is known at compile time
and cannot be replaced or disabled at runtime, e.g. timebase on
PowerPC. For such cases the clock mode check in the VDSO code is pointless.
Move the check for a VDSO capable clocksource into an inline function and
allow architectures to redefine it via a macro.
[ tglx: Removed the #ifdef mess ]
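A sketch of the default, assuming the clock mode naming of this series:

  #ifndef vdso_clocksource_ok
  static inline bool vdso_clocksource_ok(const struct vdso_data *vd)
  {
          return vd->clock_mode != VDSO_CLOCKMODE_NONE;
  }
  #endif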
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lkml.kernel.org/r/20200207124403.748756829@linutronix.de
|
|
Move the time namespace indicator clock mode to the other ones for
consistency's sake.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lkml.kernel.org/r/20200207124403.656097274@linutronix.de
|
|
Now that all architectures are converted to use the generic storage the
helpers and conditionals can be removed.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lkml.kernel.org/r/20200207124403.470699892@linutronix.de
|
|
All architectures which use the generic VDSO code have their own storage
for the VDSO clock mode. That's pointless and just requires duplicate code.
Provide generic storage for it. The new Kconfig symbol is intermediate and
will be removed once all architectures are converted over.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lkml.kernel.org/r/20200207124403.028046322@linutronix.de
|
|
If the architecture knows at compile time that there is no VDSO capable
clocksource supported it makes sense to optimize the guts of the high
resolution parts of the VDSO out at build time. Add a helper function to
check whether the VDSO should be high resolution capable and provide a stub
which can be overridden by an architecture.
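The stub simply reports high resolution capability; an architecture that
knows better overrides it and the compiler drops the hres code:

  #ifndef __arch_vdso_hres_capable
  static inline bool __arch_vdso_hres_capable(void)
  {
          return true;
  }
  #endif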
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://lkml.kernel.org/r/20200207124402.530143168@linutronix.de
|
|
Only perform READ_ONCE(vd[CS_HRES_COARSE].hrtimer_res) for
HRES and RAW clocks.
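A sketch of the reworked lookup in __cvdso_clock_getres_common(), assuming
the existing LOW_RES_NSEC constant for the coarse clocks:

  if (msk & (VDSO_HRES | VDSO_RAW))
          ns = READ_ONCE(vd[CS_HRES_COARSE].hrtimer_res);
  else if (msk & VDSO_COARSE)
          ns = LOW_RES_NSEC;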
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/7ac2f0d21652f95e2bbdfa6bd514ae6c7caf53ab.1579196675.git.christophe.leroy@c-s.fr
|
|
To support time namespaces in the vdso with a minimal impact on regular non
time namespace affected tasks, the namespace handling needs to be hidden in
a slow path.
The most obvious place is vdso_seq_begin(). If a task belongs to a time
namespace then the VVAR page which contains the system wide vdso data is
replaced with a namespace specific page which has the same layout as the
VVAR page. That page has vdso_data->seq set to 1 to enforce the slow path
and vdso_data->clock_mode set to VCLOCK_TIMENS to enforce the time
namespace handling path.
The extra check in the case that vdso_data->seq is odd, i.e. a concurrent
update of the vdso data is in progress, does not really affect regular
tasks which are not part of a time namespace, as the task is spin waiting
for the update to finish and vdso_data->seq to become even again.
If a time namespace task hits that code path, it invokes the corresponding
time getter function which retrieves the real VVAR page, reads host time
and then adds the offset for the requested clock which is stored in the
special VVAR page.
If VDSO time namespace support is disabled the whole magic is compiled out.
Initial testing shows that the disabled case is almost identical to the
host case which does not take the slow timens path. With the special timens
page installed the performance hit is constant time and in the range of
5-7%.
For the vdso functions which are not using the sequence count an
unconditional check for vdso_data->clock_mode is added which switches to
the real vdso when the clock_mode is VCLOCK_TIMENS.
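A conceptual sketch of the open coded sequence count begin in do_hres(),
per the description above (not a verbatim excerpt):

  do {
          /* The timens VVAR page has seq == 1, forcing this path. */
          while (unlikely((seq = READ_ONCE(vd->seq)) & 1)) {
                  if (IS_ENABLED(CONFIG_TIME_NS) &&
                      vd->clock_mode == VCLOCK_TIMENS)
                          return do_hres_timens(vd, clk, ts);
                  cpu_relax();
          }
          smp_rmb();

          /* ... regular time read ... */
  } while (unlikely(vdso_read_retry(vd, seq)));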
[avagin: Make do_hres_timens() work with raw clocks too: choose vdso_data
pointer by CS_RAW offset.]
Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20191112012724.250792-21-dima@arista.com
|
|
Performance numbers for Intel(R) Core(TM) i5-6300U CPU @ 2.40GHz
(more clock_gettime() cycles is better):
clock | before | after | diff
----------------------------------------------------------
monotonic | 153222105 | 166775025 | 8.8%
monotonic-coarse | 671557054 | 691513017 | 3.0%
monotonic-raw | 147116067 | 161057395 | 9.5%
boottime | 153446224 | 166962668 | 9.1%
The improvement for arm64 for monotonic and boottime is around 3.5%.
clock | before | after | diff
==================================================
monotonic 17326692 17951770 3.6%
monotonic-coarse 43624027 44215292 1.3%
monotonic-raw 17541809 17554932 0.1%
boottime 17334982 17954361 3.5%
[ tglx: Avoid the goto ]
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20191112012724.250792-3-dima@arista.com
|
|
VDSO_HRES and VDSO_RAW clocks are handled the same way.
Avoid the code duplication.
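A sketch of the consolidated dispatch: both modes end up in do_hres() and
only the vdso_data selection differs:

  if (likely(msk & VDSO_HRES))
          vd = &vd[CS_HRES_COARSE];
  else if (msk & VDSO_COARSE)
          return do_coarse(&vd[CS_HRES_COARSE], clock, ts);
  else if (msk & VDSO_RAW)
          vd = &vd[CS_RAW];
  else
          return -1;

  return do_hres(vd, clock, ts);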
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/fdf1a968a8f7edd61456f1689ac44082ebb19c15.1577111367.git.christophe.leroy@c-s.fr
|
|
do_coarse() is similar to do_hres() except that it never fails.
Change its type to int instead of void and let it always return success (0)
to simplify the call site.
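A sketch of the simplified call site after the change:

  if (msk & VDSO_HRES)
          return do_hres(&vd[CS_HRES_COARSE], clock, ts);
  else if (msk & VDSO_COARSE)
          return do_coarse(&vd[CS_HRES_COARSE], clock, ts); /* returns 0 */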
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/21e8afa38c02ca8672c2690307383507fe63b454.1577111367.git.christophe.leroy@c-s.fr
|
|
Since all the architectures that support the generic vDSO library have
been converted to support the 32 bit fallbacks, it is no longer required
to check the return value of __cvdso_clock_get*time32_common() before
updating the old_timespec fields.
Remove the related checks from the generic vdso library.
References: c60a32ea4f45 ("lib/vdso/32: Provide legacy syscall fallbacks")
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20190830135902.20861-6-vincenzo.frascino@arm.com
|
|
VDSO_HAS_32BIT_FALLBACK was introduced to address a regression which
caused seccomp to deny applications access to clock_gettime64() and
clock_getres64() because they are not enabled in the existing filters.
The purpose of VDSO_HAS_32BIT_FALLBACK was to simplify the conditional
implementation of __cvdso_clock_get*time32() variants.
Now that all the architectures that support the generic vDSO library
have been converted to support the 32 bit fallbacks, the conditional
can be removed.
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20190830135902.20861-5-vincenzo.frascino@arm.com
References: c60a32ea4f45 ("lib/vdso/32: Provide legacy syscall fallbacks")
|