path: root/mm/nommu.c
authorJoerg Roedel <>2020-03-21 18:22:41 -0700
committerLinus Torvalds <>2020-03-21 18:56:06 -0700
commit763802b53a427ed3cbd419dbba255c414fdd9e7c (patch)
treeb6dc4f611e81120ace08263b06a014bda53f0f53 /mm/nommu.c
parent0715e6c516f106ed553828a671d30ad9a3431536 (diff)
x86/mm: split vmalloc_sync_all()
Commit 3f8fd02b1bf1 ("mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()") introduced a call to vmalloc_sync_all() in the vunmap() code-path. While this change was necessary to maintain correctness on x86-32-pae kernels, it also adds additional cycles for architectures that don't need it.

Specifically on x86-64 with CONFIG_VMAP_STACK=y some people reported severe performance regressions in micro-benchmarks because it now also calls the x86-64 implementation of vmalloc_sync_all() on vunmap(). But the vmalloc_sync_all() implementation on x86-64 is only needed for newly created mappings.

To avoid the unnecessary work on x86-64 and to gain the performance back, split up vmalloc_sync_all() into two functions:

	* vmalloc_sync_mappings(), and
	* vmalloc_sync_unmappings()

Most call-sites to vmalloc_sync_all() only care about new mappings being synchronized. The only exception is the new call-site added in the above mentioned commit.

Shile Zhang directed us to a report of an 80% regression in reaim throughput.

Fixes: 3f8fd02b1bf1 ("mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()")
Reported-by: kernel test robot <>
Reported-by: Shile Zhang <>
Signed-off-by: Joerg Roedel <>
Signed-off-by: Andrew Morton <>
Tested-by: Borislav Petkov <>
Acked-by: Rafael J. Wysocki <>	[GHES]
Cc: Dave Hansen <>
Cc: Andy Lutomirski <>
Cc: Peter Zijlstra <>
Cc: Thomas Gleixner <>
Cc: Ingo Molnar <>
Cc: <>
Link:
Link:
Link:
Signed-off-by: Linus Torvalds <>
Diffstat (limited to 'mm/nommu.c')
1 file changed, 7 insertions, 3 deletions
diff --git a/mm/nommu.c b/mm/nommu.c
index bd2b4e5ef144..318df4e236c9 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -370,10 +370,14 @@ void vm_unmap_aliases(void)
 }
 
 /*
- * Implement a stub for vmalloc_sync_all() if the architecture chose not to
- * have one.
+ * Implement a stub for vmalloc_sync_[un]mapping() if the architecture
+ * chose not to have one.
  */
-void __weak vmalloc_sync_all(void)
+void __weak vmalloc_sync_mappings(void)
+{
+}
+
+void __weak vmalloc_sync_unmappings(void)
 {
 }
 