path: root/mm
diff options
authorJoonsoo Kim <>2016-03-17 14:19:29 -0700
committerLinus Torvalds <>2016-03-17 15:09:34 -0700
commit95813b8faa0cd315f61a8b9d9c523792370b693e (patch)
treecc60847b726877919d3da68f7af8b32f8a1d8d2a /mm
parentfe896d1878949ea92ba547587bc3075cc688fb8f (diff)
mm/page_ref: add tracepoint to track down page reference manipulation
CMA allocation should be guaranteed to succeed by definition, but, unfortunately, it would be failed sometimes. It is hard to track down the problem, because it is related to page reference manipulation and we don't have any facility to analyze it. This patch adds tracepoints to track down page reference manipulation. With it, we can find exact reason of failure and can fix the problem. Following is an example of tracepoint output. (note: this example is stale version that printing flags as the number. Recent version will print it as human readable string.) <...>-9018 [004] 92.678375: page_ref_set: pfn=0x17ac9 flags=0x0 count=1 mapcount=0 mapping=(nil) mt=4 val=1 <...>-9018 [004] 92.678378: kernel_stack: => get_page_from_freelist (ffffffff81176659) => __alloc_pages_nodemask (ffffffff81176d22) => alloc_pages_vma (ffffffff811bf675) => handle_mm_fault (ffffffff8119e693) => __do_page_fault (ffffffff810631ea) => trace_do_page_fault (ffffffff81063543) => do_async_page_fault (ffffffff8105c40a) => async_page_fault (ffffffff817581d8) [snip] <...>-9018 [004] 92.678379: page_ref_mod: pfn=0x17ac9 flags=0x40048 count=2 mapcount=1 mapping=0xffff880015a78dc1 mt=4 val=1 [snip] ... ... <...>-9131 [001] 93.174468: test_pages_isolated: start_pfn=0x17800 end_pfn=0x17c00 fin_pfn=0x17ac9 ret=fail [snip] <...>-9018 [004] 93.174843: page_ref_mod_and_test: pfn=0x17ac9 flags=0x40068 count=0 mapcount=0 mapping=0xffff880015a78dc1 mt=4 val=-1 ret=1 => release_pages (ffffffff8117c9e4) => free_pages_and_swap_cache (ffffffff811b0697) => tlb_flush_mmu_free (ffffffff81199616) => tlb_finish_mmu (ffffffff8119a62c) => exit_mmap (ffffffff811a53f7) => mmput (ffffffff81073f47) => do_exit (ffffffff810794e9) => do_group_exit (ffffffff81079def) => SyS_exit_group (ffffffff81079e74) => entry_SYSCALL_64_fastpath (ffffffff817560b6) This output shows that problem comes from exit path. In exit path, to improve performance, pages are not freed immediately. They are gathered and processed by batch. During this process, migration cannot be possible and CMA allocation is failed. This problem is hard to find without this page reference tracepoint facility. Enabling this feature bloat kernel text 30 KB in my configuration. text data bss dec hex filename 12127327 2243616 1507328 15878271 f2487f vmlinux_disabled 12157208 2258880 1507328 15923416 f2f8d8 vmlinux_enabled Note that, due to header file dependency problem between mm.h and tracepoint.h, this feature has to open code the static key functions for tracepoints. Proposed by Steven Rostedt in following link. [ crypto/async_pq: use __free_page() instead of put_page()] [ fix build failure for xtensa] [ tweak Kconfig text, per Vlastimil] Signed-off-by: Joonsoo Kim <> Acked-by: Michal Nazarewicz <> Acked-by: Vlastimil Babka <> Cc: Minchan Kim <> Cc: Mel Gorman <> Cc: "Kirill A. Shutemov" <> Cc: Sergey Senozhatsky <> Acked-by: Steven Rostedt <> Signed-off-by: Arnd Bergmann <> Signed-off-by: Andrew Morton <> Signed-off-by: Linus Torvalds <>
Diffstat (limited to 'mm')
3 files changed, 68 insertions, 0 deletions
diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
index 5c50b238b770..22f4cd96acb0 100644
--- a/mm/Kconfig.debug
+++ b/mm/Kconfig.debug
@@ -79,3 +79,16 @@ config PAGE_POISONING_ZERO
Enabling page poisoning with this option will disable hibernation
If unsure, say N
+ bool
+ bool "Enable tracepoint to track down page reference manipulation"
+ depends on DEBUG_KERNEL
+ depends on TRACEPOINTS
+ ---help---
+ This is a feature to add tracepoint for tracking down page reference
+ manipulation. This tracking is useful to diagnose functional failure
+ due to migration failures caused by page reference mismatches. Be
+ careful when enabling this feature because it adds about 30 KB to the
+ kernel code. However the runtime performance overhead is virtually
+ nil until the tracepoints are actually enabled.
diff --git a/mm/Makefile b/mm/Makefile
index cfdd481d27a5..6da300a1414b 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -81,3 +81,4 @@ obj-$(CONFIG_CMA_DEBUGFS) += cma_debug.o
obj-$(CONFIG_USERFAULTFD) += userfaultfd.o
obj-$(CONFIG_IDLE_PAGE_TRACKING) += page_idle.o
obj-$(CONFIG_FRAME_VECTOR) += frame_vector.o
+obj-$(CONFIG_DEBUG_PAGE_REF) += debug_page_ref.o
diff --git a/mm/debug_page_ref.c b/mm/debug_page_ref.c
new file mode 100644
index 000000000000..1aef3d562e52
--- /dev/null
+++ b/mm/debug_page_ref.c
@@ -0,0 +1,54 @@
+#include <linux/mm_types.h>
+#include <linux/tracepoint.h>
+#include <trace/events/page_ref.h>
+void __page_ref_set(struct page *page, int v)
+ trace_page_ref_set(page, v);
+void __page_ref_mod(struct page *page, int v)
+ trace_page_ref_mod(page, v);
+void __page_ref_mod_and_test(struct page *page, int v, int ret)
+ trace_page_ref_mod_and_test(page, v, ret);
+void __page_ref_mod_and_return(struct page *page, int v, int ret)
+ trace_page_ref_mod_and_return(page, v, ret);
+void __page_ref_mod_unless(struct page *page, int v, int u)
+ trace_page_ref_mod_unless(page, v, u);
+void __page_ref_freeze(struct page *page, int v, int ret)
+ trace_page_ref_freeze(page, v, ret);
+void __page_ref_unfreeze(struct page *page, int v)
+ trace_page_ref_unfreeze(page, v);