author: Jason Gunthorpe, 2020-12-14 19:05:44 -0800
committer: Linus Torvalds, 2020-12-15 12:13:39 -0800
commit: 57efa1fe5957694fa541c9062de0a127f0b9acb0
tree: 4d38c9c0560e072e5cc36f925e89968bf37d4f43 /mm/init-mm.c
parent: c28b1fc70390df32e29991eedd52bd86e7aba080
mm/gup: prevent gup_fast from racing with COW during fork

Since commit 70e806e4e645 ("mm: Do early cow for pinned pages during
fork() for ptes") pages under a FOLL_PIN will not be write protected
during COW for fork. This means that pages returned from
pin_user_pages(FOLL_WRITE) should not become write protected while the
pin is active.

However, there is a small race where get_user_pages_fast(FOLL_PIN) can
establish a FOLL_PIN at the same time copy_present_page() is write
protecting it:

        CPU 0                             CPU 1
   get_user_pages_fast()
    internal_get_user_pages_fast()
                                       copy_page_range()
                                         pte_alloc_map_lock()
                                           copy_present_page()
                                             atomic_read(has_pinned) == 0
                                             page_maybe_dma_pinned() == false
     atomic_set(has_pinned, 1);
     gup_pgd_range()
      gup_pte_range()
       pte_t pte = gup_get_pte(ptep)
       pte_access_permitted(pte)
       try_grab_compound_head()
                                             pte = pte_wrprotect(pte)
                                             set_pte_at();
                                         pte_unmap_unlock()
      // GUP now returns with a write protected page

The first attempt to resolve this by using the write protect caused
problems (and was missing a barrier), see commit f3c64eda3e50 ("mm:
avoid early COW write protect games during fork()").

Instead wrap copy_p4d_range() with the write side of a seqcount and
check the read side around gup_pgd_range(). If there is a collision then
get_user_pages_fast() fails and falls back to slow GUP.

Slow GUP is safe against this race because copy_page_range() is only
called while holding the exclusive side of the mmap_lock on the src
mm_struct.

[coding style fixes]

Fixes: f3c64eda3e50 ("mm: avoid early COW write protect games during fork()")
Signed-off-by: Jason Gunthorpe
Suggested-by: Linus Torvalds
Reviewed-by: John Hubbard
Reviewed-by: Jan Kara
Reviewed-by: Peter Xu
Acked-by: "Ahmed S. Darwish" [seqcount_t parts]
Cc: Andrea Arcangeli
Cc: "Aneesh Kumar K.V"
Cc: Christoph Hellwig
Cc: Hugh Dickins
Cc: Jann Horn
Cc: Kirill Shutemov
Cc: Kirill Tkhai
Cc: Leon Romanovsky
Cc: Michal Hocko
Cc: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
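
The seqcount scheme described in the commit message can be modeled in userspace. The following is a minimal sketch, not the kernel code: every name in it (struct seqcount_model, model_write_begin(), model_gup_fast(), and so on) is hypothetical, it uses sequentially consistent C11 atomics where the kernel uses lighter barriers, and the during_walk hook is an assumption added only so that a "concurrent" fork can be simulated from a single thread. It illustrates the idea that the fast path samples the counter, does its lockless work, and reports failure (so the caller can fall back to the slow path) if a writer was active or ran in between.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a sequence counter. */
struct seqcount_model {
	atomic_uint sequence;	/* odd while a writer is inside its section */
};

/* Write side: the counter becomes odd on entry ... */
static void model_write_begin(struct seqcount_model *s)
{
	atomic_fetch_add(&s->sequence, 1);
}

/* ... and even again on exit. */
static void model_write_end(struct seqcount_model *s)
{
	unsigned int prev = atomic_fetch_add(&s->sequence, 1);

	assert(prev & 1);	/* must pair with model_write_begin() */
}

/* Models fork write protecting pages inside the write section. */
static void model_fork_wrprotect(struct seqcount_model *s)
{
	model_write_begin(s);
	/* the page table copy and write protection would happen here */
	model_write_end(s);
}

/*
 * Models the fast path. Returns the number of pages "pinned", or 0 to
 * tell the caller to fall back to the slow path. during_walk, if not
 * NULL, runs in the middle of the walk to simulate a racing writer.
 */
static int model_gup_fast(struct seqcount_model *s, int nr_pages,
			  void (*during_walk)(struct seqcount_model *))
{
	unsigned int seq = atomic_load(&s->sequence);
	int pinned;

	if (seq & 1)
		return 0;	/* a writer is active right now */

	pinned = nr_pages;	/* stand-in for the lockless page table walk */
	if (during_walk)
		during_walk(s);

	if (atomic_load(&s->sequence) != seq)
		return 0;	/* collision: drop the pins and fall back */

	return pinned;
}
```

With a zero-initialized counter, model_gup_fast(&s, 4, NULL) succeeds, while passing model_fork_wrprotect as the hook makes the same call return 0, which corresponds to the fall back to slow GUP.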
Diffstat (limited to 'mm/init-mm.c')
1 files changed, 1 insertions, 0 deletions
diff --git a/mm/init-mm.c b/mm/init-mm.c
index 3a613c85f9ed..153162669f80 100644
--- a/mm/init-mm.c
+++ b/mm/init-mm.c
@@ -31,6 +31,7 @@ struct mm_struct init_mm = {
 	.pgd = swapper_pg_dir,
 	.mm_users = ATOMIC_INIT(2),
 	.mm_count = ATOMIC_INIT(1),
+	.write_protect_seq = SEQCNT_ZERO(init_mm.write_protect_seq),
 	.page_table_lock = __SPIN_LOCK_UNLOCKED(init_mm.page_table_lock),
 	.arg_lock = __SPIN_LOCK_UNLOCKED(init_mm.arg_lock),