From 15a64f5a8870b5610b616a4aa753262dfaa5d76e Mon Sep 17 00:00:00 2001 From: Claudio Imbrenda Date: Thu, 24 Jun 2021 18:39:36 -0700 Subject: mm/vmalloc: add vmalloc_no_huge Patch series "mm: add vmalloc_no_huge and use it", v4. Add vmalloc_no_huge() and export it, so modules can allocate memory with small pages. Use the newly added vmalloc_no_huge() in KVM on s390 to get around a hardware limitation. This patch (of 2): Commit 121e6f3258fe3 ("mm/vmalloc: hugepage vmalloc mappings") added support for hugepage vmalloc mappings, it also added the flag VM_NO_HUGE_VMAP for __vmalloc_node_range to request the allocation to be performed with 0-order non-huge pages. This flag is not accessible when calling vmalloc, the only option is to call directly __vmalloc_node_range, which is not exported. This means that a module can't vmalloc memory with small pages. Case in point: KVM on s390x needs to vmalloc a large area, and it needs to be mapped with non-huge pages, because of a hardware limitation. This patch adds the function vmalloc_no_huge, which works like vmalloc, but it is guaranteed to always back the mapping using small pages. This new function is exported, therefore it is usable by modules. [akpm@linux-foundation.org: whitespace fixes, per Christoph] Link: https://lkml.kernel.org/r/20210614132357.10202-1-imbrenda@linux.ibm.com Link: https://lkml.kernel.org/r/20210614132357.10202-2-imbrenda@linux.ibm.com Fixes: 121e6f3258fe3 ("mm/vmalloc: hugepage vmalloc mappings") Signed-off-by: Claudio Imbrenda Reviewed-by: Uladzislau Rezki (Sony) Acked-by: Nicholas Piggin Reviewed-by: David Hildenbrand Acked-by: David Rientjes Cc: Uladzislau Rezki (Sony) Cc: Catalin Marinas Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Christoph Hellwig Cc: Cornelia Huck Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/vmalloc.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include') diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h index 4d668abb6391..bfaaf0b6fa76 100644 --- a/include/linux/vmalloc.h +++ b/include/linux/vmalloc.h @@ -135,6 +135,7 @@ extern void *__vmalloc_node_range(unsigned long size, unsigned long align, const void *caller); void *__vmalloc_node(unsigned long size, unsigned long align, gfp_t gfp_mask, int node, const void *caller); +void *vmalloc_no_huge(unsigned long size); extern void vfree(const void *addr); extern void vfree_atomic(const void *addr); -- cgit v1.2.3 From fe19bd3dae3d15d2fbfdb3de8839a6ea0fe94264 Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Thu, 24 Jun 2021 18:39:52 -0700 Subject: mm, futex: fix shared futex pgoff on shmem huge page If more than one futex is placed on a shmem huge page, it can happen that waking the second wakes the first instead, and leaves the second waiting: the key's shared.pgoff is wrong. When 3.11 commit 13d60f4b6ab5 ("futex: Take hugepages into account when generating futex_key"), the only shared huge pages came from hugetlbfs, and the code added to deal with its exceptional page->index was put into hugetlb source. Then that was missed when 4.8 added shmem huge pages. page_to_pgoff() is what others use for this nowadays: except that, as currently written, it gives the right answer on hugetlbfs head, but nonsense on hugetlbfs tails. Fix that by calling hugetlbfs-specific hugetlb_basepage_index() on PageHuge tails as well as on head. Yes, it's unconventional to declare hugetlb_basepage_index() there in pagemap.h, rather than in hugetlb.h; but I do not expect anything but page_to_pgoff() ever to need it. [akpm@linux-foundation.org: give hugetlb_basepage_index() prototype the correct scope] Link: https://lkml.kernel.org/r/b17d946b-d09-326e-b42a-52884c36df32@google.com Fixes: 800d8c63b2e9 ("shmem: add huge pages support") Reported-by: Neel Natu Signed-off-by: Hugh Dickins Reviewed-by: Matthew Wilcox (Oracle) Acked-by: Thomas Gleixner Cc: "Kirill A. Shutemov" Cc: Zhang Yi Cc: Mel Gorman Cc: Mike Kravetz Cc: Ingo Molnar Cc: Peter Zijlstra Cc: Darren Hart Cc: Davidlohr Bueso Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- include/linux/hugetlb.h | 16 ---------------- include/linux/pagemap.h | 13 +++++++------ 2 files changed, 7 insertions(+), 22 deletions(-) (limited to 'include') diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 6504346a1947..3c0117656745 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -741,17 +741,6 @@ static inline int hstate_index(struct hstate *h) return h - hstates; } -pgoff_t __basepage_index(struct page *page); - -/* Return page->index in PAGE_SIZE units */ -static inline pgoff_t basepage_index(struct page *page) -{ - if (!PageCompound(page)) - return page->index; - - return __basepage_index(page); -} - extern int dissolve_free_huge_page(struct page *page); extern int dissolve_free_huge_pages(unsigned long start_pfn, unsigned long end_pfn); @@ -988,11 +977,6 @@ static inline int hstate_index(struct hstate *h) return 0; } -static inline pgoff_t basepage_index(struct page *page) -{ - return page->index; -} - static inline int dissolve_free_huge_page(struct page *page) { return 0; diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h index e89df447fae3..0f1b34dbf3a2 100644 --- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -516,7 +516,7 @@ static inline struct page *read_mapping_page(struct address_space *mapping, } /* - * Get index of the page with in radix-tree + * Get index of the page within radix-tree (but not for hugetlb pages). * (TODO: remove once hugetlb pages will have ->index in PAGE_SIZE) */ static inline pgoff_t page_to_index(struct page *page) @@ -535,15 +535,16 @@ static inline pgoff_t page_to_index(struct page *page) return pgoff; } +extern pgoff_t hugetlb_basepage_index(struct page *page); + /* - * Get the offset in PAGE_SIZE. - * (TODO: hugepage should have ->index in PAGE_SIZE) + * Get the offset in PAGE_SIZE (even for hugetlb pages). + * (TODO: hugetlb pages should have ->index in PAGE_SIZE) */ static inline pgoff_t page_to_pgoff(struct page *page) { - if (unlikely(PageHeadHuge(page))) - return page->index << compound_order(page); - + if (unlikely(PageHuge(page))) + return hugetlb_basepage_index(page); return page_to_index(page); } -- cgit v1.2.3