summaryrefslogtreecommitdiff
path: root/arch/powerpc/mm/init-common.c
AgeCommit message (Collapse)Author
2024-10-23book3s64/radix: Refactoring common kfence related functionsRitesh Harjani (IBM)
Both radix and hash on book3s requires to detect if kfence early init is enabled or not. Hash needs to disable kfence if early init is not enabled because with kfence the linear map is mapped using PAGE_SIZE rather than 16M mapping. We don't support multiple page sizes for slb entry used for kernel linear map in book3s64. This patch refactors out the common functions required to detect kfence early init is enabled or not. Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://patch.msgid.link/f4a787224fbe5bb787158ace579780c0257f6602.1729271995.git.ritesh.list@gmail.com
2024-08-12powerpc/mm: Fix size of allocated PGDIRChristophe Leroy
Commit 6b0e82791bd0 ("powerpc/e500: switch to 64 bits PGD on 85xx (32 bits)") increased the size of PGD entries but failed to increase the PGD directory. Use the size of pgd_t instead of the size of pointers to calculate the allocated size. Reported-by: Guenter Roeck <linux@roeck-us.net> Fixes: 6b0e82791bd0 ("powerpc/e500: switch to 64 bits PGD on 85xx (32 bits)") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/1cdaacb391cbd3e0240f0e0faf691202874e9422.1723109462.git.christophe.leroy@csgroup.eu
2024-07-21Merge tag 'mm-stable-2024-07-21-14-50' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - In the series "mm: Avoid possible overflows in dirty throttling" Jan Kara addresses a couple of issues in the writeback throttling code. These fixes are also targetted at -stable kernels. - Ryusuke Konishi's series "nilfs2: fix potential issues related to reserved inodes" does that. This should actually be in the mm-nonmm-stable tree, along with the many other nilfs2 patches. My bad. - More folio conversions from Kefeng Wang in the series "mm: convert to folio_alloc_mpol()" - Kemeng Shi has sent some cleanups to the writeback code in the series "Add helper functions to remove repeated code and improve readability of cgroup writeback" - Kairui Song has made the swap code a little smaller and a little faster in the series "mm/swap: clean up and optimize swap cache index". - In the series "mm/memory: cleanly support zeropage in vm_insert_page*(), vm_map_pages*() and vmf_insert_mixed()" David Hildenbrand has reworked the rather sketchy handling of the use of the zeropage in MAP_SHARED mappings. I don't see any runtime effects here - more a cleanup/understandability/maintainablity thing. - Dev Jain has improved selftests/mm/va_high_addr_switch.c's handling of higher addresses, for aarch64. The (poorly named) series is "Restructure va_high_addr_switch". - The core TLB handling code gets some cleanups and possible slight optimizations in Bang Li's series "Add update_mmu_tlb_range() to simplify code". - Jane Chu has improved the handling of our fake-an-unrecoverable-memory-error testing feature MADV_HWPOISON in the series "Enhance soft hwpoison handling and injection". - Jeff Johnson has sent a billion patches everywhere to add MODULE_DESCRIPTION() to everything. Some landed in this pull. - In the series "mm: cleanup MIGRATE_SYNC_NO_COPY mode", Kefeng Wang has simplified migration's use of hardware-offload memory copying. - Yosry Ahmed performs more folio API conversions in his series "mm: zswap: trivial folio conversions". - In the series "large folios swap-in: handle refault cases first", Chuanhua Han inches us forward in the handling of large pages in the swap code. This is a cleanup and optimization, working toward the end objective of full support of large folio swapin/out. - In the series "mm,swap: cleanup VMA based swap readahead window calculation", Huang Ying has contributed some cleanups and a possible fixlet to his VMA based swap readahead code. - In the series "add mTHP support for anonymous shmem" Baolin Wang has taught anonymous shmem mappings to use multisize THP. By default this is a no-op - users must opt in vis sysfs controls. Dramatic improvements in pagefault latency are realized. - David Hildenbrand has some cleanups to our remaining use of page_mapcount() in the series "fs/proc: move page_mapcount() to fs/proc/internal.h". - David also has some highmem accounting cleanups in the series "mm/highmem: don't track highmem pages manually". - Build-time fixes and cleanups from John Hubbard in the series "cleanups, fixes, and progress towards avoiding "make headers"". - Cleanups and consolidation of the core pagemap handling from Barry Song in the series "mm: introduce pmd|pte_needs_soft_dirty_wp helpers and utilize them". - Lance Yang's series "Reclaim lazyfree THP without splitting" has reduced the latency of the reclaim of pmd-mapped THPs under fairly common circumstances. A 10x speedup is seen in a microbenchmark. It does this by punting to aother CPU but I guess that's a win unless all CPUs are pegged. - hugetlb_cgroup cleanups from Xiu Jianfeng in the series "mm/hugetlb_cgroup: rework on cftypes". - Miaohe Lin's series "Some cleanups for memory-failure" does just that thing. - Someone other than SeongJae has developed a DAMON feature in Honggyu Kim's series "DAMON based tiered memory management for CXL memory". This adds DAMON features which may be used to help determine the efficiency of our placement of CXL/PCIe attached DRAM. - DAMON user API centralization and simplificatio work in SeongJae Park's series "mm/damon: introduce DAMON parameters online commit function". - In the series "mm: page_type, zsmalloc and page_mapcount_reset()" David Hildenbrand does some maintenance work on zsmalloc - partially modernizing its use of pageframe fields. - Kefeng Wang provides more folio conversions in the series "mm: remove page_maybe_dma_pinned() and page_mkclean()". - More cleanup from David Hildenbrand, this time in the series "mm/memory_hotplug: use PageOffline() instead of PageReserved() for !ZONE_DEVICE". It "enlightens memory hotplug more about PageOffline() pages" and permits the removal of some virtio-mem hacks. - Barry Song's series "mm: clarify folio_add_new_anon_rmap() and __folio_add_anon_rmap()" is a cleanup to the anon folio handling in preparation for mTHP (multisize THP) swapin. - Kefeng Wang's series "mm: improve clear and copy user folio" implements more folio conversions, this time in the area of large folio userspace copying. - The series "Docs/mm/damon/maintaier-profile: document a mailing tool and community meetup series" tells people how to get better involved with other DAMON developers. From SeongJae Park. - A large series ("kmsan: Enable on s390") from Ilya Leoshkevich does that. - David Hildenbrand sends along more cleanups, this time against the migration code. The series is "mm/migrate: move NUMA hinting fault folio isolation + checks under PTL". - Jan Kara has found quite a lot of strangenesses and minor errors in the readahead code. He addresses this in the series "mm: Fix various readahead quirks". - SeongJae Park's series "selftests/damon: test DAMOS tried regions and {min,max}_nr_regions" adds features and addresses errors in DAMON's self testing code. - Gavin Shan has found a userspace-triggerable WARN in the pagecache code. The series "mm/filemap: Limit page cache size to that supported by xarray" addresses this. The series is marked cc:stable. - Chengming Zhou's series "mm/ksm: cmp_and_merge_page() optimizations and cleanup" cleans up and slightly optimizes KSM. - Roman Gushchin has separated the memcg-v1 and memcg-v2 code - lots of code motion. The series (which also makes the memcg-v1 code Kconfigurable) are "mm: memcg: separate legacy cgroup v1 code and put under config option" and "mm: memcg: put cgroup v1-specific memcg data under CONFIG_MEMCG_V1" - Dan Schatzberg's series "Add swappiness argument to memory.reclaim" adds an additional feature to this cgroup-v2 control file. - The series "Userspace controls soft-offline pages" from Jiaqi Yan permits userspace to stop the kernel's automatic treatment of excessive correctable memory errors. In order to permit userspace to monitor and handle this situation. - Kefeng Wang's series "mm: migrate: support poison recover from migrate folio" teaches the kernel to appropriately handle migration from poisoned source folios rather than simply panicing. - SeongJae Park's series "Docs/damon: minor fixups and improvements" does those things. - In the series "mm/zsmalloc: change back to per-size_class lock" Chengming Zhou improves zsmalloc's scalability and memory utilization. - Vivek Kasireddy's series "mm/gup: Introduce memfd_pin_folios() for pinning memfd folios" makes the GUP code use FOLL_PIN rather than bare refcount increments. So these paes can first be moved aside if they reside in the movable zone or a CMA block. - Andrii Nakryiko has added a binary ioctl()-based API to /proc/pid/maps for much faster reading of vma information. The series is "query VMAs from /proc/<pid>/maps". - In the series "mm: introduce per-order mTHP split counters" Lance Yang improves the kernel's presentation of developer information related to multisize THP splitting. - Michael Ellerman has developed the series "Reimplement huge pages without hugepd on powerpc (8xx, e500, book3s/64)". This permits userspace to use all available huge page sizes. - In the series "revert unconditional slab and page allocator fault injection calls" Vlastimil Babka removes a performance-affecting and not very useful feature from slab fault injection. * tag 'mm-stable-2024-07-21-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (411 commits) mm/mglru: fix ineffective protection calculation mm/zswap: fix a white space issue mm/hugetlb: fix kernel NULL pointer dereference when migrating hugetlb folio mm/hugetlb: fix possible recursive locking detected warning mm/gup: clear the LRU flag of a page before adding to LRU batch mm/numa_balancing: teach mpol_to_str about the balancing mode mm: memcg1: convert charge move flags to unsigned long long alloc_tag: fix page_ext_get/page_ext_put sequence during page splitting lib: reuse page_ext_data() to obtain codetag_ref lib: add missing newline character in the warning message mm/mglru: fix overshooting shrinker memory mm/mglru: fix div-by-zero in vmpressure_calc_level() mm/kmemleak: replace strncpy() with strscpy() mm, page_alloc: put should_fail_alloc_page() back behing CONFIG_FAIL_PAGE_ALLOC mm, slab: put should_failslab() back behind CONFIG_SHOULD_FAILSLAB mm: ignore data-race in __swap_writepage hugetlbfs: ensure generic_hugetlb_get_unmapped_area() returns higher address than mmap_min_addr mm: shmem: rename mTHP shmem counters mm: swap_state: use folio_alloc_mpol() in __read_swap_cache_async() mm/migrate: putback split folios when numa hint migration fails ...
2024-07-12powerpc/mm: remove hugepd leftoversChristophe Leroy
All targets have now opted out of CONFIG_ARCH_HAS_HUGEPD so remove left over code. Link: https://lkml.kernel.org/r/39c0d0adee6790fc42cee9f458e05fb95136c3dd.1719928057.git.christophe.leroy@csgroup.eu Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Oscar Salvador <osalvador@suse.de> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-04powerpc/64s/radix/kfence: map __kfence_pool at page granularityHari Bathini
When KFENCE is enabled, total system memory is mapped at page level granularity. But in radix MMU mode, ~3GB additional memory is needed to map 100GB of system memory at page level granularity when compared to using 2MB direct mapping.This is not desired considering KFENCE is designed to be enabled in production kernels [1]. Mapping only the memory allocated for KFENCE pool at page granularity is sufficient to enable KFENCE support. So, allocate __kfence_pool during bootup and map it at page granularity instead of mapping all system memory at page granularity. Without patch: # cat /proc/meminfo MemTotal: 101201920 kB With patch: # cat /proc/meminfo MemTotal: 104483904 kB Note that enabling KFENCE at runtime is disabled for radix MMU for now, as it depends on the ability to split page table mappings and such APIs are not currently implemented for radix MMU. All kfence_test.c testcases passed with this patch. [1] https://lore.kernel.org/all/20201103175841.3495947-2-elver@google.com/ Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240701130021.578240-1-hbathini@linux.ibm.com
2023-12-13powerpc/mm: Fix null-pointer dereference in pgtable_cache_addKunwu Chan
kasprintf() returns a pointer to dynamically allocated memory which can be NULL upon failure. Ensure the allocation was successful by checking the pointer validity. Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231204023223.2447523-1-chentao@kylinos.cn
2021-12-09powerpc/kuep: Remove 'nosmep' boot time parameter except for book3s/64Christophe Leroy
Deactivating KUEP at boot time is unrelevant for PPC32 and BOOK3E/64. Remove it. It allows to refactor setup_kuep() via a __weak function that only PPC64s will overide for now. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Fix CONFIG_PPC_BOOKS_64 -> CONFIG_PPC_BOOK3S_64 typo] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4c36df18b41c988c4512f45d96220486adbe4c99.1634627931.git.christophe.leroy@csgroup.eu
2021-12-09Revert "powerpc: Inline setup_kup()"Christophe Leroy
This reverts commit 1791ebd131c46539b024c0f2ebf12b6c88a265b9. setup_kup() was inlined to manage conflict between PPC32 marking setup_{kuap/kuep}() __init and PPC64 not marking them __init. But in fact PPC32 has removed the __init mark for all but 8xx in order to properly handle SMP. In order to make setup_kup() grow a bit, revert the commit mentioned above but remove __init for 8xx as well so that we don't have to mark setup_kup() as __ref. Also switch the order so that KUAP is initialised before KUEP because on the 40x, KUEP will depend on the activation of KUAP. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7691088fd0994ee3c8db6298dc8c00259e3f6a7f.1634627931.git.christophe.leroy@csgroup.eu
2020-12-15powerpc: Inline setup_kup()Michael Ellerman
setup_kup() is used by both 64-bit and 32-bit code. However on 64-bit it must not be __init, because it's used for CPU hotplug, whereas on 32-bit it should be __init because it calls setup_kuap/kuep() which are __init. We worked around that problem in the past by marking it __ref, see commit 67d53f30e23e ("powerpc/mm: fix section mismatch for setup_kup()"). Marking it __ref basically just omits it from section mismatch checking, which can lead to bugs, and in fact it did, see commit 44b4c4450f8d ("powerpc/64s: Mark the kuap/kuep functions non __init") We can avoid all these problems by just making it static inline. Because all it does is call other functions, making it inline actually shrinks the 32-bit vmlinux by ~76 bytes. Make it __always_inline as pointed out by Christophe. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201214123011.311024-1-mpe@ellerman.id.au
2020-12-04powerpc/book3s64/kup: Check max key supported before enabling kupAneesh Kumar K.V
Don't enable KUEP/KUAP if we support less than or equal to 3 keys. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201202043854.76406-1-aneesh.kumar@linux.ibm.com
2020-06-09mm: reorder includes after introduction of linux/pgtable.hMike Rapoport
The replacement of <asm/pgrable.h> with <linux/pgtable.h> made the include of the latter in the middle of asm includes. Fix this up with the aid of the below script and manual adjustments here and there. import sys import re if len(sys.argv) is not 3: print "USAGE: %s <file> <header>" % (sys.argv[0]) sys.exit(1) hdr_to_move="#include <linux/%s>" % sys.argv[2] moved = False in_hdrs = False with open(sys.argv[1], "r") as f: lines = f.readlines() for _line in lines: line = _line.rstrip(' ') if line == hdr_to_move: continue if line.startswith("#include <linux/"): in_hdrs = True elif not moved and in_hdrs: moved = True print hdr_to_move print line Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-4-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09mm: introduce include/linux/pgtable.hMike Rapoport
The include/linux/pgtable.h is going to be the home of generic page table manipulation functions. Start with moving asm-generic/pgtable.h to include/linux/pgtable.h and make the latter include asm/pgtable.h. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-3-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-11-13powerpc: introduce kernstart_virt_addr to store the kernel baseJason Yan
Now the kernel base is a fixed value - KERNELBASE. To support KASLR, we need a variable to store the kernel base. Signed-off-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Diana Craciun <diana.craciun@nxp.com> Tested-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Scott Wood <oss@buserror.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-11-13powerpc: move memstart_addr and kernstart_addr to init-common.cJason Yan
These two variables are both defined in init_32.c and init_64.c. Move them to init-common.c and make them __ro_after_init. Signed-off-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Diana Craciun <diana.craciun@nxp.com> Tested-by: Diana Craciun <diana.craciun@nxp.com> Signed-off-by: Scott Wood <oss@buserror.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-05-30treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152Thomas Gleixner
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-06powerpc/mm: fix section mismatch for setup_kup()Christophe Leroy
commit b28c97505eb1 ("powerpc/64: Setup KUP on secondary CPUs") moved setup_kup() out of the __init section. As stated in that commit, "this is only for 64-bit". But this function is also used on PPC32, where the two functions called by setup_kup() are in the __init section, so setup_kup() has to either be kept in the __init section on PPC32 or marked __ref. This patch marks it __ref, it fixes the below build warnings. MODPOST vmlinux.o WARNING: vmlinux.o(.text+0x169ec): Section mismatch in reference from the function setup_kup() to the function .init.text:setup_kuep() The function setup_kup() references the function __init setup_kuep(). This is often because setup_kup lacks a __init annotation or the annotation of setup_kuep is wrong. WARNING: vmlinux.o(.text+0x16a04): Section mismatch in reference from the function setup_kup() to the function .init.text:setup_kuap() The function setup_kup() references the function __init setup_kuap(). This is often because setup_kup lacks a __init annotation or the annotation of setup_kuap is wrong. Fixes: b28c97505eb1 ("powerpc/64: Setup KUP on secondary CPUs") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-04-21powerpc/64: Setup KUP on secondary CPUsRussell Currey
Some platforms (i.e. Radix MMU) need per-CPU initialisation for KUP. Any platforms that only want to do KUP initialisation once globally can just check to see if they're running on the boot CPU, or check if whatever setup they need has already been performed. Note that this is only for 64-bit. Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-04-21powerpc: Add a framework for Kernel Userspace Access ProtectionChristophe Leroy
This patch implements a framework for Kernel Userspace Access Protection. Then subarches will have the possibility to provide their own implementation by providing setup_kuap() and allow/prevent_user_access(). Some platforms will need to know the area accessed and whether it is accessed from read, write or both. Therefore source, destination and size and handed over to the two functions. mpe: Rename to allow/prevent rather than unlock/lock, and add read/write wrappers. Drop the 32-bit code for now until we have an implementation for it. Add kuap to pt_regs for 64-bit as well as 32-bit. Don't split strings, use pr_crit_ratelimited(). Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-04-21powerpc: Add skeleton for Kernel Userspace Execution PreventionChristophe Leroy
This patch adds a skeleton for Kernel Userspace Execution Prevention. Then subarches implementing it have to define CONFIG_PPC_HAVE_KUEP and provide setup_kuep() function. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> [mpe: Don't split strings, use pr_crit_ratelimited()] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-04-21powerpc: Add framework for Kernel Userspace ProtectionChristophe Leroy
This patch adds a skeleton for Kernel Userspace Protection functionnalities like Kernel Userspace Access Protection and Kernel Userspace Execution Prevention The subsequent implementation of KUAP for radix makes use of a MMU feature in order to patch out assembly when KUAP is disabled or unsupported. This won't work unless there's an entry point for KUP support before the feature magic happens, so for PPC64 setup_kup() is called early in setup. On PPC32, feature_fixup() is done too early to allow the same. Suggested-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-04powerpc/mm: remove unnecessary test in pgtable_cache_init()Christophe Leroy
pgtable_cache_add() gracefully handles the case when a cache that size already exists by returning early with the following test: if (PGT_CACHE(shift)) return; /* Already have a cache of this size */ It is then not needed to test the existence of the cache before. Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-04powerpc/mm: fix a warning when a cache is common to PGD and hugepagesChristophe Leroy
While implementing TLB miss HW assistance on the 8xx, the following warning was encountered: [ 423.732965] WARNING: CPU: 0 PID: 345 at mm/slub.c:2412 ___slab_alloc.constprop.30+0x26c/0x46c [ 423.733033] CPU: 0 PID: 345 Comm: mmap Not tainted 4.18.0-rc8-00664-g2dfff9121c55 #671 [ 423.733075] NIP: c0108f90 LR: c0109ad0 CTR: 00000004 [ 423.733121] REGS: c455bba0 TRAP: 0700 Not tainted (4.18.0-rc8-00664-g2dfff9121c55) [ 423.733147] MSR: 00021032 <ME,IR,DR,RI> CR: 24224848 XER: 20000000 [ 423.733319] [ 423.733319] GPR00: c0109ad0 c455bc50 c4521910 c60053c0 007080c0 c0011b34 c7fa41e0 c455be30 [ 423.733319] GPR08: 00000001 c00103a0 c7fa41e0 c49afcc4 24282842 10018840 c079b37c 00000040 [ 423.733319] GPR16: 73f00000 00210d00 00000000 00000001 c455a000 00000100 00000200 c455a000 [ 423.733319] GPR24: c60053c0 c0011b34 007080c0 c455a000 c455a000 c7fa41e0 00000000 00009032 [ 423.734190] NIP [c0108f90] ___slab_alloc.constprop.30+0x26c/0x46c [ 423.734257] LR [c0109ad0] kmem_cache_alloc+0x210/0x23c [ 423.734283] Call Trace: [ 423.734326] [c455bc50] [00000100] 0x100 (unreliable) [ 423.734430] [c455bcc0] [c0109ad0] kmem_cache_alloc+0x210/0x23c [ 423.734543] [c455bcf0] [c0011b34] huge_pte_alloc+0xc0/0x1dc [ 423.734633] [c455bd20] [c01044dc] hugetlb_fault+0x408/0x48c [ 423.734720] [c455bdb0] [c0104b20] follow_hugetlb_page+0x14c/0x44c [ 423.734826] [c455be10] [c00e8e54] __get_user_pages+0x1c4/0x3dc [ 423.734919] [c455be80] [c00e9924] __mm_populate+0xac/0x140 [ 423.735020] [c455bec0] [c00db14c] vm_mmap_pgoff+0xb4/0xb8 [ 423.735127] [c455bf00] [c00f27c0] ksys_mmap_pgoff+0xcc/0x1fc [ 423.735222] [c455bf40] [c000e0f8] ret_from_syscall+0x0/0x38 [ 423.735271] Instruction dump: [ 423.735321] 7cbf482e 38fd0008 7fa6eb78 7fc4f378 4bfff5dd 7fe3fb78 4bfffe24 81370010 [ 423.735536] 71280004 41a2ff88 4840c571 4bffff80 <0fe00000> 4bfffeb8 81340010 712a0004 [ 423.735757] ---[ end trace e9b222919a470790 ]--- This warning occurs when calling kmem_cache_zalloc() on a cache having a constructor. In this case it happens because PGD cache and 512k hugepte cache are the same size (4k). While a cache with constructor is created for the PGD, hugepages create cache without constructor and uses kmem_cache_zalloc(). As both expect a cache with the same size, the hugepages reuse the cache created for PGD, hence the conflict. In order to avoid this conflict, this patch: - modifies pgtable_cache_add() so that a zeroising constructor is added for any cache size. - replaces calls to kmem_cache_zalloc() by kmem_cache_alloc() Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-12-04powerpc/mm: enable the use of page table cache of order 0Christophe Leroy
hugepages uses a cache of order 0. Lets allow page tables of order 0 in the common part in order to avoid open coding in hugetlb Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-02-13powerpc/mm: Fix crashes with 16G huge pagesAneesh Kumar K.V
To support memory keys, we moved the hash pte slot information to the second half of the page table. This was ok with PTE entries at level 4 (PTE page) and level 3 (PMD). We already allocate larger page table pages at those levels to accomodate extra details. For level 4 we already have the extra space which was used to track 4k hash page table entry details and at level 3 the extra space was allocated to track the THP details. With hugetlbfs PTE, we used this extra space at the PMD level to store the slot details. But we also support hugetlbfs PTE at PUD level for 16GB pages and PUD level page didn't allocate extra space. This resulted in memory corruption. Fix this by allocating extra space at PUD level when HUGETLB is enabled. Fixes: bf9a95f9a648 ("powerpc: Free up four 64K PTE bits in 64K backed HPTE pages") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Reviewed-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-14Merge branch 'topic/ppc-kvm' into nextMichael Ellerman
Merge the topic branch we're sharing with the kvm-ppc tree.
2017-01-31powerpc/64: Export pgtable_cache and pgtable_cache_add for KVMPaul Mackerras
This exports the pgtable_cache array and the pgtable_cache_add function so that HV KVM can use them for allocating radix page tables for guests. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-18powerpc: Fix pgtable pmd cache initNicholas Piggin
Commit 9b081e10805cd ("powerpc: port 64 bits pgtable_cache to 32 bits") mixed up PMD_INDEX_SIZE and PMD_CACHE_INDEX a couple of times. This resulted in 64s/hash/4k configs to panic at boot with a false positive error check. Fix that and simplify error handling by moving the check to the caller. Fixes: 9b081e10805cd ("powerpc: port 64 bits pgtable_cache to 32 bits") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-12-09powerpc: port 64 bits pgtable_cache to 32 bitsChristophe Leroy
Today powerpc64 uses a set of pgtable_caches while powerpc32 uses standard pages when using 4k pages and a single pgtable_cache if using other size pages. In preparation of implementing huge pages on the 8xx, this patch replaces the specific powerpc32 handling by the 64 bits approach. This is done by: * moving 64 bits pgtable_cache_add() and pgtable_cache_init() in a new file called init-common.c * modifying pgtable_cache_init() to also handle the case without PMD * removing the 32 bits version of pgtable_cache_add() and pgtable_cache_init() * copying related header contents from 64 bits into both the book3s/32 and nohash/32 header files On the 8xx, the following cache sizes will be used: * 4k pages mode: - PGT_CACHE(10) for PGD - PGT_CACHE(3) for 512k hugepage tables * 16k pages mode: - PGT_CACHE(6) for PGD - PGT_CACHE(7) for 512k hugepage tables - PGT_CACHE(3) for 8M hugepage tables Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Scott Wood <oss@buserror.net>