summaryrefslogtreecommitdiff
path: root/arch/powerpc/include/asm/book3s
AgeCommit message (Collapse)Author
2023-06-19powerpc/dexcr: Add initial Dynamic Execution Control Register (DEXCR) supportBenjamin Gray
ISA 3.1B introduces the Dynamic Execution Control Register (DEXCR). It is a per-cpu register that allows control over various CPU behaviours including branch hint usage, indirect branch speculation, and hashst/hashchk support. Add some definitions and basic support for the DEXCR in the kernel. Right now it just * Initialises the DEXCR and HASHKEYR to a fixed value when a CPU onlines. * Clears them in reset_sprs(). * Detects when the NPHIE aspect is supported (the others don't get looked at in this series, so there's no need to waste a CPU_FTR on them). We initialise the HASHKEYR to ensure that all cores have the same key, so an HV enforced NPHIE + swapping cores doesn't randomly crash a process using hash instructions. The stores to HASHKEYR are unconditional because the ISA makes no mention of the SPR being missing if support for doing the hashes isn't present. So all that would happen is the HASHKEYR value gets ignored. This helps slightly if NPHIE detection fails; e.g., we currently only detect it on pseries. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> [mpe: Use simple values for DEXCR constants] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230616034846.311705-4-bgray@linux.ibm.com
2023-06-19powerpc/book3s: Add missing <linux/sched.h> includeBenjamin Gray
The functions here use struct task_struct fields, so need to import the full definition from <linux/sched.h>. The <asm/current.h> header that defines current only forward declares struct task_struct. Failing to include this <linux/sched.h> header leads to a compilation error when a translation unit does not also include <linux/sched.h> indirectly. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230616034846.311705-2-bgray@linux.ibm.com
2023-04-27Merge tag 'mm-stable-2023-04-27-15-30' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of switching from a user process to a kernel thread. - More folio conversions from Kefeng Wang, Zhang Peng and Pankaj Raghav. - zsmalloc performance improvements from Sergey Senozhatsky. - Yue Zhao has found and fixed some data race issues around the alteration of memcg userspace tunables. - VFS rationalizations from Christoph Hellwig: - removal of most of the callers of write_one_page() - make __filemap_get_folio()'s return value more useful - Luis Chamberlain has changed tmpfs so it no longer requires swap backing. Use `mount -o noswap'. - Qi Zheng has made the slab shrinkers operate locklessly, providing some scalability benefits. - Keith Busch has improved dmapool's performance, making part of its operations O(1) rather than O(n). - Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd, permitting userspace to wr-protect anon memory unpopulated ptes. - Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive rather than exclusive, and has fixed a bunch of errors which were caused by its unintuitive meaning. - Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature, which causes minor faults to install a write-protected pte. - Vlastimil Babka has done some maintenance work on vma_merge(): cleanups to the kernel code and improvements to our userspace test harness. - Cleanups to do_fault_around() by Lorenzo Stoakes. - Mike Rapoport has moved a lot of initialization code out of various mm/ files and into mm/mm_init.c. - Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for DRM, but DRM doesn't use it any more. - Lorenzo has also coverted read_kcore() and vread() to use iterators and has thereby removed the use of bounce buffers in some cases. - Lorenzo has also contributed further cleanups of vma_merge(). - Chaitanya Prakash provides some fixes to the mmap selftesting code. - Matthew Wilcox changes xfs and afs so they no longer take sleeping locks in ->map_page(), a step towards RCUification of pagefaults. - Suren Baghdasaryan has improved mmap_lock scalability by switching to per-VMA locking. - Frederic Weisbecker has reworked the percpu cache draining so that it no longer causes latency glitches on cpu isolated workloads. - Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig logic. - Liu Shixin has changed zswap's initialization so we no longer waste a chunk of memory if zswap is not being used. - Yosry Ahmed has improved the performance of memcg statistics flushing. - David Stevens has fixed several issues involving khugepaged, userfaultfd and shmem. - Christoph Hellwig has provided some cleanup work to zram's IO-related code paths. - David Hildenbrand has fixed up some issues in the selftest code's testing of our pte state changing. - Pankaj Raghav has made page_endio() unneeded and has removed it. - Peter Xu contributed some rationalizations of the userfaultfd selftests. - Yosry Ahmed has fixed an issue around memcg's page recalim accounting. - Chaitanya Prakash has fixed some arm-related issues in the selftests/mm code. - Longlong Xia has improved the way in which KSM handles hwpoisoned pages. - Peter Xu fixes a few issues with uffd-wp at fork() time. - Stefan Roesch has changed KSM so that it may now be used on a per-process and per-cgroup basis. * tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits) mm,unmap: avoid flushing TLB in batch if PTE is inaccessible shmem: restrict noswap option to initial user namespace mm/khugepaged: fix conflicting mods to collapse_file() sparse: remove unnecessary 0 values from rc mm: move 'mmap_min_addr' logic from callers into vm_unmapped_area() hugetlb: pte_alloc_huge() to replace huge pte_alloc_map() maple_tree: fix allocation in mas_sparse_area() mm: do not increment pgfault stats when page fault handler retries zsmalloc: allow only one active pool compaction context selftests/mm: add new selftests for KSM mm: add new KSM process and sysfs knobs mm: add new api to enable ksm per process mm: shrinkers: fix debugfs file permissions mm: don't check VMA write permissions if the PTE/PMD indicates write permissions migrate_pages_batch: fix statistics for longterm pin retry userfaultfd: use helper function range_in_vma() lib/show_mem.c: use for_each_populated_zone() simplify code mm: correct arg in reclaim_pages()/reclaim_clean_pages_from_list() fs/buffer: convert create_page_buffers to folio_create_buffers fs/buffer: add folio_create_empty_buffers helper ...
2023-03-28mm: add PTE pointer parameter to flush_tlb_fix_spurious_fault()Gerald Schaefer
s390 can do more fine-grained handling of spurious TLB protection faults, when there also is the PTE pointer available. Therefore, pass on the PTE pointer to flush_tlb_fix_spurious_fault() as an additional parameter. This will add no functional change to other architectures, but those with private flush_tlb_fix_spurious_fault() implementations need to be made aware of the new parameter. Link: https://lkml.kernel.org/r/20230306161548.661740-1-gerald.schaefer@linux.ibm.com Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Acked-by: David Hildenbrand <david@redhat.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-03-28powerpc/64s: Fix __pte_needs_flush() false positive warningBenjamin Gray
Userspace PROT_NONE ptes set _PAGE_PRIVILEGED, triggering a false positive debug assertion that __pte_flags_need_flush() is not called on a kernel mapping. Detect when it is a userspace PROT_NONE page by checking the required bits of PAGE_NONE are set, and none of the RWX bits are set. pte_protnone() is insufficient here because it always returns 0 when CONFIG_NUMA_BALANCING=n. Fixes: b11931e9adc1 ("powerpc/64s: add pte_needs_flush and huge_pmd_needs_flush") Cc: stable@vger.kernel.org # v6.1+ Reported-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230302225947.81083-1-bgray@linux.ibm.com
2023-02-23Merge tag 'mm-stable-2023-02-20-13-37' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - Daniel Verkamp has contributed a memfd series ("mm/memfd: add F_SEAL_EXEC") which permits the setting of the memfd execute bit at memfd creation time, with the option of sealing the state of the X bit. - Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset() thread-safe for pmd unshare") which addresses a rare race condition related to PMD unsharing. - Several folioification patch serieses from Matthew Wilcox, Vishal Moola, Sidhartha Kumar and Lorenzo Stoakes - Johannes Weiner has a series ("mm: push down lock_page_memcg()") which does perform some memcg maintenance and cleanup work. - SeongJae Park has added DAMOS filtering to DAMON, with the series "mm/damon/core: implement damos filter". These filters provide users with finer-grained control over DAMOS's actions. SeongJae has also done some DAMON cleanup work. - Kairui Song adds a series ("Clean up and fixes for swap"). - Vernon Yang contributed the series "Clean up and refinement for maple tree". - Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It adds to MGLRU an LRU of memcgs, to improve the scalability of global reclaim. - David Hildenbrand has added some userfaultfd cleanup work in the series "mm: uffd-wp + change_protection() cleanups". - Christoph Hellwig has removed the generic_writepages() library function in the series "remove generic_writepages". - Baolin Wang has performed some maintenance on the compaction code in his series "Some small improvements for compaction". - Sidhartha Kumar is doing some maintenance work on struct page in his series "Get rid of tail page fields". - David Hildenbrand contributed some cleanup, bugfixing and generalization of pte management and of pte debugging in his series "mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with swap PTEs". - Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation flag in the series "Discard __GFP_ATOMIC". - Sergey Senozhatsky has improved zsmalloc's memory utilization with his series "zsmalloc: make zspage chain size configurable". - Joey Gouly has added prctl() support for prohibiting the creation of writeable+executable mappings. The previous BPF-based approach had shortcomings. See "mm: In-kernel support for memory-deny-write-execute (MDWE)". - Waiman Long did some kmemleak cleanup and bugfixing in the series "mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF". - T.J. Alumbaugh has contributed some MGLRU cleanup work in his series "mm: multi-gen LRU: improve". - Jiaqi Yan has provided some enhancements to our memory error statistics reporting, mainly by presenting the statistics on a per-node basis. See the series "Introduce per NUMA node memory error statistics". - Mel Gorman has a second and hopefully final shot at fixing a CPU-hog regression in compaction via his series "Fix excessive CPU usage during compaction". - Christoph Hellwig does some vmalloc maintenance work in the series "cleanup vfree and vunmap". - Christoph Hellwig has removed block_device_operations.rw_page() in ths series "remove ->rw_page". - We get some maple_tree improvements and cleanups in Liam Howlett's series "VMA tree type safety and remove __vma_adjust()". - Suren Baghdasaryan has done some work on the maintainability of our vm_flags handling in the series "introduce vm_flags modifier functions". - Some pagemap cleanup and generalization work in Mike Rapoport's series "mm, arch: add generic implementation of pfn_valid() for FLATMEM" and "fixups for generic implementation of pfn_valid()" - Baoquan He has done some work to make /proc/vmallocinfo and /proc/kcore better represent the real state of things in his series "mm/vmalloc.c: allow vread() to read out vm_map_ram areas". - Jason Gunthorpe rationalized the GUP system's interface to the rest of the kernel in the series "Simplify the external interface for GUP". - SeongJae Park wishes to migrate people from DAMON's debugfs interface over to its sysfs interface. To support this, we'll temporarily be printing warnings when people use the debugfs interface. See the series "mm/damon: deprecate DAMON debugfs interface". - Andrey Konovalov provided the accurately named "lib/stackdepot: fixes and clean-ups" series. - Huang Ying has provided a dramatic reduction in migration's TLB flush IPI rates with the series "migrate_pages(): batch TLB flushing". - Arnd Bergmann has some objtool fixups in "objtool warning fixes". * tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (505 commits) include/linux/migrate.h: remove unneeded externs mm/memory_hotplug: cleanup return value handing in do_migrate_range() mm/uffd: fix comment in handling pte markers mm: change to return bool for isolate_movable_page() mm: hugetlb: change to return bool for isolate_hugetlb() mm: change to return bool for isolate_lru_page() mm: change to return bool for folio_isolate_lru() objtool: add UACCESS exceptions for __tsan_volatile_read/write kmsan: disable ftrace in kmsan core code kasan: mark addr_has_metadata __always_inline mm: memcontrol: rename memcg_kmem_enabled() sh: initialize max_mapnr m68k/nommu: add missing definition of ARCH_PFN_OFFSET mm: percpu: fix incorrect size in pcpu_obj_full_size() maple_tree: reduce stack usage with gcc-9 and earlier mm: page_alloc: call panic() when memoryless node allocation fails mm: multi-gen LRU: avoid futile retries migrate_pages: move THP/hugetlb migration support check to simplify code migrate_pages: batch flushing TLB migrate_pages: share more code between _unmap and _move ...
2023-02-17powerpc/64s: Prevent fallthrough to hash TLB flush when using radixBenjamin Gray
In the fix reconnecting hash__tlb_flush() to tlb_flush() the void return on radix__tlb_flush() was not restored and subsequently falls through to the restored hash__tlb_flush(). Guard hash__tlb_flush() under an else to prevent this. Fixes: 1665c027afb2 ("powerpc/64s: Reconnect tlb_flush() to hash__tlb_flush()") Reported-by: "Erhard F." <erhard_f@mailbox.org> Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230217011434.115554-1-bgray@linux.ibm.com
2023-02-02mm: remove __HAVE_ARCH_PTE_SWP_EXCLUSIVEDavid Hildenbrand
__HAVE_ARCH_PTE_SWP_EXCLUSIVE is now supported by all architectures that support swp PTEs, so let's drop it. Link: https://lkml.kernel.org/r/20230113171026.582290-27-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-02powerpc/mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on 32bit book3sDavid Hildenbrand
We already implemented support for 64bit book3s in commit bff9beaa2e80 ("powerpc/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE for book3s") Let's support __HAVE_ARCH_PTE_SWP_EXCLUSIVE also in 32bit by reusing yet unused LSB 2 / MSB 29. There seems to be no real reason why that bit cannot be used, and reusing it avoids having to steal one bit from the swap offset. While at it, mask the type in __swp_entry(). Link: https://lkml.kernel.org/r/20230113171026.582290-18-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-02powerpc/64s: Reconnect tlb_flush() to hash__tlb_flush()Michael Ellerman
Commit baf1ed24b27d ("powerpc/mm: Remove empty hash__ functions") removed some empty hash MMU flushing routines, but got a bit overeager and also removed the call to hash__tlb_flush() from tlb_flush(). In regular use this doesn't lead to any noticable breakage, which is a little concerning. Presumably there are flushes happening via other paths such as arch_leave_lazy_mmu_mode(), and/or a bit of luck. Fix it by reinstating the call to hash__tlb_flush(). Fixes: baf1ed24b27d ("powerpc/mm: Remove empty hash__ functions") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230131111407.806770-1-mpe@ellerman.id.au
2022-12-19Merge tag 'powerpc-6.2-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Add powerpc qspinlock implementation optimised for large system scalability and paravirt. See the merge message for more details - Enable objtool to be built on powerpc to generate mcount locations - Use a temporary mm for code patching with the Radix MMU, so the writable mapping is restricted to the patching CPU - Add an option to build the 64-bit big-endian kernel with the ELFv2 ABI - Sanitise user registers on interrupt entry on 64-bit Book3S - Many other small features and fixes Thanks to Aboorva Devarajan, Angel Iglesias, Benjamin Gray, Bjorn Helgaas, Bo Liu, Chen Lifu, Christoph Hellwig, Christophe JAILLET, Christophe Leroy, Christopher M. Riedl, Colin Ian King, Deming Wang, Disha Goel, Dmitry Torokhov, Finn Thain, Geert Uytterhoeven, Gustavo A. R. Silva, Haowen Bai, Joel Stanley, Jordan Niethe, Julia Lawall, Kajol Jain, Laurent Dufour, Li zeming, Miaoqian Lin, Michael Jeanson, Nathan Lynch, Naveen N. Rao, Nayna Jain, Nicholas Miehlbradt, Nicholas Piggin, Pali Rohár, Randy Dunlap, Rohan McLure, Russell Currey, Sathvika Vasireddy, Shaomin Deng, Stephen Kitt, Stephen Rothwell, Thomas Weißschuh, Tiezhu Yang, Uwe Kleine-König, Xie Shaowen, Xiu Jianfeng, XueBing Chen, Yang Yingliang, Zhang Jiaming, ruanjinjie, Jessica Yu, and Wolfram Sang. * tag 'powerpc-6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (181 commits) powerpc/code-patching: Fix oops with DEBUG_VM enabled powerpc/qspinlock: Fix 32-bit build powerpc/prom: Fix 32-bit build powerpc/rtas: mandate RTAS syscall filtering powerpc/rtas: define pr_fmt and convert printk call sites powerpc/rtas: clean up includes powerpc/rtas: clean up rtas_error_log_max initialization powerpc/pseries/eeh: use correct API for error log size powerpc/rtas: avoid scheduling in rtas_os_term() powerpc/rtas: avoid device tree lookups in rtas_os_term() powerpc/rtasd: use correct OF API for event scan rate powerpc/rtas: document rtas_call() powerpc/pseries: unregister VPA when hot unplugging a CPU powerpc/pseries: reset the RCU watchdogs after a LPM powerpc: Take in account addition CPU node when building kexec FDT powerpc: export the CPU node count powerpc/cpuidle: Set CPUIDLE_FLAG_POLLING for snooze state powerpc/dts/fsl: Fix pca954x i2c-mux node names cxl: Remove unnecessary cxl_pci_window_alignment() selftests/powerpc: Fix resource leaks ...
2022-11-30mm: remove unused savedwrite infrastructureDavid Hildenbrand
NUMA hinting no longer uses savedwrite, let's rip it out. ... and while at it, drop __pte_write() and __pmd_write() on ppc64. Link: https://lkml.kernel.org/r/20221108174652.198904-7-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Hugh Dickins <hughd@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nadav Amit <namit@vmware.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-30powerpc/tlb: Add local flush for page given mm_struct and psizeBenjamin Gray
Adds a local TLB flush operation that works given an mm_struct, VA to flush, and page size representation. Most implementations mirror the surrounding code. The book3s/32/tlbflush.h implementation is left as a BUILD_BUG because it is more complicated and not required for anything as yet. This removes the need to create a vm_area_struct, which the temporary patching mm work does not need. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221109045112.187069-8-bgray@linux.ibm.com
2022-11-30powerpc/mm: Remove flush_all_mm, local_flush_all_mmBenjamin Gray
These functions were introduced for "cxl: Enable global TLBIs for cxl contexts" [1], which ended up using them for Radix only. They were never implemented on Hash (and creating an implementation appears to be difficult), so nothing can actually rely on them. They behave differently to the existing surrounding functions too, in that they actually need to do something on Hash. The other functions are primarily for use in generic code that expects their definitions, but Hash updates the TLB during PTE updates. After replacing the only usage with the Radix specific version, there are no more users of these functions, and given they are not implemented anyway it is safe to delete them. [1]: https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20170903181513.29635-1-fbarrat@linux.vnet.ibm.com/ Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221109045112.187069-7-bgray@linux.ibm.com
2022-11-30powerpc/mm: Remove empty hash__ functionsBenjamin Gray
The empty hash__* functions are unnecessary. The empty definitions were introduced when 64-bit Hash support was added, as the functions were still used in generic code. These empty definitions were prefixed with hash__ when Radix support was added, and new wrappers with the original names were added that selected the Radix or Hash version based on radix_enabled(). But the hash__ prefixed functions were not part of a public interface, so there is no need to include them for compatibility with anything. Generic code will use the non-prefixed wrappers, and Hash specific code will know that there is no point in calling them (or even worse, call them and expect them to do something). Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221109045112.187069-5-bgray@linux.ibm.com
2022-10-18powerpc/64s: Disable preemption in hash lazy mmu modeNicholas Piggin
apply_to_page_range on kernel pages does not disable preemption, which is a requirement for hash's lazy mmu mode, which keeps track of the TLBs to flush with a per-cpu array. Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221013151647.1857994-1-npiggin@gmail.com
2022-09-30powerpc/mm/book3s/hash: Rename flush_tlb_pmd_rangeAneesh Kumar K.V
This function does the hash page table update. Hence rename it to indicate this better to avoid confusion with flush_pmd_tlb_range() Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> [mpe: Drop unnecessary extern] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220907081941.209501-1-aneesh.kumar@linux.ibm.com
2022-09-28powerpc/64s: Enable KFENCE on book3s64Nicholas Miehlbradt
KFENCE support was added for ppc32 in commit 90cbac0e995d ("powerpc: Enable KFENCE for PPC32"). Enable KFENCE on ppc64 architecture with hash and radix MMUs. It uses the same mechanism as debug pagealloc to protect/unprotect pages. All KFENCE kunit tests pass on both MMUs. KFENCE memory is initially allocated using memblock but is later marked as SLAB allocated. This necessitates the change to __pud_free to ensure that the KFENCE pages are freed appropriately. Based on previous work by Christophe Leroy and Jordan Niethe. Signed-off-by: Nicholas Miehlbradt <nicholas@linux.ibm.com> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220926075726.2846-4-nicholas@linux.ibm.com
2022-09-26powerpc/mm: Make PAGE_KERNEL_xxx macros grep-friendlyChristophe Leroy
Avoid multi-lines to help getting a complete view when using grep. They still remain under the 100 chars limit. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/3bc3f5a51949ee7f52dba36677db23d4337c7995.1662544980.git.christophe.leroy@csgroup.eu
2022-09-26powerpc/mm: Reduce redundancy in pgtable.hChristophe Leroy
PAGE_KERNEL_TEXT, PAGE_KERNEL_EXEC and PAGE_AGP are the same for all powerpcs. Remove duplicated definitions. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/92254499430d13d99e4a4d7e9ad8e8284cb35380.1662544974.git.christophe.leroy@csgroup.eu
2022-09-26powerpc/book3s: Inline first level of update_mmu_cache()Christophe Leroy
update_mmu_cache() voids when hash page tables are not used. On PPC32 that means when MMU_FTR_HPTE_TABLE is not defined. On PPC64 that means when RADIX is enabled. Rename core part of update_mmu_cache() as __update_mmu_cache() and include the initial verification in an inlined caller. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/bea5ad0de7f83eff256116816d46c84fa0a444de.1662370698.git.christophe.leroy@csgroup.eu
2022-09-26powerpc/mm/64s: Drop p4d_leaf()Michael Ellerman
Because 64-bit Book3S uses pgtable-nop4d.h, the P4D is folded into the PGD. So P4D entries are actually PGD entries, or vice versa. The other way to think of it is that the P4D is a single entry page table below the PGD. Zero bits of the address are needed to index into the P4D, therefore a P4D entry maps the same size address space as a PGD entry. As explained in the previous commit, there are no huge page sizes supported directly at the PGD level on 64-bit Book3S, so there are also no huge page sizes supported at the P4D level. Therefore p4d_is_leaf() can never be true, so drop the definition and fallback to the default implementation that always returns false. Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220903123640.719846-2-mpe@ellerman.id.au
2022-09-26powerpc/mm/64s: Drop pgd_huge()Michael Ellerman
On powerpc there are two ways for huge pages to be represented in the top level page table, aka PGD (Page Global Directory). If the address space mapped by an individual PGD entry does not correspond to a given huge page size, then the PGD entry points to a non-standard page table, known as a "hugepd" (Huge Page Directory). The hugepd contains some number of huge page PTEs sufficient to map the address space with the given huge page size. On the other hand, if the address space mapped by an individual PGD entry does correspond exactly to a given huge page size, that PGD entry is used to directly encode the huge page PTE in place. In this case the pgd_huge() wrapper indicates to generic code that the PGD entry is actually a huge page PTE. This commit deals with the pgd_huge() case only, it does nothing with respect to the hugepd case. Over time the size of the virtual address space supported on powerpc has increased several times, which means the location at which huge pages can sit in the tree has also changed. There have also been new huge page sizes added, with the introduction of the Radix MMU. On Power9 and later with the Radix MMU, the largest huge page size in any implementation is 1GB. Since the introduction of Radix, 1GB entries have been supported at the PUD level, with both 4K and 64K base page size. Radix has never had a supported huge page size at the PGD level. On Power8 or earlier, which uses the Hash MMU, or Power9 or later with the Hash MMU enabled, the largest huge page size is 16GB. Using the Hash MMU and a base page size of 4K, 16GB has never been a supported huge page size at the PGD level, due to the geometry being incompatible. The two supported huge page sizes (16M & 16GB) both use the hugepd format. Using the Hash MMU and a base page size of 64K, 16GB pages were supported in the past at the PGD level. However in commit ba95b5d03596 ("powerpc/mm/book3s/64: Rework page table geometry for lower memory usage") the page table layout was reworked to shrink the size of the PGD. As a result the 16GB page size now fits at the PUD level when using 64K base page size. Therefore there are no longer any supported configurations where pgd_huge() can be true, so drop the definitions for pgd_huge(), and fallback to the generic definition which is always false. Fixes: ba95b5d03596 ("powerpc/mm/book3s/64: Rework page table geometry for lower memory usage") Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220903123640.719846-1-mpe@ellerman.id.au
2022-09-08powerpc/64s: add pte_needs_flush and huge_pmd_needs_flushNicholas Piggin
Allow PTE changes to avoid flushing the TLB when access permissions are being relaxed, the dirty bit is being set, and the accessed bit is being changed. Relaxing access permissions and setting dirty and accessed bits do not require a flush because the MMU will re-load the PTE and notice the updates (it may also cause a spurious fault). Clearing the accessed bit does not require a flush because of the imprecise PTE accessed bit accounting that is already performed, as documented in ptep_clear_flush_young(). This reduces TLB flushing for some mprotect(2) calls. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Link: https://lore.kernel.org/r/20220901110334.1618913-1-npiggin@gmail.com
2022-08-26powerpc/mm: Support execute-only memory on the Radix MMURussell Currey
Add support for execute-only memory (XOM) for the Radix MMU by using an execute-only mapping, as opposed to the RX mapping used by powerpc's other MMUs. The Hash MMU already supports XOM through the execute-only pkey, which is a separate mechanism shared with x86. A PROT_EXEC-only mapping will map to RX, and then the pkey will be applied on top of it. mmap() and mprotect() consumers in userspace should observe the same behaviour on Hash and Radix despite the differences in implementation. Replacing the vma_is_accessible() check in access_error() with a read check should be functionally equivalent for non-Radix MMUs, since it follows write and execute checks. For Radix, the change enables detecting faults on execute-only mappings where vma_is_accessible() would return true. Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220817050640.406017-1-ruscur@russell.cc
2022-08-06Merge tag 'powerpc-6.0-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Add support for syscall stack randomization - Add support for atomic operations to the 32 & 64-bit BPF JIT - Full support for KASAN on 64-bit Book3E - Add a watchdog driver for the new PowerVM hypervisor watchdog - Add a number of new selftests for the Power10 PMU support - Add a driver for the PowerVM Platform KeyStore - Increase the NMI watchdog timeout during live partition migration, to avoid timeouts due to increased memory access latency - Add support for using the 'linux,pci-domain' device tree property for PCI domain assignment - Many other small features and fixes Thanks to Alexey Kardashevskiy, Andy Shevchenko, Arnd Bergmann, Athira Rajeev, Bagas Sanjaya, Christophe Leroy, Erhard Furtner, Fabiano Rosas, Greg Kroah-Hartman, Greg Kurz, Haowen Bai, Hari Bathini, Jason A. Donenfeld, Jason Wang, Jiang Jian, Joel Stanley, Juerg Haefliger, Kajol Jain, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Masahiro Yamada, Maxime Bizon, Miaoqian Lin, Murilo Opsfelder Araújo, Nathan Lynch, Naveen N. Rao, Nayna Jain, Nicholas Piggin, Ning Qiang, Pali Rohár, Petr Mladek, Rashmica Gupta, Sachin Sant, Scott Cheloha, Segher Boessenkool, Stephen Rothwell, Uwe Kleine-König, Wolfram Sang, Xiu Jianfeng, and Zhouyi Zhou. * tag 'powerpc-6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (191 commits) powerpc/64e: Fix kexec build error EDAC/ppc_4xx: Include required of_irq header directly powerpc/pci: Fix PHB numbering when using opal-phbid powerpc/64: Init jump labels before parse_early_param() selftests/powerpc: Avoid GCC 12 uninitialised variable warning powerpc/cell/axon_msi: Fix refcount leak in setup_msi_msg_address powerpc/xive: Fix refcount leak in xive_get_max_prio powerpc/spufs: Fix refcount leak in spufs_init_isolated_loader powerpc/perf: Include caps feature for power10 DD1 version powerpc: add support for syscall stack randomization powerpc: Move system_call_exception() to syscall.c powerpc/powernv: rename remaining rng powernv_ functions to pnv_ powerpc/powernv/kvm: Use darn for H_RANDOM on Power9 powerpc/powernv: Avoid crashing if rng is NULL selftests/powerpc: Fix matrix multiply assist test powerpc/signal: Update comment for clarity powerpc: make facility_unavailable_exception 64s powerpc/platforms/83xx/suspend: Remove write-only global variable powerpc/platforms/83xx/suspend: Prevent unloading the driver powerpc/platforms/83xx/suspend: Reorder to get rid of a forward declaration ...
2022-07-27powerpc/64s: Remove spurious fault flushing for NMMUNicholas Piggin
Commit 6d8278c414cb2 ("powerpc/64s/radix: do not flush TLB on spurious fault") removed the TLB flush for spurious faults, except when a coprocessor (nest MMU) maps the address space. This is not needed because the NMMU workaround in the PTE permission upgrade paths prevents PTEs existing with less restrictive access permissions than their corresponding TLB entries have. Remove it and replace with a comment. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220525022358.780745-4-npiggin@gmail.com
2022-06-29powerpc: Include asm/firmware.h in all users of firmware_has_feature()Christophe Leroy
Trying to remove asm/ppc_asm.h from all places that don't need it leads to several failures linked to firmware_has_feature(). To fix it, include asm/firmware.h in all files using firmware_has_feature() All users found with: git grep -L "firmware\.h" ` git grep -l "firmware_has_feature("` Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/11956ec181a034b51a881ac9c059eea72c679a73.1651828453.git.christophe.leroy@csgroup.eu
2022-06-27docs: rename Documentation/vm to Documentation/mmMike Rapoport
so it will be consistent with code mm directory and with Documentation/admin-guide/mm and won't be confused with virtual machines. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Suggested-by: Matthew Wilcox <willy@infradead.org> Tested-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Jonathan Corbet <corbet@lwn.net> Acked-by: Wu XiangCheng <bobwxc@email.cn>
2022-05-28Merge tag 'powerpc-5.19-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Convert to the generic mmap support (ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT) - Add support for outline-only KASAN with 64-bit Radix MMU (P9 or later) - Increase SIGSTKSZ and MINSIGSTKSZ and add support for AT_MINSIGSTKSZ - Enable the DAWR (Data Address Watchpoint) on POWER9 DD2.3 or later - Drop support for system call instruction emulation - Many other small features and fixes Thanks to Alexey Kardashevskiy, Alistair Popple, Andy Shevchenko, Bagas Sanjaya, Bjorn Helgaas, Bo Liu, Chen Huang, Christophe Leroy, Colin Ian King, Daniel Axtens, Dwaipayan Ray, Fabiano Rosas, Finn Thain, Frank Rowand, Fuqian Huang, Guilherme G. Piccoli, Hangyu Hua, Haowen Bai, Haren Myneni, Hari Bathini, He Ying, Jason Wang, Jiapeng Chong, Jing Yangyang, Joel Stanley, Julia Lawall, Kajol Jain, Kevin Hao, Krzysztof Kozlowski, Laurent Dufour, Lv Ruyi, Madhavan Srinivasan, Magali Lemes, Miaoqian Lin, Minghao Chi, Nathan Chancellor, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Oscar Salvador, Pali Rohár, Paul Mackerras, Peng Wu, Qing Wang, Randy Dunlap, Reza Arbab, Russell Currey, Sohaib Mohamed, Vaibhav Jain, Vasant Hegde, Wang Qing, Wang Wensheng, Xiang wangx, Xiaomeng Tong, Xu Wang, Yang Guang, Yang Li, Ye Bin, YueHaibing, Yu Kuai, Zheng Bin, Zou Wei, and Zucheng Zheng. * tag 'powerpc-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (200 commits) powerpc/64: Include cache.h directly in paca.h powerpc/64s: Only set HAVE_ARCH_UNMAPPED_AREA when CONFIG_PPC_64S_HASH_MMU is set powerpc/xics: Include missing header powerpc/powernv/pci: Drop VF MPS fixup powerpc/fsl_book3e: Don't set rodata RO too early powerpc/microwatt: Add mmu bits to device tree powerpc/powernv/flash: Check OPAL flash calls exist before using powerpc/powermac: constify device_node in of_irq_parse_oldworld() powerpc/powermac: add missing g5_phy_disable_cpu1() declaration selftests/powerpc/pmu: fix spelling mistake "mis-match" -> "mismatch" powerpc: Enable the DAWR on POWER9 DD2.3 and above powerpc/64s: Add CPU_FTRS_POWER10 to ALWAYS mask powerpc/64s: Add CPU_FTRS_POWER9_DD2_2 to CPU_FTRS_ALWAYS mask powerpc: Fix all occurences of "the the" selftests/powerpc/pmu/ebb: remove fixed_instruction.S powerpc/platforms/83xx: Use of_device_get_match_data() powerpc/eeh: Drop redundant spinlock initialization powerpc/iommu: Add missing of_node_put in iommu_init_early_dart powerpc/pseries/vas: Call misc_deregister if sysfs init fails powerpc/papr_scm: Fix leaking nvdimm_events_map elements ...
2022-05-24powerpc/64s: Only set HAVE_ARCH_UNMAPPED_AREA when CONFIG_PPC_64S_HASH_MMU ↵Christophe Leroy
is set When CONFIG_PPC_64S_HASH_MMU is not set, slice.c is not built and arch_get_unmapped_area() and arch_get_unmapped_area_topdown() are not provided because RADIX uses the generic ones. Therefore, neither set HAVE_ARCH_UNMAPPED_AREA nor HAVE_ARCH_UNMAPPED_AREA_TOPDOWN. Fixes: ab57bd7570d4 ("powerpc/mm: Move get_unmapped_area functions to slice.c") Reported-by: Laurent Dufour <ldufour@linux.ibm.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Laurent Dufour <ldufour@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e438c6cc09f94085e56733ed2d6e84333c35292a.1653370913.git.christophe.leroy@csgroup.eu
2022-05-22powerpc: Book3S 64-bit outline-only KASAN supportDaniel Axtens
Implement a limited form of KASAN for Book3S 64-bit machines running under the Radix MMU, supporting only outline mode. - Enable the compiler instrumentation to check addresses and maintain the shadow region. (This is the guts of KASAN which we can easily reuse.) - Require kasan-vmalloc support to handle modules and anything else in vmalloc space. - KASAN needs to be able to validate all pointer accesses, but we can't instrument all kernel addresses - only linear map and vmalloc. On boot, set up a single page of read-only shadow that marks all iomap and vmemmap accesses as valid. - Document KASAN in powerpc docs. Background ---------- KASAN support on Book3S is a bit tricky to get right: - It would be good to support inline instrumentation so as to be able to catch stack issues that cannot be caught with outline mode. - Inline instrumentation requires a fixed offset. - Book3S runs code with translations off ("real mode") during boot, including a lot of generic device-tree parsing code which is used to determine MMU features. [ppc64 mm note: The kernel installs a linear mapping at effective address c000...-c008.... This is a one-to-one mapping with physical memory from 0000... onward. Because of how memory accesses work on powerpc 64-bit Book3S, a kernel pointer in the linear map accesses the same memory both with translations on (accessing as an 'effective address'), and with translations off (accessing as a 'real address'). This works in both guests and the hypervisor. For more details, see s5.7 of Book III of version 3 of the ISA, in particular the Storage Control Overview, s5.7.3, and s5.7.5 - noting that this KASAN implementation currently only supports Radix.] - Some code - most notably a lot of KVM code - also runs with translations off after boot. - Therefore any offset has to point to memory that is valid with translations on or off. One approach is just to give up on inline instrumentation. This way boot-time checks can be delayed until after the MMU is set is up, and we can just not instrument any code that runs with translations off after booting. Take this approach for now and require outline instrumentation. Previous attempts allowed inline instrumentation. However, they came with some unfortunate restrictions: only physically contiguous memory could be used and it had to be specified at compile time. Maybe we can do better in the future. [paulus@ozlabs.org - Rebased onto 5.17. Note that a kernel with CONFIG_KASAN=y will crash during boot on a machine using HPT translation because not all the entry points to the generic KASAN code are protected with a call to kasan_arch_is_ready().] Originally-by: Balbir Singh <bsingharora@gmail.com> # ppc64 out-of-line radix version Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> [mpe: Update copyright year and comment formatting] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/YoTE69OQwiG7z+Gu@cleo
2022-05-09powerpc/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE for book3sDavid Hildenbrand
Right now, the last 5 bits (0x1f) of the swap entry are used for the type and the bit before that (0x20) is used for _PAGE_SWP_SOFT_DIRTY. We cannot use 0x40, as that collides with _RPAGE_RSV1 -- contained in _PAGE_HPTEFLAGS. The next candidate would be _RPAGE_SW3 (0x200) -- which is used for _PAGE_SOFT_DIRTY for !swp ptes. So let's just use _PAGE_SOFT_DIRTY for _PAGE_SWP_SOFT_DIRTY (to make it easier to grasp) and use 0x20 now for _PAGE_SWP_EXCLUSIVE. Link: https://lkml.kernel.org/r/20220329164329.208407-9-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Don Dutile <ddutile@redhat.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Liang Zhang <zhangliang5@huawei.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Nadav Amit <namit@vmware.com> Cc: Oded Gabbay <oded.gabbay@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pedro Demarchi Gomes <pedrodemargomes@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Roman Gushchin <guro@fb.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-09powerpc/pgtable: remove _PAGE_BIT_SWAP_TYPE for book3sDavid Hildenbrand
The swap type is simply stored in bits 0x1f of the swap pte. Let's simplify by just getting rid of _PAGE_BIT_SWAP_TYPE. It's not like that we can simply change it: _PAGE_SWP_SOFT_DIRTY would suddenly fall into _RPAGE_RSV1, which isn't possible and would make the BUILD_BUG_ON(_PAGE_HPTEFLAGS & _PAGE_SWP_SOFT_DIRTY) angry. While at it, make it clearer which bit we're actually using for _PAGE_SWP_SOFT_DIRTY by just using the proper define and introduce and use SWP_TYPE_MASK. Link: https://lkml.kernel.org/r/20220329164329.208407-8-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Don Dutile <ddutile@redhat.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Liang Zhang <zhangliang5@huawei.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Nadav Amit <namit@vmware.com> Cc: Oded Gabbay <oded.gabbay@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pedro Demarchi Gomes <pedrodemargomes@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Roman Gushchin <guro@fb.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-05powerpc/mm: Move get_unmapped_area functions to slice.cChristophe Leroy
hugetlb_get_unmapped_area() is now identical to the generic version if only RADIX is enabled, so move it to slice.c and let it fallback on the generic one when HASH MMU is not compiled in. Do the same with arch_get_unmapped_area() and arch_get_unmapped_area_topdown(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b5d9c124e82889e0cb115c150915a0c0d84eb960.1649523076.git.christophe.leroy@csgroup.eu
2022-05-05powerpc/mm: Use generic_hugetlb_get_unmapped_area()Christophe Leroy
Use the generic version of arch_hugetlb_get_unmapped_area() which is now available at all time. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/05f77014c619061638ecc52a0a4136eb04cc2799.1649523076.git.christophe.leroy@csgroup.eu
2022-05-05powerpc/mm: Make slice specific to book3s/64Christophe Leroy
Since commit 555904d07eef ("powerpc/8xx: MM_SLICE is not needed anymore") only book3s/64 selects CONFIG_PPC_MM_SLICES. Move slice.c into mm/book3s64/ Move necessary stuff in asm/book3s/64/slice.h and remove asm/slice.h Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4a0d74ef1966a5902b5fd4ac4b513a760a6d675a.1649523076.git.christophe.leroy@csgroup.eu
2022-03-25Merge tag 'powerpc-5.18-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Livepatch support for 32-bit is probably the standout new feature, otherwise mostly just lots of bits and pieces all over the board. There's a series of commits cleaning up function descriptor handling, which touches a few other arches as well as LKDTM. It has acks from Arnd, Kees and Helge. Summary: - Enforce kernel RO, and implement STRICT_MODULE_RWX for 603. - Add support for livepatch to 32-bit. - Implement CONFIG_DYNAMIC_FTRACE_WITH_ARGS. - Merge vdso64 and vdso32 into a single directory. - Fix build errors with newer binutils. - Add support for UADDR64 relocations, which are emitted by some toolchains. This allows powerpc to build with the latest lld. - Fix (another) potential userspace r13 corruption in transactional memory handling. - Cleanups of function descriptor handling & related fixes to LKDTM. Thanks to Abdul Haleem, Alexey Kardashevskiy, Anders Roxell, Aneesh Kumar K.V, Anton Blanchard, Arnd Bergmann, Athira Rajeev, Bhaskar Chowdhury, Cédric Le Goater, Chen Jingwen, Christophe JAILLET, Christophe Leroy, Corentin Labbe, Daniel Axtens, Daniel Henrique Barboza, David Dai, Fabiano Rosas, Ganesh Goudar, Guo Zhengkui, Hangyu Hua, Haren Myneni, Hari Bathini, Igor Zhbanov, Jakob Koschel, Jason Wang, Jeremy Kerr, Joachim Wiberg, Jordan Niethe, Julia Lawall, Kajol Jain, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Mamatha Inamdar, Maxime Bizon, Maxim Kiselev, Maxim Kochetkov, Michal Suchanek, Nageswara R Sastry, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Nour-eddine Taleb, Paul Menzel, Ping Fang, Pratik R. Sampat, Randy Dunlap, Ritesh Harjani, Rohan McLure, Russell Currey, Sachin Sant, Segher Boessenkool, Shivaprasad G Bhat, Sourabh Jain, Thierry Reding, Tobias Waldekranz, Tyrel Datwyler, Vaibhav Jain, Vladimir Oltean, Wedson Almeida Filho, and YueHaibing" * tag 'powerpc-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (179 commits) powerpc/pseries: Fix use after free in remove_phb_dynamic() powerpc/time: improve decrementer clockevent processing powerpc/time: Fix KVM host re-arming a timer beyond decrementer range powerpc/tm: Fix more userspace r13 corruption powerpc/xive: fix return value of __setup handler powerpc/64: Add UADDR64 relocation support powerpc: 8xx: fix a return value error in mpc8xx_pic_init powerpc/ps3: remove unneeded semicolons powerpc/64: Force inlining of prevent_user_access() and set_kuap() powerpc/bitops: Force inlining of fls() powerpc: declare unmodified attribute_group usages const powerpc/spufs: Fix build warning when CONFIG_PROC_FS=n powerpc/secvar: fix refcount leak in format_show() powerpc/64e: Tie PPC_BOOK3E_64 to PPC_FSL_BOOK3E powerpc: Move C prototypes out of asm-prototypes.h powerpc/kexec: Declare kexec_paca static powerpc/smp: Declare current_set static powerpc: Cleanup asm-prototypes.c powerpc/ftrace: Use STK_GOT in ftrace_mprofile.S powerpc/ftrace: Regroup PPC64 specific operations in ftrace_mprofile.S ...
2022-03-22Merge tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecacheLinus Torvalds
Pull folio updates from Matthew Wilcox: - Rewrite how munlock works to massively reduce the contention on i_mmap_rwsem (Hugh Dickins): https://lore.kernel.org/linux-mm/8e4356d-9622-a7f0-b2c-f116b5f2efea@google.com/ - Sort out the page refcount mess for ZONE_DEVICE pages (Christoph Hellwig): https://lore.kernel.org/linux-mm/20220210072828.2930359-1-hch@lst.de/ - Convert GUP to use folios and make pincount available for order-1 pages. (Matthew Wilcox) - Convert a few more truncation functions to use folios (Matthew Wilcox) - Convert page_vma_mapped_walk to use PFNs instead of pages (Matthew Wilcox) - Convert rmap_walk to use folios (Matthew Wilcox) - Convert most of shrink_page_list() to use a folio (Matthew Wilcox) - Add support for creating large folios in readahead (Matthew Wilcox) * tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecache: (114 commits) mm/damon: minor cleanup for damon_pa_young selftests/vm/transhuge-stress: Support file-backed PMD folios mm/filemap: Support VM_HUGEPAGE for file mappings mm/readahead: Switch to page_cache_ra_order mm/readahead: Align file mappings for non-DAX mm/readahead: Add large folio readahead mm: Support arbitrary THP sizes mm: Make large folios depend on THP mm: Fix READ_ONLY_THP warning mm/filemap: Allow large folios to be added to the page cache mm: Turn can_split_huge_page() into can_split_folio() mm/vmscan: Convert pageout() to take a folio mm/vmscan: Turn page_check_references() into folio_check_references() mm/vmscan: Account large folios correctly mm/vmscan: Optimise shrink_page_list for non-PMD-sized folios mm/vmscan: Free non-shmem folios without splitting them mm/rmap: Constify the rmap_walk_control argument mm/rmap: Convert rmap_walk() to take a folio mm: Turn page_anon_vma() into folio_anon_vma() mm/rmap: Turn page_lock_anon_vma_read() into folio_lock_anon_vma_read() ...
2022-03-21powerpc: Add pmd_pfn()Matthew Wilcox (Oracle)
This is straightforward for everything except nohash64 where we indirect through pmd_page(). There must be a better way to do this. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-03-08powerpc/64: Force inlining of prevent_user_access() and set_kuap()Christophe Leroy
A ppc64_defconfig build exhibits about 10 copied of prevent_user_access(). It also have one copy of set_kuap(). c000000000017340 <.prevent_user_access.constprop.0>: c00000000001a038: 4b ff d3 09 bl c000000000017340 <.prevent_user_access.constprop.0> c00000000001aabc: 4b ff c8 85 bl c000000000017340 <.prevent_user_access.constprop.0> c00000000001ab38: 4b ff c8 09 bl c000000000017340 <.prevent_user_access.constprop.0> c00000000001ade0: 4b ff c5 61 bl c000000000017340 <.prevent_user_access.constprop.0> c000000000039b90 <.prevent_user_access.constprop.0>: c00000000003ac08: 4b ff ef 89 bl c000000000039b90 <.prevent_user_access.constprop.0> c00000000003b9d0: 4b ff e1 c1 bl c000000000039b90 <.prevent_user_access.constprop.0> c00000000003ba54: 4b ff e1 3d bl c000000000039b90 <.prevent_user_access.constprop.0> c00000000003bbfc: 4b ff df 95 bl c000000000039b90 <.prevent_user_access.constprop.0> c00000000015dde0 <.prevent_user_access.constprop.0>: c0000000001612c0: 4b ff cb 21 bl c00000000015dde0 <.prevent_user_access.constprop.0> c000000000161b54: 4b ff c2 8d bl c00000000015dde0 <.prevent_user_access.constprop.0> c000000000188cf0 <.prevent_user_access.constprop.0>: c00000000018d658: 4b ff b6 99 bl c000000000188cf0 <.prevent_user_access.constprop.0> c00000000030fe20 <.prevent_user_access.constprop.0>: c0000000003123d4: 4b ff da 4d bl c00000000030fe20 <.prevent_user_access.constprop.0> c000000000313970: 4b ff c4 b1 bl c00000000030fe20 <.prevent_user_access.constprop.0> c0000000005e6bd0 <.prevent_user_access.constprop.0>: c0000000005e7d8c: 4b ff ee 45 bl c0000000005e6bd0 <.prevent_user_access.constprop.0> c0000000007bcae0 <.prevent_user_access.constprop.0>: c0000000007bda10: 4b ff f0 d1 bl c0000000007bcae0 <.prevent_user_access.constprop.0> c0000000007bda54: 4b ff f0 8d bl c0000000007bcae0 <.prevent_user_access.constprop.0> c0000000007bdd28: 4b ff ed b9 bl c0000000007bcae0 <.prevent_user_access.constprop.0> c0000000007c0390: 4b ff c7 51 bl c0000000007bcae0 <.prevent_user_access.constprop.0> c00000000094e4f0 <.prevent_user_access.constprop.0>: c000000000950e40: 4b ff d6 b1 bl c00000000094e4f0 <.prevent_user_access.constprop.0> c00000000097d2d0 <.prevent_user_access.constprop.0>: c0000000009813fc: 4b ff be d5 bl c00000000097d2d0 <.prevent_user_access.constprop.0> c000000000acd540 <.prevent_user_access.constprop.0>: c000000000ad1d60: 4b ff b7 e1 bl c000000000acd540 <.prevent_user_access.constprop.0> c000000000e5d680 <.prevent_user_access.constprop.0>: c000000000e64b60: 4b ff 8b 21 bl c000000000e5d680 <.prevent_user_access.constprop.0> c000000000e64b6c: 4b ff 8b 15 bl c000000000e5d680 <.prevent_user_access.constprop.0> c000000000e64c38: 4b ff 8a 49 bl c000000000e5d680 <.prevent_user_access.constprop.0> When building signal_64.c with -Winline the following messages appear: ./arch/powerpc/include/asm/book3s/64/kup.h:331:20: error: inlining failed in call to 'set_kuap': call is unlikely and code size would grow [-Werror=inline] ./arch/powerpc/include/asm/book3s/64/kup.h:401:20: error: inlining failed in call to 'prevent_user_access.constprop': call is unlikely and code size would grow [-Werror=inline] Those functions are used on hot pathes and have been expected to be inlined at all time. Force them inline. This patch reduces the kernel text size by 700 bytes, confirming that not inlining those functions is not worth it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/eff9b2b211957fa2e8707e46f31674097fd563a3.1644588972.git.christophe.leroy@csgroup.eu
2022-03-05powerpc/64s: Fix build failure when CONFIG_PPC_64S_HASH_MMU is not setMurilo Opsfelder Araujo
The following build failure occurs when CONFIG_PPC_64S_HASH_MMU is not set: arch/powerpc/kernel/setup_64.c: In function ‘setup_per_cpu_areas’: arch/powerpc/kernel/setup_64.c:811:21: error: ‘mmu_linear_psize’ undeclared (first use in this function); did you mean ‘mmu_virtual_psize’? 811 | if (mmu_linear_psize == MMU_PAGE_4K) | ^~~~~~~~~~~~~~~~ | mmu_virtual_psize arch/powerpc/kernel/setup_64.c:811:21: note: each undeclared identifier is reported only once for each function it appears in Move the declaration of mmu_linear_psize outside of CONFIG_PPC_64S_HASH_MMU ifdef. After the above is fixed, it fails later with the following error: ld: arch/powerpc/kexec/file_load_64.o: in function `.arch_kexec_kernel_image_probe': file_load_64.c:(.text+0x1c1c): undefined reference to `.add_htab_mem_range' Fix that, too, by conditioning add_htab_mem_range() symbol to CONFIG_PPC_64S_HASH_MMU. Fixes: 387e220a2e5e ("powerpc/64s: Move hash MMU support code under CONFIG_PPC_64S_HASH_MMU") Reported-by: Erhard F. <erhard_f@mailbox.org> Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215567 Link: https://lore.kernel.org/r/20220301204743.45133-1-muriloo@linux.ibm.com
2022-02-03powerpc/32s: Make pte_update() non atomic on 603 coreChristophe Leroy
On 603 core, TLB miss handler don't do any change to the page tables so pte_update() doesn't need to be atomic. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/cc89d3c11fc9c742d0df3454a657a3a00be24046.1643538554.git.christophe.leroy@csgroup.eu
2022-01-24powerpc/fixmap: Fix VM debug warning on unmapChristophe Leroy
Unmapping a fixmap entry is done by calling __set_fixmap() with FIXMAP_PAGE_CLEAR as flags. Today, powerpc __set_fixmap() calls map_kernel_page(). map_kernel_page() is not happy when called a second time for the same page. WARNING: CPU: 0 PID: 1 at arch/powerpc/mm/pgtable.c:194 set_pte_at+0xc/0x1e8 CPU: 0 PID: 1 Comm: swapper Not tainted 5.16.0-rc3-s3k-dev-01993-g350ff07feb7d-dirty #682 NIP: c0017cd4 LR: c00187f0 CTR: 00000010 REGS: e1011d50 TRAP: 0700 Not tainted (5.16.0-rc3-s3k-dev-01993-g350ff07feb7d-dirty) MSR: 00029032 <EE,ME,IR,DR,RI> CR: 42000208 XER: 00000000 GPR00: c0165fec e1011e10 c14c0000 c0ee2550 ff800000 c0f3d000 00000000 c001686c GPR08: 00001000 b00045a9 00000001 c0f58460 c0f50000 00000000 c0007e10 00000000 GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 GPR24: 00000000 00000000 c0ee2550 00000000 c0f57000 00000ff8 00000000 ff800000 NIP [c0017cd4] set_pte_at+0xc/0x1e8 LR [c00187f0] map_kernel_page+0x9c/0x100 Call Trace: [e1011e10] [c0736c68] vsnprintf+0x358/0x6c8 (unreliable) [e1011e30] [c0165fec] __set_fixmap+0x30/0x44 [e1011e40] [c0c13bdc] early_iounmap+0x11c/0x170 [e1011e70] [c0c06cb0] ioremap_legacy_serial_console+0x88/0xc0 [e1011e90] [c0c03634] do_one_initcall+0x80/0x178 [e1011ef0] [c0c0385c] kernel_init_freeable+0xb4/0x250 [e1011f20] [c0007e34] kernel_init+0x24/0x140 [e1011f30] [c0016268] ret_from_kernel_thread+0x5c/0x64 Instruction dump: 7fe3fb78 48019689 80010014 7c630034 83e1000c 5463d97e 7c0803a6 38210010 4e800020 81250000 712a0001 41820008 <0fe00000> 9421ffe0 93e1001c 48000030 Implement unmap_kernel_page() which clears an existing pte. Reported-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/b0b752f6f6ecc60653e873f385c6f0dce4e9ab6a.1638789098.git.christophe.leroy@csgroup.eu
2022-01-16powerpc/32s: Fix kasan_init_region() for KASANChristophe Leroy
It has been reported some configuration where the kernel doesn't boot with KASAN enabled. This is due to wrong BAT allocation for the KASAN area: ---[ Data Block Address Translation ]--- 0: 0xc0000000-0xcfffffff 0x00000000 256M Kernel rw m 1: 0xd0000000-0xdfffffff 0x10000000 256M Kernel rw m 2: 0xe0000000-0xefffffff 0x20000000 256M Kernel rw m 3: 0xf8000000-0xf9ffffff 0x2a000000 32M Kernel rw m 4: 0xfa000000-0xfdffffff 0x2c000000 64M Kernel rw m A BAT must have both virtual and physical addresses alignment matching the size of the BAT. This is not the case for BAT 4 above. Fix kasan_init_region() by using block_size() function that is in book3s32/mmu.c. To be able to reuse it here, make it non static and change its name to bat_block_size() in order to avoid name conflict with block_size() defined in <linux/blkdev.h> Also reuse find_free_bat() to avoid an error message from setbat() when no BAT is available. And allocate memory outside of linear memory mapping to avoid wasting that precious space. With this change we get correct alignment for BATs and KASAN shadow memory is allocated outside the linear memory space. ---[ Data Block Address Translation ]--- 0: 0xc0000000-0xcfffffff 0x00000000 256M Kernel rw 1: 0xd0000000-0xdfffffff 0x10000000 256M Kernel rw 2: 0xe0000000-0xefffffff 0x20000000 256M Kernel rw 3: 0xf8000000-0xfbffffff 0x7c000000 64M Kernel rw 4: 0xfc000000-0xfdffffff 0x7a000000 32M Kernel rw Fixes: 7974c4732642 ("powerpc/32s: Implement dedicated kasan_init_region()") Cc: stable@vger.kernel.org Reported-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Tested-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7a50ef902494d1325227d47d33dada01e52e5518.1641818726.git.christophe.leroy@csgroup.eu
2021-12-23powerpc/pseries: Add __init attribute to eligible functionsNick Child
Some functions defined in 'arch/powerpc/platforms/pseries' are deserving of an `__init` macro attribute. These functions are only called by other initialization functions and therefore should inherit the attribute. Also, change function declarations in header files to include `__init`. Signed-off-by: Nick Child <nick.child@ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211216220035.605465-13-nick.child@ibm.com
2021-12-09powerpc/kuap: Prepare for supporting KUAP on BOOK3E/64Christophe Leroy
Also call kuap_lock() and kuap_save_and_lock() from interrupt functions with CONFIG_PPC64. For book3s/64 we keep them empty as it is done in assembly. Also do the locked assert when switching task unless it is book3s/64. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1cbf94e26e6d6e2e028fd687588a7e6622d454a6.1634627931.git.christophe.leroy@csgroup.eu
2021-12-09powerpc/kuap: Add kuap_lock()Christophe Leroy
Add kuap_lock() and call it when entering interrupts from user. It is called kuap_lock() as it is similar to kuap_save_and_lock() without the save. However book3s/32 already have a kuap_lock(). Rename it kuap_lock_addr(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4437e2deb9f6f549f7089d45e9c6f96a7e77905a.1634627931.git.christophe.leroy@csgroup.eu
2021-12-09powerpc/kuap: Remove __kuap_assert_locked()Christophe Leroy
__kuap_assert_locked() is redundant with __kuap_get_and_assert_locked(). Move the verification of CONFIG_PPC_KUAP_DEBUG in kuap_assert_locked() and make it call __kuap_get_and_assert_locked() directly. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1a60198a25d2ba38a37f1b92bc7d096435df4224.1634627931.git.christophe.leroy@csgroup.eu
2021-12-09powerpc/kuap: Check KUAP activation in generic functionsChristophe Leroy
Today, every platform checks that KUAP is not de-activated before doing the real job. Move the verification out of platform specific functions. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/894f110397fcd248e125fb855d1e863e4e633a0d.1634627931.git.christophe.leroy@csgroup.eu