Age | Commit message (Collapse) | Author |
|
This reverts commit 1abce0580b89 ("powerpc/64s: Fix __pte_needs_flush()
false positive warning")
The previous patch dropped the usage of _PAGE_PRIVILEGED with PAGE_NONE.
Hence this check can be dropped.
Signed-off-by: "Aneesh Kumar K.V (IBM)" <aneesh.kumar@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231204093638.71503-2-aneesh.kumar@kernel.org
|
|
There used to be a dependency on _PAGE_PRIVILEGED with pte_savedwrite.
But that got dropped by
commit 6a56ccbcf6c6 ("mm/autonuma: use can_change_(pte|pmd)_writable() to replace savedwrite")
With the change in this patch numa fault pte (pte_protnone()) gets mapped as regular user pte
with RWX cleared (no-access) whereas earlier it used to be mapped _PAGE_PRIVILEGED.
Hash fault handling code gets some WARN_ON added in this patch because
those functions are not expected to get called with _PAGE_READ cleared.
commit 18061c17c8ec ("powerpc/mm: Update PROTFAULT handling in the page
fault path") explains the details.
Signed-off-by: "Aneesh Kumar K.V (IBM)" <aneesh.kumar@kernel.org>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231204093638.71503-1-aneesh.kumar@kernel.org
|
|
In the nestedv2 case, the L1 may register the L2's VPA with the L0. This
allows the L0 to manage the L2's dispatch count, as well as enable
possible performance optimisations by seeing if certain resources are
not being used by the L2 (such as the PMCs).
Use the H_GUEST_SET_STATE call to inform the L0 of the L2's VPA
address. This can not be done in the H_GUEST_VCPU_RUN input buffer.
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231201132618.555031-11-vaibhav@linux.ibm.com
|
|
The kvmppc_get_tb_offset() getter reloads KVMPPC_GSID_TB_OFFSET from the
L0 for nestedv2 host. This is unnecessary as the value does not change.
KVMPPC_GSID_TB_OFFSET also need not be reloaded in
kvmppc_{s,g}et_dec_expires().
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231201132618.555031-3-vaibhav@linux.ibm.com
|
|
An L0 must invalidate the L2's RPT during H_GUEST_DELETE if this has not
already been done. This is a slow operation that means H_GUEST_DELETE
must return H_BUSY multiple times before completing. Invalidating the
tables before deleting the guest so there is less work for the L0 to do.
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231201132618.555031-2-vaibhav@linux.ibm.com
|
|
HeXin Tech Co. has applied for a new PVN from the OpenPower Community
for its new processor C2000. The OpenPower has assigned a new PVN
and this newly assigned PVN is 0x0066, add pvr register related
support for this PVN.
Signed-off-by: Zhao Ke <ke.zhao@shingroup.cn>
Link: https://discuss.openpower.foundation/t/how-to-get-a-new-pvr-for-processors-follow-power-isa/477/10
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231129075845.57976-1-ke.zhao@shingroup.cn
|
|
With NUMA=n and FA_DUMP=y or PRESERVE_FA_DUMP=y the build fails with:
arch/powerpc/kernel/fadump.c:1739:22: error: no previous prototype for ‘arch_reserved_kernel_pages’ [-Werror=missing-prototypes]
1739 | unsigned long __init arch_reserved_kernel_pages(void)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
The prototype for arch_reserved_kernel_pages() is in include/linux/mm.h,
but it's guarded by __HAVE_ARCH_RESERVED_KERNEL_PAGES. The powerpc
headers define __HAVE_ARCH_RESERVED_KERNEL_PAGES in asm/mmzone.h, which
is not included into the generic headers when NUMA=n.
Move the definition of __HAVE_ARCH_RESERVED_KERNEL_PAGES into asm/mmu.h
which is included regardless of NUMA=n.
Additionally the ifdef around __HAVE_ARCH_RESERVED_KERNEL_PAGES needs to
also check for CONFIG_PRESERVE_FA_DUMP.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231130114433.3053544-1-mpe@ellerman.id.au
|
|
With CONFIG_NUMA=n the build fails with:
arch/powerpc/mm/book3s64/pgtable.c:275:15: error: no previous prototype for ‘create_section_mapping’ [-Werror=missing-prototypes]
275 | int __meminit create_section_mapping(unsigned long start, unsigned long end,
| ^~~~~~~~~~~~~~~~~~~~~~
That happens because the prototype for create_section_mapping() is in
asm/mmzone.h, but asm/mmzone.h is only included by linux/mmzone.h
when CONFIG_NUMA=y.
In fact the prototype is only needed by arch/powerpc/mm code, so move
the prototype into arch/powerpc/mm/mmu_decl.h, which also fixes the
build error.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231129131919.2528517-5-mpe@ellerman.id.au
|
|
As part of my quest to enable -Wmissing-prototypes by default,
these patches clean up some of the prototypes that are needed by all
architectures but are handled inconsistently.
The duplicate prototypes are moved into common code, which helps both
to clean up the existing warnings and simplifies the logic.
* asm-generic-prototypes:
arm64: vdso32: Define BUILD_VDSO32_64 to correct prototypes
csky: fix arch_jump_label_transform_static override
arch: add do_page_fault prototypes
arch: add missing prepare_ftrace_return() prototypes
arch: vdso: consolidate gettime prototypes
arch: include linux/cpu.h for trap_init() prototype
arch: fix asm-offsets.c building with -Wmissing-prototypes
arch: consolidate arch_irq_work_raise prototypes
|
|
The rtas_read_config() and rtas_write_config() functions in
kernel/rtas_pci.c have external linkage and two users in arch/powerpc:
the rtas_pci code itself and the pseries platform's "enhanced error
handling" (EEH) support code.
The prototypes for these functions in asm/ppc-pci.h have until now
been guarded by CONFIG_EEH since the only external caller is the
pseries EEH code. However, this presumably has always generated
warnings when built with !CONFIG_EEH and -Wmissing-prototypes:
arch/powerpc/kernel/rtas_pci.c:46:5: error: no previous prototype for
function 'rtas_read_config' [-Werror,-Wmissing-prototypes]
46 | int rtas_read_config(struct pci_dn *pdn, int where,
int size, u32 *val)
arch/powerpc/kernel/rtas_pci.c:98:5: error: no previous prototype for
function 'rtas_write_config' [-Werror,-Wmissing-prototypes]
98 | int rtas_write_config(struct pci_dn *pdn, int where,
int size, u32 val)
The introduction of commit c6345dfa6e3e ("Makefile.extrawarn: turn on
missing-prototypes globally") forces the issue.
The efika and chrp platform code have (static) functions with the same
names but different signatures. We may as well eliminate the potential
for conflicts and confusion by renaming the globally visible versions
as their prototypes get moved out of the CONFIG_EEH-guarded region;
their current names are too generic anyway. Since they operate on
objects of the type 'struct pci_dn *', give them the slightly more
verbose prefix "rtas_pci_dn_" and fix up all the call sites.
Fixes: c6345dfa6e3e ("Makefile.extrawarn: turn on missing-prototypes globally")
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Closes: https://lore.kernel.org/linuxppc-dev/CA+G9fYt0LLXtjSz+Hkf3Fhm-kf0ZQanrhUS+zVZGa3O+Wt2+vg@mail.gmail.com/
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231127-rtas-pci-rw-config-v1-1-385d29ace3df@linux.ibm.com
|
|
Commit fb5a515704d7 ("powerpc: Remove platforms/wsp and associated
pieces") removed the A2 CPU support, but missed removal of reg_a2.h.
None of the defines contained in it are used, with the exception of the
SPRN_TEN* values, but they are also defined in reg_booke.h.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231113043947.1931831-1-mpe@ellerman.id.au
|
|
The prototype was hidden in an #ifdef on x86, which causes a warning:
kernel/irq_work.c:72:13: error: no previous prototype for 'arch_irq_work_raise' [-Werror=missing-prototypes]
Some architectures have a working prototype, while others don't.
Fix this by providing it in only one place that is always visible.
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
The asm-generic/io.h already has default definition, remove unnecessary
arch's defination.
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Brian Cain <bcain@quicinc.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Stanislav Kinsburskii <stanislav.kinsburskii@gmail.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
ioremap_uc() is only meaningful on old x86-32 systems with the PAT
extension, and on ia64 with its slightly unconventional ioremap()
behavior. So remove the ioremap_uc() definition in architecutures
other than x86 and ia64. These architectures all have asm-generic/io.h
included and will have the default ioremap_uc() definition which
returns NULL.
This changes the existing behaviour, while no need to worry about
any breakage because in the only callsite of ioremap_uc(), code
has been adjusted to eliminate the impact. Please see
atyfb_setup_generic() of drivers/video/fbdev/aty/atyfb_base.c.
If any new invocation of ioremap_uc() need be added, please consider
using ioremap() intead or adding a ARCH specific version if necessary.
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Acked-by: Helge Deller <deller@gmx.de> # parisc
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Acked-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> (SuperH)
Cc: linux-alpha@vger.kernel.org
Cc: linux-hexagon@vger.kernel.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-mips@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-sh@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
This header occasionally gains new function declarations without the
leading extern in accordance with current style rules. Leaving the
legacy externs in place is making the header more difficult to read
over time because of the inconsistency. Remove them.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
[mpe: Add names to rtas_call() parameters]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231106-rtas-trivial-v1-7-61847655c51f@linux.ibm.com
|
|
Use scripts/cleanfile to remove instances of trailing space in the
core RTAS code and header.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231106-rtas-trivial-v1-6-61847655c51f@linux.ibm.com
|
|
This is a pseries-specific function declaration that doesn't belong in
rtas.h. Move it to the pseries platform code and adjust
pseries/suspend.c accordingly.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231106-rtas-trivial-v1-5-61847655c51f@linux.ibm.com
|
|
rtas_service_present() has no more users.
rtas_function_implemented() is now the appropriate API for determining
whether a given RTAS function is available to call.
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231106-rtas-trivial-v1-4-61847655c51f@linux.ibm.com
|
|
The call_rtas() function has never been a part of arch/powerpc, and
its implementation was removed from arch/ppc by 0a26b1364f14 ("ppc:
Remove CHRP, POWER3 and POWER4 support from arch/ppc").
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231106-rtas-trivial-v1-3-61847655c51f@linux.ibm.com
|
|
Allmodconfig kernels produce a missing-prototypes warning:
arch/powerpc/platforms/ps3/gelic_udbg.c:239:6: error: no previous prototype for 'udbg_shutdown_ps3gelic' [-Werror=missing-prototypes]
Move the declaration from a local header to asm/ps3.h where it can be
seen from both the caller and the definition.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Geoff Levand <geoff@infradead.org>
Acked-by: Jakub Kicinski <kuba@kernel.org>
[mpe: Drop CONFIG_PS3GELIC_UDBG to fix build error]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231108125843.3806765-18-arnd@kernel.org
|
|
Introduce several new KVM uAPIs to ultimately create a guest-first memory
subsystem within KVM, a.k.a. guest_memfd. Guest-first memory allows KVM
to provide features, enhancements, and optimizations that are kludgly
or outright impossible to implement in a generic memory subsystem.
The core KVM ioctl() for guest_memfd is KVM_CREATE_GUEST_MEMFD, which
similar to the generic memfd_create(), creates an anonymous file and
returns a file descriptor that refers to it. Again like "regular"
memfd files, guest_memfd files live in RAM, have volatile storage,
and are automatically released when the last reference is dropped.
The key differences between memfd files (and every other memory subystem)
is that guest_memfd files are bound to their owning virtual machine,
cannot be mapped, read, or written by userspace, and cannot be resized.
guest_memfd files do however support PUNCH_HOLE, which can be used to
convert a guest memory area between the shared and guest-private states.
A second KVM ioctl(), KVM_SET_MEMORY_ATTRIBUTES, allows userspace to
specify attributes for a given page of guest memory. In the long term,
it will likely be extended to allow userspace to specify per-gfn RWX
protections, including allowing memory to be writable in the guest
without it also being writable in host userspace.
The immediate and driving use case for guest_memfd are Confidential
(CoCo) VMs, specifically AMD's SEV-SNP, Intel's TDX, and KVM's own pKVM.
For such use cases, being able to map memory into KVM guests without
requiring said memory to be mapped into the host is a hard requirement.
While SEV+ and TDX prevent untrusted software from reading guest private
data by encrypting guest memory, pKVM provides confidentiality and
integrity *without* relying on memory encryption. In addition, with
SEV-SNP and especially TDX, accessing guest private memory can be fatal
to the host, i.e. KVM must be prevent host userspace from accessing
guest memory irrespective of hardware behavior.
Long term, guest_memfd may be useful for use cases beyond CoCo VMs,
for example hardening userspace against unintentional accesses to guest
memory. As mentioned earlier, KVM's ABI uses userspace VMA protections to
define the allow guest protection (with an exception granted to mapping
guest memory executable), and similarly KVM currently requires the guest
mapping size to be a strict subset of the host userspace mapping size.
Decoupling the mappings sizes would allow userspace to precisely map
only what is needed and with the required permissions, without impacting
guest performance.
A guest-first memory subsystem also provides clearer line of sight to
things like a dedicated memory pool (for slice-of-hardware VMs) and
elimination of "struct page" (for offload setups where userspace _never_
needs to DMA from or into guest memory).
guest_memfd is the result of 3+ years of development and exploration;
taking on memory management responsibilities in KVM was not the first,
second, or even third choice for supporting CoCo VMs. But after many
failed attempts to avoid KVM-specific backing memory, and looking at
where things ended up, it is quite clear that of all approaches tried,
guest_memfd is the simplest, most robust, and most extensible, and the
right thing to do for KVM and the kernel at-large.
The "development cycle" for this version is going to be very short;
ideally, next week I will merge it as is in kvm/next, taking this through
the KVM tree for 6.8 immediately after the end of the merge window.
The series is still based on 6.6 (plus KVM changes for 6.7) so it
will require a small fixup for changes to get_file_rcu() introduced in
6.7 by commit 0ede61d8589c ("file: convert to SLAB_TYPESAFE_BY_RCU").
The fixup will be done as part of the merge commit, and most of the text
above will become the commit message for the merge.
Pending post-merge work includes:
- hugepage support
- looking into using the restrictedmem framework for guest memory
- introducing a testing mechanism to poison memory, possibly using
the same memory attributes introduced here
- SNP and TDX support
There are two non-KVM patches buried in the middle of this series:
fs: Rename anon_inode_getfile_secure() and anon_inode_getfd_secure()
mm: Add AS_UNMOVABLE to mark mapping as completely unmovable
The first is small and mostly suggested-by Christian Brauner; the second
a bit less so but it was written by an mm person (Vlastimil Babka).
|
|
Convert KVM_ARCH_WANT_MMU_NOTIFIER into a Kconfig and select it where
appropriate to effectively maintain existing behavior. Using a proper
Kconfig will simplify building more functionality on top of KVM's
mmu_notifier infrastructure.
Add a forward declaration of kvm_gfn_range to kvm_types.h so that
including arch/powerpc/include/asm/kvm_ppc.h's with CONFIG_KVM=n doesn't
generate warnings due to kvm_gfn_range being undeclared. PPC defines
hooks for PR vs. HV without guarding them via #ifdeffery, e.g.
bool (*unmap_gfn_range)(struct kvm *kvm, struct kvm_gfn_range *range);
bool (*age_gfn)(struct kvm *kvm, struct kvm_gfn_range *range);
bool (*test_age_gfn)(struct kvm *kvm, struct kvm_gfn_range *range);
bool (*set_spte_gfn)(struct kvm *kvm, struct kvm_gfn_range *range);
Alternatively, PPC could forward declare kvm_gfn_range, but there's no
good reason not to define it in common KVM.
Acked-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Message-Id: <20231027182217.3615211-8-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Finish a refactor of pgprot_framebuffer() which dependend
on some changes that were merged via the drm tree
- Fix some kernel-doc warnings to quieten the bots
Thanks to Nathan Lynch and Thomas Zimmermann.
* tag 'powerpc-6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/rtas: Fix ppc_rtas_rmo_buf_show() kernel-doc
powerpc/pseries/rtas-work-area: Fix rtas_work_area_reserve_arena() kernel-doc
powerpc/fb: Call internal __phys_mem_access_prot() in fbdev code
powerpc: Remove file parameter from phys_mem_access_prot()
powerpc/machdep: Remove trailing whitespaces
|
|
Most architectures that support kprobes declare this function in their
own asm/kprobes.h header and provide an override, but some are missing
the prototype, which causes a warning for the __weak stub implementation:
kernel/kprobes.c:1865:12: error: no previous prototype for 'kprobe_exceptions_notify' [-Werror=missing-prototypes]
1865 | int __weak kprobe_exceptions_notify(struct notifier_block *self,
Move the prototype into linux/kprobes.h so it is visible to all
the definitions.
Link: https://lore.kernel.org/all/20231108125843.3806765-4-arnd@kernel.org/
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
|
|
Call __phys_mem_access_prot() from the fbdev mmap helper
pgprot_framebuffer(). Allows to avoid the file argument of NULL.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230922080636.26762-6-tzimmermann@suse.de
|
|
Remove 'file' parameter from struct machdep_calls.phys_mem_access_prot
and its implementation in pci_phys_mem_access_prot(). The file is not
used on PowerPC. By removing it, a later patch can simplify fbdev's
mmap code, which uses phys_mem_access_prot() on PowerPC.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
[mpe: Rebase on unrelated changes to phys_mem_access_prot()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230922080636.26762-5-tzimmermann@suse.de
|
|
Fix coding style. No functional changes.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230922080636.26762-4-tzimmermann@suse.de
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- Add support for KVM running as a nested hypervisor under development
versions of PowerVM, using the new PAPR nested virtualisation API
- Add support for the BPF prog pack allocator
- A rework of the non-server MMU handling to support execute-only on
all platforms
- Some optimisations & cleanups for the powerpc qspinlock code
- Various other small features and fixes
Thanks to Aboorva Devarajan, Aditya Gupta, Amit Machhiwal, Benjamin
Gray, Christophe Leroy, Dr. David Alan Gilbert, Gaurav Batra, Gautam
Menghani, Geert Uytterhoeven, Haren Myneni, Hari Bathini, Joel Stanley,
Jordan Niethe, Julia Lawall, Kautuk Consul, Kuan-Wei Chiu, Michael
Neuling, Minjie Du, Muhammad Muzammil, Naveen N Rao, Nicholas Piggin,
Nick Child, Nysal Jan K.A, Peter Lafreniere, Rob Herring, Sachin Sant,
Sebastian Andrzej Siewior, Shrikanth Hegde, Srikar Dronamraju, Stanislav
Kinsburskii, Vaibhav Jain, Wang Yufen, Yang Yingliang, and Yuan Tan.
* tag 'powerpc-6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (100 commits)
powerpc/vmcore: Add MMU information to vmcoreinfo
Revert "powerpc: add `cur_cpu_spec` symbol to vmcoreinfo"
powerpc/bpf: use bpf_jit_binary_pack_[alloc|finalize|free]
powerpc/bpf: rename powerpc64_jit_data to powerpc_jit_data
powerpc/bpf: implement bpf_arch_text_invalidate for bpf_prog_pack
powerpc/bpf: implement bpf_arch_text_copy
powerpc/code-patching: introduce patch_instructions()
powerpc/32s: Implement local_flush_tlb_page_psize()
powerpc/pseries: use kfree_sensitive() in plpks_gen_password()
powerpc/code-patching: Perform hwsync in __patch_instruction() in case of failure
powerpc/fsl_msi: Use device_get_match_data()
powerpc: Remove cpm_dp...() macros
powerpc/qspinlock: Rename yield_propagate_owner tunable
powerpc/qspinlock: Propagate sleepy if previous waiter is preempted
powerpc/qspinlock: don't propagate the not-sleepy state
powerpc/qspinlock: propagate owner preemptedness rather than CPU number
powerpc/qspinlock: stop queued waiters trying to set lock sleepy
powerpc/perf: Fix disabling BHRB and instruction sampling
powerpc/trace: Add support for HAVE_FUNCTION_ARG_ACCESS_API
powerpc/tools: Pass -mabi=elfv2 to gcc-check-mprofile-kernel.sh
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
"Many singleton patches against the MM code. The patch series which are
included in this merge do the following:
- Kemeng Shi has contributed some compation maintenance work in the
series 'Fixes and cleanups to compaction'
- Joel Fernandes has a patchset ('Optimize mremap during mutual
alignment within PMD') which fixes an obscure issue with mremap()'s
pagetable handling during a subsequent exec(), based upon an
implementation which Linus suggested
- More DAMON/DAMOS maintenance and feature work from SeongJae Park i
the following patch series:
mm/damon: misc fixups for documents, comments and its tracepoint
mm/damon: add a tracepoint for damos apply target regions
mm/damon: provide pseudo-moving sum based access rate
mm/damon: implement DAMOS apply intervals
mm/damon/core-test: Fix memory leaks in core-test
mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval
- In the series 'Do not try to access unaccepted memory' Adrian
Hunter provides some fixups for the recently-added 'unaccepted
memory' feature. To increase the feature's checking coverage. 'Plug
a few gaps where RAM is exposed without checking if it is
unaccepted memory'
- In the series 'cleanups for lockless slab shrink' Qi Zheng has done
some maintenance work which is preparation for the lockless slab
shrinking code
- Qi Zheng has redone the earlier (and reverted) attempt to make slab
shrinking lockless in the series 'use refcount+RCU method to
implement lockless slab shrink'
- David Hildenbrand contributes some maintenance work for the rmap
code in the series 'Anon rmap cleanups'
- Kefeng Wang does more folio conversions and some maintenance work
in the migration code. Series 'mm: migrate: more folio conversion
and unification'
- Matthew Wilcox has fixed an issue in the buffer_head code which was
causing long stalls under some heavy memory/IO loads. Some cleanups
were added on the way. Series 'Add and use bdev_getblk()'
- In the series 'Use nth_page() in place of direct struct page
manipulation' Zi Yan has fixed a potential issue with the direct
manipulation of hugetlb page frames
- In the series 'mm: hugetlb: Skip initialization of gigantic tail
struct pages if freed by HVO' has improved our handling of gigantic
pages in the hugetlb vmmemmep optimizaton code. This provides
significant boot time improvements when significant amounts of
gigantic pages are in use
- Matthew Wilcox has sent the series 'Small hugetlb cleanups' - code
rationalization and folio conversions in the hugetlb code
- Yin Fengwei has improved mlock()'s handling of large folios in the
series 'support large folio for mlock'
- In the series 'Expose swapcache stat for memcg v1' Liu Shixin has
added statistics for memcg v1 users which are available (and
useful) under memcg v2
- Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable)
prctl so that userspace may direct the kernel to not automatically
propagate the denial to child processes. The series is named 'MDWE
without inheritance'
- Kefeng Wang has provided the series 'mm: convert numa balancing
functions to use a folio' which does what it says
- In the series 'mm/ksm: add fork-exec support for prctl' Stefan
Roesch makes is possible for a process to propagate KSM treatment
across exec()
- Huang Ying has enhanced memory tiering's calculation of memory
distances. This is used to permit the dax/kmem driver to use 'high
bandwidth memory' in addition to Optane Data Center Persistent
Memory Modules (DCPMM). The series is named 'memory tiering:
calculate abstract distance based on ACPI HMAT'
- In the series 'Smart scanning mode for KSM' Stefan Roesch has
optimized KSM by teaching it to retain and use some historical
information from previous scans
- Yosry Ahmed has fixed some inconsistencies in memcg statistics in
the series 'mm: memcg: fix tracking of pending stats updates
values'
- In the series 'Implement IOCTL to get and optionally clear info
about PTEs' Peter Xu has added an ioctl to /proc/<pid>/pagemap
which permits us to atomically read-then-clear page softdirty
state. This is mainly used by CRIU
- Hugh Dickins contributed the series 'shmem,tmpfs: general
maintenance', a bunch of relatively minor maintenance tweaks to
this code
- Matthew Wilcox has increased the use of the VMA lock over
file-backed page faults in the series 'Handle more faults under the
VMA lock'. Some rationalizations of the fault path became possible
as a result
- In the series 'mm/rmap: convert page_move_anon_rmap() to
folio_move_anon_rmap()' David Hildenbrand has implemented some
cleanups and folio conversions
- In the series 'various improvements to the GUP interface' Lorenzo
Stoakes has simplified and improved the GUP interface with an eye
to providing groundwork for future improvements
- Andrey Konovalov has sent along the series 'kasan: assorted fixes
and improvements' which does those things
- Some page allocator maintenance work from Kemeng Shi in the series
'Two minor cleanups to break_down_buddy_pages'
- In thes series 'New selftest for mm' Breno Leitao has developed
another MM self test which tickles a race we had between madvise()
and page faults
- In the series 'Add folio_end_read' Matthew Wilcox provides cleanups
and an optimization to the core pagecache code
- Nhat Pham has added memcg accounting for hugetlb memory in the
series 'hugetlb memcg accounting'
- Cleanups and rationalizations to the pagemap code from Lorenzo
Stoakes, in the series 'Abstract vma_merge() and split_vma()'
- Audra Mitchell has fixed issues in the procfs page_owner code's new
timestamping feature which was causing some misbehaviours. In the
series 'Fix page_owner's use of free timestamps'
- Lorenzo Stoakes has fixed the handling of new mappings of sealed
files in the series 'permit write-sealed memfd read-only shared
mappings'
- Mike Kravetz has optimized the hugetlb vmemmap optimization in the
series 'Batch hugetlb vmemmap modification operations'
- Some buffer_head folio conversions and cleanups from Matthew Wilcox
in the series 'Finish the create_empty_buffers() transition'
- As a page allocator performance optimization Huang Ying has added
automatic tuning to the allocator's per-cpu-pages feature, in the
series 'mm: PCP high auto-tuning'
- Roman Gushchin has contributed the patchset 'mm: improve
performance of accounted kernel memory allocations' which improves
their performance by ~30% as measured by a micro-benchmark
- folio conversions from Kefeng Wang in the series 'mm: convert page
cpupid functions to folios'
- Some kmemleak fixups in Liu Shixin's series 'Some bugfix about
kmemleak'
- Qi Zheng has improved our handling of memoryless nodes by keeping
them off the allocation fallback list. This is done in the series
'handle memoryless nodes more appropriately'
- khugepaged conversions from Vishal Moola in the series 'Some
khugepaged folio conversions'"
[ bcachefs conflicts with the dynamically allocated shrinkers have been
resolved as per Stephen Rothwell in
https://lore.kernel.org/all/20230913093553.4290421e@canb.auug.org.au/
with help from Qi Zheng.
The clone3 test filtering conflict was half-arsed by yours truly ]
* tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (406 commits)
mm/damon/sysfs: update monitoring target regions for online input commit
mm/damon/sysfs: remove requested targets when online-commit inputs
selftests: add a sanity check for zswap
Documentation: maple_tree: fix word spelling error
mm/vmalloc: fix the unchecked dereference warning in vread_iter()
zswap: export compression failure stats
Documentation: ubsan: drop "the" from article title
mempolicy: migration attempt to match interleave nodes
mempolicy: mmap_lock is not needed while migrating folios
mempolicy: alloc_pages_mpol() for NUMA policy without vma
mm: add page_rmappable_folio() wrapper
mempolicy: remove confusing MPOL_MF_LAZY dead code
mempolicy: mpol_shared_policy_init() without pseudo-vma
mempolicy trivia: use pgoff_t in shared mempolicy tree
mempolicy trivia: slightly more consistent naming
mempolicy trivia: delete those ancient pr_debug()s
mempolicy: fix migrate_pages(2) syscall return nr_failed
kernfs: drop shared NUMA mempolicy hooks
hugetlbfs: drop shared NUMA mempolicy pretence
mm/damon/sysfs-test: add a unit test for damon_sysfs_set_targets()
...
|
|
Pull drm updates from Dave Airlie:
"Highlights:
- AMD adds some more upcoming HW platforms
- Intel made Meteorlake stable and started adding Lunarlake
- nouveau has a bunch of display rework in prepartion for the NVIDIA
GSP firmware support
- msm adds a7xx support
- habanalabs has finished migration to accel subsystem
Detail summary:
kernel:
- add initial vmemdup-user-array
core:
- fix platform remove() to return void
- drm_file owner updated to reflect owner
- move size calcs to drm buddy allocator
- let GPUVM build as a module
- allow variable number of run-queues in scheduler
edid:
- handle bad h/v sync_end in EDIDs
panfrost:
- add Boris as maintainer
fbdev:
- use fb_ops helpers more
- only allow logo use from fbcon
- rename fb_pgproto to pgprot_framebuffer
- add HPD state to drm_connector_oob_hotplug_event
- convert to fbdev i/o mem helpers
i915:
- Enable meteorlake by default
- Early Xe2 LPD/Lunarlake display enablement
- Rework subplatforms into IP version checks
- GuC based TLB invalidation for Meteorlake
- Display rework for future Xe driver integration
- LNL FBC features
- LNL display feature capability reads
- update recommended fw versions for DG2+
- drop fastboot module parameter
- added deviceid for Arrowlake-S
- drop preproduction workarounds
- don't disable preemption for resets
- cleanup inlines in headers
- PXP firmware loading fix
- Fix sg list lengths
- DSC PPS state readout/verification
- Add more RPL P/U PCI IDs
- Add new DG2-G12 stepping
- DP enhanced framing support to state checker
- Improve shared link bandwidth management
- stop using GEM macros in display code
- refactor related code into display code
- locally enable W=1 warnings
- remove PSR watchdog timers on LNL
amdgpu:
- RAS/FRU EEPROM updatse
- IP discovery updatses
- GC 11.5 support
- DCN 3.5 support
- VPE 6.1 support
- NBIO 7.11 support
- DML2 support
- lots of IP updates
- use flexible arrays for bo list handling
- W=1 fixes
- Enable seamless boot in more cases
- Enable context type property for HDMI
- Rework GPUVM TLB flushing
- VCN IB start/size alignment fixes
amdkfd:
- GC 10/11 fixes
- GC 11.5 support
- use partial migration in GPU faults
radeon:
- W=1 Fixes
- fix some possible buffer overflow/NULL derefs
nouveau:
- update uapi for NO_PREFETCH
- scheduler/fence fixes
- rework suspend/resume for GSP-RM
- rework display in preparation for GSP-RM
habanalabs:
- uapi: expose tsc clock
- uapi: block access to eventfd through control device
- uapi: force dma-buf export to PAGE_SIZE alignments
- complete move to accel subsystem
- move firmware interface include files
- perform hard reset on PCIe AXI drain event
- optimise user interrupt handling
msm:
- DP: use existing helpers for DPCD
- DPU: interrupts reworked
- gpu: a7xx (a730/a740) support
- decouple msm_drv from kms for headless devices
mediatek:
- MT8188 dsi/dp/edp support
- DDP GAMMA - 12 bit LUT support
- connector dynamic selection capability
rockchip:
- rv1126 mipi-dsi/vop support
- add planar formats
ast:
- rename constants
panels:
- Mitsubishi AA084XE01
- JDI LPM102A188A
- LTK050H3148W-CTA6
ivpu:
- power management fixes
qaic:
- add detach slice bo api
komeda:
- add NV12 writeback
tegra:
- support NVSYNC/NHSYNC
- host1x suspend fixes
ili9882t:
- separate into own driver"
* tag 'drm-next-2023-10-31-1' of git://anongit.freedesktop.org/drm/drm: (1803 commits)
drm/amdgpu: Remove unused variables from amdgpu_show_fdinfo
drm/amdgpu: Remove duplicate fdinfo fields
drm/amd/amdgpu: avoid to disable gfxhub interrupt when driver is unloaded
drm/amdgpu: Add EXT_COHERENT support for APU and NUMA systems
drm/amdgpu: Retrieve CE count from ce_count_lo_chip in EccInfo table
drm/amdgpu: Identify data parity error corrected in replay mode
drm/amdgpu: Fix typo in IP discovery parsing
drm/amd/display: fix S/G display enablement
drm/amdxcp: fix amdxcp unloads incompletely
drm/amd/amdgpu: fix the GPU power print error in pm info
drm/amdgpu: Use pcie domain of xcc acpi objects
drm/amd: check num of link levels when update pcie param
drm/amdgpu: Add a read to GFX v9.4.3 ring test
drm/amd/pm: call smu_cmn_get_smc_version in is_mode1_reset_supported.
drm/amdgpu: get RAS poison status from DF v4_6_2
drm/amdgpu: Use discovery table's subrevision
drm/amd/display: 3.2.256
drm/amd/display: add interface to query SubVP status
drm/amd/display: Read before writing Backlight Mode Set Register
drm/amd/display: Disable SYMCLK32_SE RCO on DCN314
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Info Molnar:
"Futex improvements:
- Add the 'futex2' syscall ABI, which is an attempt to get away from
the multiplex syscall and adds a little room for extentions, while
lifting some limitations.
- Fix futex PI recursive rt_mutex waiter state bug
- Fix inter-process shared futexes on no-MMU systems
- Use folios instead of pages
Micro-optimizations of locking primitives:
- Improve arch_spin_value_unlocked() on asm-generic ticket spinlock
architectures, to improve lockref code generation
- Improve the x86-32 lockref_get_not_zero() main loop by adding
build-time CMPXCHG8B support detection for the relevant lockref
code, and by better interfacing the CMPXCHG8B assembly code with
the compiler
- Introduce arch_sync_try_cmpxchg() on x86 to improve
sync_try_cmpxchg() code generation. Convert some sync_cmpxchg()
users to sync_try_cmpxchg().
- Micro-optimize rcuref_put_slowpath()
Locking debuggability improvements:
- Improve CONFIG_DEBUG_RT_MUTEXES=y to have a fast-path as well
- Enforce atomicity of sched_submit_work(), which is de-facto atomic
but was un-enforced previously.
- Extend <linux/cleanup.h>'s no_free_ptr() with __must_check
semantics
- Fix ww_mutex self-tests
- Clean up const-propagation in <linux/seqlock.h> and simplify the
API-instantiation macros a bit
RT locking improvements:
- Provide the rt_mutex_*_schedule() primitives/helpers and use them
in the rtmutex code to avoid recursion vs. rtlock on the PI state.
- Add nested blocking lockdep asserts to rt_mutex_lock(),
rtlock_lock() and rwbase_read_lock()
.. plus misc fixes & cleanups"
* tag 'locking-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
futex: Don't include process MM in futex key on no-MMU
locking/seqlock: Fix grammar in comment
alpha: Fix up new futex syscall numbers
locking/seqlock: Propagate 'const' pointers within read-only methods, remove forced type casts
locking/lockdep: Fix string sizing bug that triggers a format-truncation compiler-warning
locking/seqlock: Change __seqprop() to return the function pointer
locking/seqlock: Simplify SEQCOUNT_LOCKNAME()
locking/atomics: Use atomic_try_cmpxchg_release() to micro-optimize rcuref_put_slowpath()
locking/atomic, xen: Use sync_try_cmpxchg() instead of sync_cmpxchg()
locking/atomic/x86: Introduce arch_sync_try_cmpxchg()
locking/atomic: Add generic support for sync_try_cmpxchg() and its fallback
locking/seqlock: Fix typo in comment
futex/requeue: Remove unnecessary ‘NULL’ initialization from futex_proxy_trylock_atomic()
locking/local, arch: Rewrite local_add_unless() as a static inline function
locking/debug: Fix debugfs API return value checks to use IS_ERR()
locking/ww_mutex/test: Make sure we bail out instead of livelock
locking/ww_mutex/test: Fix potential workqueue corruption
locking/ww_mutex/test: Use prng instead of rng to avoid hangs at bootup
futex: Add sys_futex_requeue()
futex: Add flags2 argument to futex_requeue()
...
|
|
Merge our KVM topic branch, this has been independently included in linux-next
for most of the development cycle.
|
|
patch_instruction() entails setting up pte, patching the instruction,
clearing the pte and flushing the tlb. If multiple instructions need
to be patched, every instruction would have to go through the above
drill unnecessarily. Instead, introduce patch_instructions() function
that sets up the pte, clears the pte and flushes the tlb only once
per page range of instructions to be patched. Duplicate most of the
patch_instruction() code instead of merging with it, to avoid the
performance degradation observed on ppc32, for patch_instruction(),
with the code path merged. Also, setup poking_init() always as BPF
expects poking_init() to be setup even when STRICT_KERNEL_RWX is off.
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231020141358.643575-2-hbathini@linux.ibm.com
|
|
There's a single call to local_flush_tlb_page_psize() in the code
patching code. That call is never executed on 32-bit Book3S,
because it's guarded by mm_patch_enabled() which is essentially a
radix_enabled() check, which is always false on 32s.
However depending on how the optimiser sees things it may still trip
over the BUILD_BUG() in the 32s stub of local_flush_tlb_page_psize().
To avoid that, implement it in terms of flush_range() so that if it ever
becomes called it should function, even if not optimally.
Note that flush_range() deals with page aligning the address and so on,
and that 32s doesn't support huge pages so there should be no issue with
non-standard page sizes needing to be flushed.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231023092319.1507325-1-mpe@ellerman.id.au
|
|
This is needed to add the msm pr which is based on a higher base.
Signed-off-by: Dave Airlie <airlied@redhat.com>
|
|
Since commit d3c511ac1d72 ("powerpc/cpm: Remove
!CONFIG_PPC_CPM_NEW_BINDING code") cpm_dp...() macros have no added
value anymore.
Last user of those macros were fixed by commit 5e6cb39a256d ("net:
fs_enet: Use cpm_muram_xxx() functions instead of cpm_dpxxx() macros")
Remove them.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/3aaa40bf706afeab8fe9a74b8437704a4269a6a2.1697005615.git.christophe.leroy@csgroup.eu
|
|
When creating a kprobe on function entry through tracefs, enable
arguments to be recorded to be specified using $argN syntax.
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230614085926.2176641-1-naveen@kernel.org
|
|
PowerVM Hypervisor dispatches on a whole core basis. In a shared LPAR, a
CPU from a core that is CEDED or preempted may have a larger latency. In
such a scenario, its preferable to choose a different CPU to run.
If one of the CPUs in the core is active, i.e neither CEDED nor
preempted, then consider this CPU as not preempted.
Also if any of the CPUs in the core has yielded but OS has not requested
CEDE or CONFER, then consider this CPU to be preempted.
Correct detection of preempted CPUs is important for detecting idle
CPUs/cores in task scheduler.
Tested-by: Aboorva Devarajan <aboorvad@linux.vnet.ibm.com>
Reviewed-by: Shrikanth Hegde <sshegde@linux.vnet.ibm.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231019091452.95260-1-srikar@linux.vnet.ibm.com
|
|
virt_to_phys() doesn't need the address pointer to be mutable.
At the same time allowing it to be mutable leads to the following build
warning for constant pointers:
warning: passing argument 1 of ‘virt_to_phys’ discards ‘const’ qualifier from pointer target type
Signed-off-by: Stanislav Kinsburskii <stanislav.kinsburskii@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/168155747391.13678.10634415747614468991.stgit@skinsburskii.localdomain
|
|
Sparse reports dereference of a __user pointer. copy_mc_to_user() takes
a __user pointer, verifies it, then calls the generic copy routine
copy_mc_generic().
As we have verified the pointer, cast out the __user annotation when
passing to copy_mc_generic().
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231011053711.93427-10-bgray@linux.ibm.com
|
|
Sparse reports an endian mismatch with args to opal_int_get_xirr().
Checking the skiboot source[1] shows the function takes a __be32*
(as expected), so update the function declaration to reflect this.
[1]: https://github.com/open-power/skiboot/blob/80e2b1dc73/hw/xive.c#L3479
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231011053711.93427-9-bgray@linux.ibm.com
|
|
Sparse reports endianness mismatches in the KVM shared regs getter and
setter helpers.
This code has dynamic endianness behind a safe interface, so a force is
warranted here to tell sparse this is OK.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231011053711.93427-8-bgray@linux.ibm.com
|
|
Sparse reports several endianness warnings on variables and functions
that are consistently treated as big endian. There are no
multi-endianness shenanigans going on here so fix these low hanging
fruit up in one patch.
All changes are just type annotations; no endianness switching
operations are introduced by this patch.
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20231011053711.93427-7-bgray@linux.ibm.com
|
|
Introduce PAGE_EXECONLY_X macro which provides exec-only rights.
The _X may be seen as redundant with the EXECONLY but it helps
keep consistency, all macros having the EXEC right have _X.
And put it next to PAGE_NONE as PAGE_EXECONLY_X is
somehow PAGE_NONE + EXEC just like all other SOMETHING_X are
just SOMETHING + EXEC.
On book3s/64 PAGE_EXECONLY becomes PAGE_READONLY_X.
On book3s/64, as PAGE_EXECONLY is only valid for Radix add
VM_READ flag in vm_get_page_prot() for non-Radix.
And update access_error() so that a non exec fault on a VM_EXEC only
mapping is always invalid, even when the underlying layer don't
always generate a fault for that.
For 8xx, set PAGE_EXECONLY_X as _PAGE_NA | _PAGE_EXEC.
For others, only set it as just _PAGE_EXEC
With that change, 8xx, e500 and 44x fully honor execute-only
protection.
On 40x that is a partial implementation of execute-only. The
implementation won't be complete because once a TLB has been loaded
via the Instruction TLB miss handler, it will be possible to read
the page. But at least it can't be read unless it is executed first.
On 603 MMU, TLB missed are handled by SW and there are separate
DTLB and ITLB. Execute-only is therefore now supported by not loading
DTLB when read access is not permitted.
On hash (604) MMU it is more tricky because hash table is common to
load/store and execute. Nevertheless it is still possible to check
whether _PAGE_READ is set before loading hash table for a load/store
access. At least it can't be read unless it is executed first.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/4283ea9cbef9ff2fbee468904800e1962bc8fc18.1695659959.git.christophe.leroy@csgroup.eu
|
|
_PAGE_USER is now gone on all targets. Remove it completely.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/76ebe74fdaed4297a1d8203a61174650c1d8d278.1695659959.git.christophe.leroy@csgroup.eu
|
|
On 603 MMU, TLB missed are handled by SW and there are separated
DTLB and ITLB. It is therefore possible to implement execute-only
protection by not loading DTLB when read access is not permitted.
To do that, _PAGE_READ flag is needed but there is no bit available
for it in PTE. On the other hand the only real use of _PAGE_USER is
to implement PAGE_NONE by clearing _PAGE_USER.
As _PAGE_NONE can also be implemented by clearing _PAGE_READ, remove
_PAGE_USER and add _PAGE_READ. Then use the virtual address to know
whether user rights or kernel rights are to be used.
With that change, 603 MMU now honors execute-only protection.
For hash (604) MMU it is more tricky because hash table is common to
load/store and execute. Nevertheless it is still possible to check
whether _PAGE_READ is set before loading hash table for a load/store
access. At least it can't be read unless it is executed first.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/b7702dd5a041ec59055ed2880f4952e94c087a2e.1695659959.git.christophe.leroy@csgroup.eu
|
|
Several places, _PAGE_RW maps to write permission and don't
always imply read. To make it more clear, do as book3s/64 in
commit c7d54842deb1 ("powerpc/mm: Use _PAGE_READ to indicate
Read access") and use _PAGE_WRITE when more relevant.
For the time being _PAGE_WRITE is equivalent to _PAGE_RW but that
will change when _PAGE_READ gets added in following patches.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/5798782869fe4d2698f104948dabd17657b89395.1695659959.git.christophe.leroy@csgroup.eu
|
|
_PAGE_USER is used to select the zone. Today zone 0 is kernel
and zone 1 is user.
To implement _PAGE_NONE, _PAGE_USER is cleared, leading to no access
for user but kernel still has access to the page so it's possible for
a user application to write in that page by using a kernel function
as trampoline.
What is really wanted is to have user rights on pages below TASK_SIZE
and no user rights on pages above TASK_SIZE. Use zones for that.
There are 16 zones so lets use the 4 upper address bits to set the
zone and declare zone rights based on TASK_SIZE.
Then drop _PAGE_USER and reuse it as _PAGE_READ that will be checked
in Data TLB miss handler. That will properly handle PAGE_NONE for
both kernel and user.
In addition, it partially implements execute-only right. The
implementation won't be complete because once a TLB has been loaded
via the Instruction TLB miss handler, it will be possible to read
the page. But at least it can't be read unless it is executed first.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/2a13e3ba8a5dec43143cc1f9a91ec71ea1529f3c.1695659959.git.christophe.leroy@csgroup.eu
|
|
44x MMU has 6 page protection bits:
- R, W, X for supervisor
- R, W, X for user
It means that it can support X without R.
To do that, _PAGE_READ flag is needed but there is no bit available
for it in PTE. On the other hand the only real use of _PAGE_USER is
to implement PAGE_NONE by clearing _PAGE_USER.
As _PAGE_NONE can also be implemented by clearing _PAGE_READ,
remove _PAGE_USER and add _PAGE_READ. In order to insert bits in
one go during TLB miss, move _PAGE_ACCESSED and put _PAGE_READ
just after _PAGE_DIRTY so that _PAGE_DIRTY is copied into SW and
_PAGE_READ into SR at once.
With that change, 44x now also honors execute-only protection.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/043e17987b260b99b45094138c6cb2e89e63d499.1695659959.git.christophe.leroy@csgroup.eu
|
|
e500 MMU has 6 page protection bits:
- R, W, X for supervisor
- R, W, X for user
It means that it can support X without R.
To do that, _PAGE_READ flag is needed.
With 32 bits PTE there is no bit available for it in PTE. On the
other hand the only real use of _PAGE_USER is to implement PAGE_NONE
by clearing _PAGE_USER. As _PAGE_NONE can also be implemented by
clearing _PAGE_READ, remove _PAGE_USER and add _PAGE_READ. Move
_PAGE_PRESENT into bit 30 so that _PAGE_READ can match SR bit.
With 64 bits PTE _PAGE_USER is already the combination of SR and UR
so all we need to do is to rename it _PAGE_READ.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/0849ab6bf7ae2af23f94b0457fa40d0ea3983fe4.1695659959.git.christophe.leroy@csgroup.eu
|