Age | Commit message (Collapse) | Author |
|
Replace an of_get_property() call by of_property_match_string()
so that this function implementation can be simplified.
Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Link: https://lore.kernel.org/linuxppc-dev/d9bdc1b6-ea7e-47aa-80aa-02ae649abf72@csgroup.eu/
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/linuxppc-dev/87cyk97ufp.fsf@mail.lhotse/
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/ede25e03-7a14-4787-ae1b-4fc9290add5a@web.de
|
|
Commit 384e338a9187 ("powerpc: drop MPC8540_ADS and MPC8560_ADS platform
support") and commit b751ed04bc5e ("powerpc: drop MPC85xx_CDS platform
support") removes the platform support for MPC8540_ADS, MPC8560_ADS and
MPC85xx_CDS in the source tree, but misses to remove the config options in
the Kconfig file. Hence, these three config options are without any effect
since then.
Drop these three dead config options.
Fixes: 384e338a9187 ("powerpc: drop MPC8540_ADS and MPC8560_ADS platform support")
Fixes: b751ed04bc5e ("powerpc: drop MPC85xx_CDS platform support")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20240927095203.392365-1-lukas.bulwahn@redhat.com
|
|
Replace `cpumask_any_and(a, b) >= nr_cpu_ids`
with the more readable `!cpumask_intersects(a, b)`.
Comparison between cpumask_any_and() and cpumask_intersects()
The cpumask_any_and() function expands using FIND_FIRST_BIT(),
resulting in a loop that iterates through each bit of the bitmask:
for (idx = 0; idx * BITS_PER_LONG < sz; idx++) {
val = (FETCH);
if (val) {
sz = min(idx * BITS_PER_LONG + __ffs(MUNGE(val)), sz);
break;
}
}
The cpumask_intersects() function expands using __bitmap_intersects(),
resulting in that the first loop iterates through each long word of the bitmask,
and the second through each bit within a long word:
unsigned int k, lim = bits/BITS_PER_LONG;
for (k = 0; k < lim; ++k)
if (bitmap1[k] & bitmap2[k])
return true;
if (bits % BITS_PER_LONG)
if ((bitmap1[k] & bitmap2[k]) & BITMAP_LAST_WORD_MASK(bits))
return true;
Conclusion: cpumask_intersects() is at least as efficient as cpumask_any_and(),
if not more so, as it typically performs fewer loops and comparisons.
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20240926092623.399577-2-costa.shul@redhat.com
|
|
Currently this cannot lookup symbol beyond 64 characters in some cases
like "ls", "lp" and "t"
Fix this by using KSYM_NAME_LEN instead of fixed 64 characters
Signed-off-by: Mukesh Kumar Chaurasiya <mchauras@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241024191232.1570894-2-mchauras@linux.ibm.com
|
|
The correct format string for resource_size_t is %pa which
acts on the address of the variable to be formatted [1].
[1] https://elixir.bootlin.com/linux/v6.11.3/source/Documentation/core-api/printk-formats.rst#L229
Introduced by commit 9d9326d3bc0e ("phy: Change mii_bus id field to a string")
Flagged by gcc-14 as:
arch/powerpc/platforms/82xx/ep8248e.c: In function 'ep8248e_mdio_probe':
arch/powerpc/platforms/82xx/ep8248e.c:131:46: warning: format '%x' expects argument of type 'unsigned int', but argument 4 has type 'resource_size_t' {aka 'long long unsigned int'} [-Wformat=]
131 | snprintf(bus->id, MII_BUS_ID_SIZE, "%x", res.start);
| ~^ ~~~~~~~~~
| | |
| | resource_size_t {aka long long unsigned int}
| unsigned int
| %llx
No functional change intended.
Compile tested only.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/netdev/711d7f6d-b785-7560-f4dc-c6aad2cce99@linux-m68k.org/
Signed-off-by: Simon Horman <horms@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241014-ep8248e-pa-fmt-v1-1-009ea0dcc18f@kernel.org
|
|
Reorganize kerneldoc parameter names to match the parameter
order in the function header.
Problems identified using Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20240930112121.95324-12-Julia.Lawall@inria.fr
|
|
Fix typo in the following kvm function names from:
kmvhv_counters_tracepoint_regfunc -> kvmhv_counters_tracepoint_regfunc
kmvhv_counters_tracepoint_unregfunc -> kvmhv_counters_tracepoint_unregfunc
Fixes: e1f288d2f9c6 ("KVM: PPC: Book3S HV nestedv2: Add support for reading VPA counters for pseries guests")
Reported-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Amit Machhiwal <amachhiw@linux.ibm.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241114085020.1147912-1-kjain@linux.ibm.com
|
|
Make sure PowerPC uses the arch-specific function now that those have
been reorganized.
Signed-off-by: Colton Lewis <coltonlewis@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Acked-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://lore.kernel.org/r/20241113190156.2145593-4-coltonlewis@google.com
|
|
For clarity, rename the arch-specific definitions of these functions
to perf_arch_* to denote they are arch-specifc. Define the
generic-named functions in one place where they can call the
arch-specific ones as needed.
Signed-off-by: Colton Lewis <coltonlewis@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Acked-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20241113190156.2145593-3-coltonlewis@google.com
|
|
These functions are not used outside of sstep.c
Fixes: 350779a29f11 ("powerpc: Handle most loads and stores in instruction emulation code")
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241001130356.14664-1-msuchanek@suse.de
|
|
Commit 6398326b9ba1 ("KVM: PPC: Book3S HV P9: Stop using vc->dpdes")
dropped the use of vcore->dpdes for msgsndp / SMT emulation. Prior to that
commit, the below code at L1 level (see [1] for terminology) was
responsible for setting vc->dpdes for the respective L2 vCPU:
if (!nested) {
kvmppc_core_prepare_to_enter(vcpu);
if (vcpu->arch.doorbell_request) {
vc->dpdes = 1;
smp_wmb();
vcpu->arch.doorbell_request = 0;
}
L1 then sent vc->dpdes to L0 via kvmhv_save_hv_regs(), and while
servicing H_ENTER_NESTED at L0, the below condition at L0 level made sure
to abort and go back to L1 if vcpu->arch.doorbell_request = 1 so that L1
sets vc->dpdes as per above if condition:
} else if (vcpu->arch.pending_exceptions ||
vcpu->arch.doorbell_request ||
xive_interrupt_pending(vcpu)) {
vcpu->arch.ret = RESUME_HOST;
goto out;
}
This worked fine since vcpu->arch.doorbell_request was used more like a
flag and vc->dpdes was used to pass around the doorbell state. But after
Commit 6398326b9ba1 ("KVM: PPC: Book3S HV P9: Stop using vc->dpdes"),
vcpu->arch.doorbell_request is the only variable used to pass around
doorbell state.
With the plumbing for handling doorbells for nested guests updated to use
vcpu->arch.doorbell_request over vc->dpdes, the above "else if" stops
doorbells from working correctly as L0 aborts execution of L2 and
instead goes back to L1.
Remove vcpu->arch.doorbell_request from the above "else if" condition as
it is no longer needed for L0 to correctly handle the doorbell status
while running L2.
[1] Terminology
1. L0 : PowerNV linux running with HV privileges
2. L1 : Pseries KVM guest running on top of L0
2. L2 : Nested KVM guest running on top of L1
Fixes: 6398326b9ba1 ("KVM: PPC: Book3S HV P9: Stop using vc->dpdes")
Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241109063301.105289-4-gautam@linux.ibm.com
|
|
commit 6398326b9ba1 ("KVM: PPC: Book3S HV P9: Stop using vc->dpdes")
introduced an optimization to use only vcpu->doorbell_request for SMT
emulation for Power9 and above guests, but the code for nested guests
still relies on the old way of handling doorbells, due to which an L2
guest (see [1]) cannot be booted with XICS with SMT>1. The command to
repro this issue is:
// To be run in L1
qemu-system-ppc64 \
-drive file=rhel.qcow2,format=qcow2 \
-m 20G \
-smp 8,cores=1,threads=8 \
-cpu host \
-nographic \
-machine pseries,ic-mode=xics -accel kvm
Fix the plumbing to utilize vcpu->doorbell_request instead of vcore->dpdes
for nested KVM guests on P9 and above.
[1] Terminology
1. L0 : PowerNV linux running with HV privileges
2. L1 : Pseries KVM guest running on top of L0
2. L2 : Nested KVM guest running on top of L1
Fixes: 6398326b9ba1 ("KVM: PPC: Book3S HV P9: Stop using vc->dpdes")
Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241109063301.105289-3-gautam@linux.ibm.com
|
|
This reverts commit 7c3ded5735141ff4d049747c9f76672a8b737c49.
On PowerNV, when a nested guest tries to use a feature prohibited by
HFSCR, the nested hypervisor (L1) should get a H_FAC_UNAVAILABLE trap
so that L1 can emulate the feature. But with the change introduced by
commit 7c3ded573514 ("KVM: PPC: Book3S HV Nested: Stop forwarding all HFUs
to L1") the L1 ends up getting a H_EMUL_ASSIST because of which, the L1
ends up injecting a SIGILL when L2 (nested guest) tries to use doorbells.
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241109063301.105289-2-gautam@linux.ibm.com
|
|
These offsets are not used anymore, delete them.
Fixes: c39b1dcf055d ("powerpc/vdso: Add a page for non-time data")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241113-vdso-powerpc-asm-offsets-v1-1-3f7e589f090d@linutronix.de
|
|
- Drop obsolete references to PPC970 KVM, which was removed 10 years ago.
- Fix incorrect references to non-existing ioctls
- List registers supported by KVM_GET/SET_ONE_REG on s390
- Use rST internal links
- Reorganize the introduction to the API document
|
|
spu_priv1_beat_ops were removed in commit bf4981a00636 ("powerpc: Remove
the celleb support"), remove the unneeded extern.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241112114805.453894-1-mpe@ellerman.id.au
|
|
Remove the use of contractions and use proper punctuation in #error
directive messages that discourage the direct inclusion of header files.
Link: https://lkml.kernel.org/r/20241105032231.28833-1-natanielfarzan@gmail.com
Signed-off-by: Nataniel Farzan <natanielfarzan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
... if they are identical to fallbacks, just leave them alone.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
On a system with n CPUs and m interrupts, there will be n*m decimal
values yielded via seq_printf(.."%10u "..) which is less efficient
than seq_put_decimal_ull_width(), stress reading /proc/interrupts
indicates ~30% performance improvement with this patch.
Signed-off-by: David Wang <00107082@163.com>
[mpe: Flesh out change log based on original submission]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/all/20241103080552.4787-1-00107082@163.com
Link: https://patch.msgid.link/20241108162327.9887-1-00107082@163.com
|
|
As per the kernel documentation[1], hardlockup detector should
be disabled in KVM guests as it may give false positives. On
PPC, hardlockup detector is enabled inside KVM guests because
disable_hardlockup_detector() is marked as early_initcall and it
relies on kvm_guest static key (is_kvm_guest()) which is initialized
later during boot by check_kvm_guest(), which is a core_initcall.
check_kvm_guest() is also called in pSeries_smp_probe(), which is called
before initcalls, but it is skipped if KVM guest does not have doorbell
support or if the guest is launched with SMT=1.
Call check_kvm_guest() in disable_hardlockup_detector() so that
is_kvm_guest() check goes through fine and hardlockup detector can be
disabled inside the KVM guest.
[1]: Documentation/admin-guide/sysctl/kernel.rst
Fixes: 633c8e9800f3 ("powerpc/pseries: Enable hardlockup watchdog for PowerVM partitions")
Cc: stable@vger.kernel.org # v5.14+
Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241108094839.33084-1-gautam@linux.ibm.com
|
|
The param area is a memory region where the kernel places additional
command-line arguments for fadump kernel. Currently, the param memory
area is reserved in fadump kernel if it is above boot_mem_top. However,
it should be reserved if it is below boot_mem_top because the fadump
kernel already reserves memory from boot_mem_top to the end of DRAM.
Currently, there is no impact from not reserving param memory if it is
below boot_mem_top, as it is not used after the early boot phase of the
fadump kernel. However, if this changes in the future, it could lead to
issues in the fadump kernel.
Fixes: 3416c9daa6b1 ("powerpc/fadump: pass additional parameters when fadump is active")
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241107055817.489795-2-sourabhjain@linux.ibm.com
|
|
Memory for passing additional parameters to fadump capture kernel
is allocated during subsys_initcall level, using memblock. But
as slab is already available by this time, allocation happens via
the buddy allocator. This may work for radix MMU but is likely to
fail in most cases for hash MMU as hash MMU needs this memory in
the first memory block for it to be accessible in real mode in the
capture kernel (second boot). So, allocate memory for additional
parameters area as soon as MMU mode is obvious.
Fixes: 683eab94da75 ("powerpc/fadump: setup additional parameters for dump capture kernel")
Reported-by: Venkat Rao Bagalkote <venkat88@linux.vnet.ibm.com>
Closes: https://lore.kernel.org/lkml/a70e4064-a040-447b-8556-1fd02f19383d@linux.vnet.ibm.com/T/#u
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241107055817.489795-1-sourabhjain@linux.ibm.com
|
|
Booting a KASAN=y kernel with the recently added ftrace out-of-line
support causes a warning at boot:
------------[ cut here ]------------
Stub index overflow (1729 > 1728)
WARNING: CPU: 0 PID: 0 at arch/powerpc/kernel/trace/ftrace.c:209 ftrace_init_nop+0x408/0x444
...
NIP ftrace_init_nop+0x408/0x444
LR ftrace_init_nop+0x404/0x444
Call Trace:
ftrace_init_nop+0x404/0x444 (unreliable)
ftrace_process_locs+0x544/0x8a0
ftrace_init+0xb4/0x22c
start_kernel+0x1dc/0x4d4
start_here_common+0x1c/0x20
...
ftrace failed to modify
[<c0000000030beddc>] _sub_I_65535_1+0x8/0x3c
actual: 00:00:00:60
Initializing ftrace call sites
ftrace record flags: 0
(0)
expected tramp: c00000000008b418
------------[ cut here ]------------
The function in question, _sub_I_65535_1 is some sort of trampoline
generated for KASAN, and is in the .text.startup section. That section
is part of INIT_TEXT, meaning is_kernel_inittext() returns true for it.
But the script that determines how many out-of-line ftrace stubs are
needed isn't doesn't consider .text.startup as inittext, leading to
there not being enough space for the init stubs.
Conversely the logic to calculate how many stubs are needed for the text
section isn't filtering out the symbols in .text.startup and so ends up
over counting.
Fix both problems by calculating the total number of stubs first, then
the number that count as inittext, and then subtract the latter from the
former to get the count for the text section.
Fixes: eec37961a56a ("powerpc64/ftrace: Move ftrace sequence out of line")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241107111630.31068-1-mpe@ellerman.id.au
|
|
KVM/riscv changes for 6.13
- Accelerate KVM RISC-V when running as a guest
- Perf support to collect KVM guest statistics from host side
|
|
This was only needed for PPC970 support, which is long gone: the
implementation was removed in 2014.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20241023124507.280382-2-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Simplify the cell_iommu_get_fixed_address() dma-ranges parsing to use
the for_each_of_range() iterator.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241106212647.341857-1-robh@kernel.org
|
|
Simplify the ppc44x PCI dma-ranges parsing to use the
for_each_of_range() iterator.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241106212640.341677-1-robh@kernel.org
|
|
Several architectures support text patching, but they name the header
files that declare patching functions differently.
Make all such headers consistently named text-patching.h and add an empty
header in asm-generic for architectures that do not support text patching.
Link: https://lkml.kernel.org/r/20241023162711.2579610-4-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Tested-by: kdevops <kdevops@lists.linux.dev>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Brian Cain <bcain@quicinc.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Song Liu <song@kernel.org>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Hugetlb mappings are now handled through normal channels just like any
other mapping, so we no longer need hugetlb_get_unmapped_area* specific
functions.
Link: https://lkml.kernel.org/r/20241007075037.267650-8-osalvador@suse.de
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Peter Xu <peterx@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
hugetlb mappings
We want to stop special casing hugetlb mappings and make them go through
generic channels, so teach arch_get_unmapped_area{_topdown} to handle
those.
Reshuffle file_to_psize() definition so arch_get_unmapped_area{_topdown}
can make use of it.
Link: https://lkml.kernel.org/r/20241007075037.267650-6-osalvador@suse.de
Signed-off-by: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Peter Xu <peterx@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
ps3_setup_uhc_device() is only called from ps3_setup_ehci_device() and
ps3_setup_ohci_device(), which are both marked __init. Hence replace
the former's __ref marker by __init.
Note that before commit bd721ea73e1f9655 ("treewide: replace obsolete
_refok by __ref"), the function was marked __init_refok, which probably
should have been __init in the first place.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/31fe9435056fcfbf82c3a01693be278d5ce4ad0f.1730899557.git.geert+renesas@glider.be
|
|
Add the four syscalls setxattrat(), getxattrat(), listxattrat() and
removexattrat(). Those can be used to operate on extended attributes,
especially security related ones, either relative to a pinned directory
or on a file descriptor without read access, avoiding a
/proc/<pid>/fd/<fd> detour, requiring a mounted procfs.
One use case will be setfiles(8) setting SELinux file contexts
("security.selinux") without race conditions and without a file
descriptor opened with read access requiring SELinux read permission.
Use the do_{name}at() pattern from fs/open.c.
Pass the value of the extended attribute, its length, and for
setxattrat(2) the command (XATTR_CREATE or XATTR_REPLACE) via an added
struct xattr_args to not exceed six syscall arguments and not
merging the AT_* and XATTR_* flags.
[AV: fixes by Christian Brauner folded in, the entire thing rebased on
top of {filename,file}_...xattr() primitives, treatment of empty
pathnames regularized. As the result, AT_EMPTY_PATH+NULL handling
is cheap, so f...(2) can use it]
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Link: https://lore.kernel.org/r/20240426162042.191916-1-cgoettsche@seltendoof.de
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
CC: x86@kernel.org
CC: linux-alpha@vger.kernel.org
CC: linux-kernel@vger.kernel.org
CC: linux-arm-kernel@lists.infradead.org
CC: linux-ia64@vger.kernel.org
CC: linux-m68k@lists.linux-m68k.org
CC: linux-mips@vger.kernel.org
CC: linux-parisc@vger.kernel.org
CC: linuxppc-dev@lists.ozlabs.org
CC: linux-s390@vger.kernel.org
CC: linux-sh@vger.kernel.org
CC: sparclinux@vger.kernel.org
CC: linux-fsdevel@vger.kernel.org
CC: audit@vger.kernel.org
CC: linux-arch@vger.kernel.org
CC: linux-api@vger.kernel.org
CC: linux-security-module@vger.kernel.org
CC: selinux@vger.kernel.org
[brauner: slight tweaks]
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
After the following powerpc commits, all calls to set_memory_...()
functions check returned value.
- Commit 8f17bd2f4196 ("powerpc: Handle error in mark_rodata_ro() and
mark_initmem_nx()")
- Commit f7f18e30b468 ("powerpc/kprobes: Handle error returned by
set_memory_rox()")
- Commit 009cf11d4aab ("powerpc: Don't ignore errors from
set_memory_{n}p() in __kernel_map_pages()")
- Commit 9cbacb834b4a ("powerpc: Don't ignore errors from
set_memory_{n}p() in __kernel_map_pages()")
- Commit 78cb0945f714 ("powerpc: Handle error in mark_rodata_ro() and
mark_initmem_nx()")
All calls in core parts of the kernel also always check returned value,
can be looked at with following query:
$ git grep -w -e set_memory_ro -e set_memory_rw -e set_memory_x -e set_memory_nx -e set_memory_rox `find . -maxdepth 1 -type d | grep -v arch | grep /`
It is now possible to flag those functions with __must_check to make
sure no new unchecked call it added.
Link: https://github.com/KSPP/linux/issues/7
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/775dae48064a661554802ed24ed5bdffe1784724.1725723351.git.christophe.leroy@csgroup.eu
|
|
spurious interrupts
Running a L2 vCPU (see [1] for terminology) with LPCR_MER bit set and no
pending interrupts results in that L2 vCPU getting an infinite flood of
spurious interrupts. The 'if check' in kvmhv_run_single_vcpu() sets the
LPCR_MER bit if there are pending interrupts.
The spurious flood problem can be observed in 2 cases:
1. Crashing the guest while interrupt heavy workload is running
a. Start a L2 guest and run an interrupt heavy workload (eg: ipistorm)
b. While the workload is running, crash the guest (make sure kdump
is configured)
c. Any one of the vCPUs of the guest will start getting an infinite
flood of spurious interrupts.
2. Running LTP stress tests in multiple guests at the same time
a. Start 4 L2 guests.
b. Start running LTP stress tests on all 4 guests at same time.
c. In some time, any one/more of the vCPUs of any of the guests will
start getting an infinite flood of spurious interrupts.
The root cause of both the above issues is the same:
1. A NMI is sent to a running vCPU that has LPCR_MER bit set.
2. In the NMI path, all registers are refreshed, i.e, H_GUEST_GET_STATE
is called for all the registers.
3. When H_GUEST_GET_STATE is called for LPCR, the vcpu->arch.vcore->lpcr
of that vCPU at L1 level gets updated with LPCR_MER set to 1, and this
new value is always used whenever that vCPU runs, regardless of whether
there was a pending interrupt.
4. Since LPCR_MER is set, the vCPU in L2 always jumps to the external
interrupt handler, and this cycle never ends.
Fix the spurious flood by masking off the LPCR_MER bit before running a
L2 vCPU to ensure that it is not set if there are no pending interrupts.
[1] Terminology:
1. L0 : PAPR hypervisor running in HV mode
2. L1 : Linux guest (logical partition) running on top of L0
3. L2 : KVM guest running on top of L1
Fixes: ec0f6639fa88 ("KVM: PPC: Book3S HV nestedv2: Ensure LPCR_MER bit is passed to the L0")
Cc: stable@vger.kernel.org # v6.8+
Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
|
|
In assert_pte_locked(), we just get the ptl and assert if it was already
held, so convert it to using pte_offset_map_ro_nolock().
Link: https://lkml.kernel.org/r/42559e042eb6fc3129a40f710d671712030646b4.1727332572.git.zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Muchun Song <muchun.song@linux.dev>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The Power11 architected and raw mode support in Linux was merged in commit
c2ed087ed35c ("powerpc: Add Power11 architected and raw mode"), and the
corresponding support in QEMU is pending in [1], which is currently in
its V6.
Currently, booting a KVM guest inside a pseries LPAR (Logical Partition)
on a kernel without P11 support results the guest boot in a Power10
compatibility mode (i.e., with logical PVR of Power10). However, booting
a KVM guest on a kernel with P11 support causes the following boot crash.
On a Power11 LPAR, the Power Hypervisor (L0) returns a support for both
Power10 and Power11 capabilities through H_GUEST_GET_CAPABILITIES hcall.
However, KVM currently supports only Power10 capabilities, resulting in
only Power10 capabilities being set as "nested capabilities" via an
H_GUEST_SET_CAPABILITIES hcall.
In the guest entry path, gs_msg_ops_kvmhv_nestedv2_config_fill_info() is
called by kvmhv_nestedv2_flush_vcpu() to fill the GSB (Guest State
Buffer) elements. The arch_compat is set to the logical PVR of Power11,
followed by an H_GUEST_SET_STATE hcall. This hcall returns
H_INVALID_ELEMENT_VALUE as a return code when setting a Power11 logical
PVR, as only Power10 capabilities were communicated as supported between
PHYP and KVM, utimately resulting in the KVM guest boot crash.
KVM: unknown exit, hardware reason ffffffffffffffea
NIP 000000007daf97e0 LR 000000007daf1aec CTR 000000007daf1ab4 XER 0000000020040000 CPU#0
MSR 8000000000103000 HID0 0000000000000000 HF 6c002000 iidx 3 didx 3
TB 00000000 00000000 DECR 0
GPR00 8000000000003000 000000007e580e20 000000007db26700 0000000000000000
GPR04 00000000041a0c80 000000007df7f000 0000000000200000 000000007df7f000
GPR08 000000007db6d5d8 000000007e65fa90 000000007db6d5d0 0000000000003000
GPR12 8000000000000001 0000000000000000 0000000000000000 0000000000000000
GPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20 0000000000000000 0000000000000000 0000000000000000 000000007db21a30
GPR24 000000007db65000 0000000000000000 0000000000000000 0000000000000003
GPR28 000000007db6d5e0 000000007db22220 000000007daf27ac 000000007db75000
CR 20000404 [ E - - - - G - G ] RES 000@ffffffffffffffff
SRR0 000000007daf97e0 SRR1 8000000000102000 PVR 0000000000820200 VRSAVE 0000000000000000
SPRG0 0000000000000000 SPRG1 000000000000ff20 SPRG2 0000000000000000 SPRG3 0000000000000000
SPRG4 0000000000000000 SPRG5 0000000000000000 SPRG6 0000000000000000 SPRG7 0000000000000000
CFAR 0000000000000000
LPCR 0000000000020400
PTCR 0000000000000000 DAR 0000000000000000 DSISR 0000000000000000
Fix this by adding the Power11 capability support and the required
plumbing in place.
Note:
* Booting a Power11 KVM nested PAPR guest requires [1] in QEMU.
[1] https://lore.kernel.org/all/20240731055022.696051-1-adityag@linux.ibm.com/
Signed-off-by: Amit Machhiwal <amachhiw@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241028101622.741573-1-amachhiw@linux.ibm.com
|
|
Remove hard-coded strings by using the str_enabled_disabled() helper
function.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241027222219.1173-2-thorsten.blum@linux.dev
|
|
The start_opd/end_opd members of struct mod_arch_specific are only
needed for kernels built using ELF ABI v1. Guard them with an ifdef to
save a little bit of space on ELF ABI v2 kernels.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20240812063312.730496-1-mpe@ellerman.id.au
|
|
sysfs_emit() helper function should be used when formatting the value
to be returned to user space.
This patch replaces open-coded sysfs_emit() in sysfs .show() callbacks
Link: https://github.com/KSPP/linux/issues/105
Signed-off-by: Paulo Miguel Almeida <paulo.miguel.almeida.rodenas@gmail.com>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/ZxMV3YvSulJFZ8rk@mail.google.com
|
|
Under certain conditions, the 64-bit '-mstack-protector-guard' flags may
end up in the 32-bit vDSO flags, resulting in build failures due to the
structure of clang's argument parsing of the stack protector options,
which validates the arguments of the stack protector guard flags
unconditionally in the frontend, choking on the 64-bit values when
targeting 32-bit:
clang: error: invalid value 'r13' in 'mstack-protector-guard-reg=', expected one of: r2
clang: error: invalid value 'r13' in 'mstack-protector-guard-reg=', expected one of: r2
make[3]: *** [arch/powerpc/kernel/vdso/Makefile:85: arch/powerpc/kernel/vdso/vgettimeofday-32.o] Error 1
make[3]: *** [arch/powerpc/kernel/vdso/Makefile:87: arch/powerpc/kernel/vdso/vgetrandom-32.o] Error 1
Remove these flags by adding them to the CC32FLAGSREMOVE variable, which
already handles situations similar to this. Additionally, reformat and
align a comment better for the expanding CONFIG_CC_IS_CLANG block.
Cc: stable@vger.kernel.org # v6.1+
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://patch.msgid.link/20241030-powerpc-vdso-drop-stackp-flags-clang-v1-1-d95e7376d29c@kernel.org
|
|
all failure exits prior to fdget() are returns, fdput() is immediately
followed by return.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
fdget() is the first thing done in scope, all matching fdput() are
immediately followed by leaving the scope.
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
The systemcfg data has nothing to do anymore with the vdso.
Split it into a dedicated header file.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241010-vdso-generic-base-v1-27-b64f0842d512@linutronix.de
|
|
The systemcfg data only has minimal overlap with the vdso data.
Splitting the two avoids mapping the implementation-defined vdso data
into /proc/ppc64/systemcfg.
It is also a preparation for the standardization of vdso data storage.
The only field actually used by both systemcfg and vdso is
tb_ticks_per_sec and it is only changed once during time_init().
Initialize it in both structures there.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241010-vdso-generic-base-v1-26-b64f0842d512@linutronix.de
|
|
The systemcfg page through procfs is only a backwards-compatible
interface for very old applications.
Make it possible to be disabled.
This also creates a convenient config #define to guard any accesses to
the systemcfg page.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241010-vdso-generic-base-v1-25-b64f0842d512@linutronix.de
|
|
The systemcfg processorCount variable tracks currently online variables,
not possible ones, so the stored value is wrong.
The code preferably tries to use the ibm,lrdr-capacity field 4 which
"represents the maximum number of processors that the guest can have."
Switch from processorCount to the better matching num_possible_cpus().
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241010-vdso-generic-base-v1-24-b64f0842d512@linutronix.de
|
|
When printing the information "system_active_processors", the variable
partition_potential_processors is used instead of
partition_active_processors. The wrong value is displayed.
Use partition_active_processors instead.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241010-vdso-generic-base-v1-23-b64f0842d512@linutronix.de
|
|
If the operation fails and userspace is unaware it will access unmapped
memory, crashing the process.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241010-vdso-generic-base-v1-22-b64f0842d512@linutronix.de
|
|
This offset was copy-pasted from the systemcfg structure.
It has no meaning for the 32bit VDSO.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20241010-vdso-generic-base-v1-21-b64f0842d512@linutronix.de
|