|
Signed-off-by: Russell King <rmk@armlinux.org.uk>
|
|
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
Ensure that the kernel size tag is an appropriate size before using
the information contained within it.
Signed-off-by: Russell King <rmk@armlinux.org.uk>
|
|
Since the Linux kernel has dropped support for the Simple Firmware Interface
(SFI), the only way to boot newer kernel versions on the Intel MID platform is
to use a devicetree.
Signed-off-by: Julian Winkler <julian.winkler1@web.de>
Signed-off-by: Simon Horman <horms@kernel.org>
|
|
Add an option to copy the command line arguments from the running kernel
to the kexec'd kernel. This option works for both kexec and kdump.
If --append=<args> or --command-line=<args> is provided along with
--reuse-cmdline, the args given to --append or --command-line are
combined with the command line of the running kernel.
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Signed-off-by: Simon Horman <horms@kernel.org>
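The combination described in the commit above can be sketched as follows. This is a minimal illustration, not the kexec-tools implementation: join_cmdline() is a hypothetical helper, and the order (reused command line first, extra arguments second) is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not a kexec-tools function): join the reused
 * command line and the extra arguments with a single space.
 * Either argument may be NULL or empty. The caller frees the result.
 */
static char *join_cmdline(const char *reused, const char *extra)
{
    size_t rlen = reused ? strlen(reused) : 0;
    size_t elen = extra ? strlen(extra) : 0;
    char *out = malloc(rlen + elen + 2);    /* separator + NUL */

    if (!out)
        return NULL;
    out[0] = '\0';
    if (rlen)
        strcat(out, reused);
    if (rlen && elen)
        strcat(out, " ");
    if (elen)
        strcat(out, extra);
    return out;
}
```

Called with the contents of /proc/cmdline and the --append string, this yields one command line containing both.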
|
|
In order to pass fresh entropy to kexec'd kernels, use BI_RNG_SEED
for passing a seed, with the same semantics that kexec-tools currently
uses for i386's setup_data.
Link: https://git.kernel.org/torvalds/c/dc63a086daee92c63e3
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Simon Horman <horms@kernel.org>
|
|
On LoongArch, using the --reuse-cmdline option to reuse the current
command line may lead to redundant arguments (such as the kexec and initrd
command line arguments). To avoid any impact of initrd removal on other
architectures, remove_parameter is called in architecture-specific code.
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Simon Horman <horms@kernel.org>
|
|
The LoongArch kernel will mainly use the vmlinux.efi image in PE format,
so add support for it.
I tested this on a LoongArch 3A5000 machine and it works as expected:
kexec:
$ sudo kexec -l /boot/vmlinux.efi --reuse-cmdline
$ sudo kexec -e
kdump:
$ sudo kexec -p /boot/vmlinux-kdump.efi --reuse-cmdline --append="nr_cpus=1"
# echo c > /proc/sysrq-trigger
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Simon Horman <horms@kernel.org>
|
|
Add 64-bit support for the LoongArch architecture. For the time
being, the quick restart function (kexec) is supported; that is, the "kexec -l"
and "kexec -e" commands can be used normally.
The crash dump function is also supported: the "kexec -p" operation
can be performed successfully, and the vmcore file can be generated.
I tested this on a LoongArch 3A5000 machine and it works as expected:
kexec:
$ sudo kexec -l /boot/vmlinux --reuse-cmdline
$ sudo kexec -e
kdump:
$ sudo kexec -p /boot/vmlinux-kdump --reuse-cmdline --append="nr_cpus=1"
# echo c > /proc/sysrq-trigger
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Simon Horman <horms@kernel.org>
|
|
Restricting the kexec tool to allocating holes for kexec segments below 768MB
may no longer be relevant, since the first memory block can be 1024MB or
larger.
Removing the rma_top restriction gives more space to find holes for
kexec segments. The existing in-place checks make sure that kexec segment
allocation doesn't cross the first memory block, because every kexec segment
has to be within the first memory block for the kdump kernel to boot properly.
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Simon Horman <horms@kernel.org>
|
|
Many archs carry duplicate ultoa() definitions. Remove them,
and redefine ultoa() in kexec/kexec.h to make it more readable.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Simon Horman <horms@kernel.org>
|
|
Linux ≥5.20 expects an RNG seed via setup_data as of the upstream commit
in the link below. That commit adjusts kexec_file_load to pass
SETUP_RNG_SEED. kexec-tools should follow suit, so add more or less the
same code here.
Link: https://git.kernel.org/tip/tip/c/68b8e9713c8
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Simon Horman <horms@kernel.org>
|
|
On the Loongson platform, running the commands:
kexec -l vmlinux... --append="root=UUID=28e1..." --initrd=...
kexec -e
makes the quick restart fail like this:
********************************************************************
[ 3.420791] VFS: Cannot open root device "UUID=6462a8a4-02fb-49..."
[ 3.431262] Please append a correct "root=" boot option; ...
...
...
...
[ 3.543175] 0801 4194304 sda1 554e69cc-01
[ 3.543175]
[ 3.549494] 0802 62914560 sda2 554e69cc-02
[ 3.549495]
[ 3.555818] 0803 8388608 sda3 554e69cc-03
[ 3.555819]
[ 3.562139] 0804 174553229 sda4 554e69cc-04
[ 3.562139]
[ 3.568463] 0b00 1048575 sr0
[ 3.568464] driver: sr
[ 3.574524] Kernel panic - not syncing: VFS: Unable to mount root fs...
[ 3.582750] ---[ end Kernel panic - not syncing: VFS:...
*******************************************************************
The kernel itself cannot parse the UUID; the UUID is parsed in the initrd.
For compatibility with previous platforms, the Loongson kernel obtains
the initrd parameters from the command line, which the kernel supports.
On the MIPS architecture, however, kexec-tools passes the initrd through
the DTB.
Make the following modifications:
(1) in kexec/arch/mips/kexec-elf-mips.c
Add patch_initrd_info(), which distinguishes CPUs at runtime and,
only for Loongson CPUs, adds the initrd parameters to the cmdline.
(2) in kexec/arch/mips/crashdump-mips.c
Because Loongson uses a different page_offset, adjust it
to ensure that crashdump functionality is correct and reliable.
(3) in kexec/arch/mips/crashdump-mips.h
Add a platform-specific page_offset macro definition.
Signed-off-by: Hui Li <lihui@loongson.cn>
Signed-off-by: Simon Horman <horms@kernel.org>
|
|
On ARM64-based VMs, hotplugging more than 31GB of memory causes
kdump to fail to load because it hits the CRASH_MAX_MEMORY_RANGES
limit, which is currently 32 on ARM64, given that the memory block size
is 1GB. This patch raises CRASH_MAX_MEMORY_RANGES
to 32K, similar to what we have on x86. This should allow
kdump to work until the VM has 32TB, which should be
enough for a long time.
Signed-off-by: Hazem Mohamed Abuelfotoh <abuehaze@amazon.com>
Acked-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Simon Horman <horms@kernel.org>
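The arithmetic behind the 32GB and 32TB figures above can be checked with a small sketch. It assumes the worst case in which every 1GB memory block ends up as its own crash memory range; max_covered_bytes() is a hypothetical name used only for illustration.

```c
#include <stdint.h>

/*
 * Worst case assumed here: each memory block is a separate crash memory
 * range, so max_ranges ranges cover at most max_ranges * block_size bytes.
 */
static uint64_t max_covered_bytes(uint64_t max_ranges, uint64_t block_size)
{
    return max_ranges * block_size;
}
```

With a 1GB block size, 32 ranges cover 32GB, while 32K ranges cover 32TB.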
|
|
For 'static data relocations', the value should be assigned directly
instead of being patched into an instruction (OR ops).
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
elf_info.page_offset is 'unsigned long long', while get_page_offset()
has the input param as a type of 'unsigned long *'. It demands explicit
type casting to mute the compiler warning.
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Build kexec-tools with clang (clang version 13.0.1 (Fedora 13.0.1-1.fc36)).
Then, when kexec loads a kernel, it fails with the error message
"machine_apply_elf_rel: ERROR Unknown type: 264".
This is caused by the following reloc types in purgatory/purgatory.ro,
which are not supported yet:
R_AARCH64_MOVW_UABS_G0_NC
R_AARCH64_MOVW_UABS_G1_NC
R_AARCH64_MOVW_UABS_G2_NC
R_AARCH64_MOVW_UABS_G3
Add code to support these relocs, so kexec can work smoothly.
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
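As a rough illustration of what handling these four relocation types involves, the sketch below patches the 16-bit immediate of a MOVZ/MOVK instruction. It relies on the AArch64 encoding, where imm16 occupies bits [20:5]. patch_movw_uabs() is a hypothetical helper, not the code added by the commit, and it performs no overflow checks.

```c
#include <stdint.h>

/*
 * Hypothetical helper: insert bits [16*group+15 : 16*group] of value
 * into the imm16 field (bits [20:5]) of a MOVZ/MOVK instruction.
 * group is 0..3, matching the G0..G3 relocation variants.
 */
static uint32_t patch_movw_uabs(uint32_t insn, uint64_t value, unsigned group)
{
    uint32_t imm16 = (uint32_t)((value >> (16 * group)) & 0xffff);

    insn &= ~(UINT32_C(0xffff) << 5);   /* clear the old immediate */
    return insn | (imm16 << 5);
}
```

Applying it with group 0 to 3 on a MOVZ followed by three MOVK instructions loads a full 64-bit absolute address into a register.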
|
|
More and more reloc types need to be supported on aarch64. Use an enum to
organize them and shorten the #ifdef macro list.
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
GCC 12 has some changes which affect the generated AArch64 code of kexec-tools.
As a result, machine_apply_elf_rel() on AArch64 encounters a new rel type,
R_AARCH64_LDST128_ABS_LO12_NC. This makes loading the kernel fail
with the message "machine_apply_elf_rel: ERROR Unknown type: 299".
Citing from objdump -rDSl purgatory/purgatory.ro:
0000000000000f80 <sha256_starts>:
sha256_starts():
f80: 90000001 adrp x1, 0 <verify_sha256_digest>
f80: R_AARCH64_ADR_PREL_PG_HI21 .text+0xfa0
f84: a9007c1f stp xzr, xzr, [x0]
f88: 3dc00021 ldr q1, [x1]
f88: R_AARCH64_LDST128_ABS_LO12_NC .text+0xfa0
f8c: 90000001 adrp x1, 0 <verify_sha256_digest>
f8c: R_AARCH64_ADR_PREL_PG_HI21 .text+0xfb0
f90: 3dc00020 ldr q0, [x1]
f90: R_AARCH64_LDST128_ABS_LO12_NC .text+0xfb0
f94: ad008001 stp q1, q0, [x0, #16]
f98: d65f03c0 ret
f9c: d503201f nop
fa0: 6a09e667 .inst 0x6a09e667 ; undefined
fa4: bb67ae85 .inst 0xbb67ae85 ; undefined
fa8: 3c6ef372 .inst 0x3c6ef372 ; undefined
fac: a54ff53a ld3w {z26.s-z28.s}, p5/z, [x9, #-3, mul vl]
fb0: 510e527f sub wsp, w19, #0x394
fb4: 9b05688c madd x12, x4, x5, x26
fb8: 1f83d9ab .inst 0x1f83d9ab ; undefined
fbc: 5be0cd19 .inst 0x5be0cd19 ; undefined
Here, gcc generates code that carries out loads and stores using
the 128-bit floating-point registers, so the new rel type
R_AARCH64_LDST128_ABS_LO12_NC has to be handled.
Make machine_apply_elf_rel() cope with this new reloc, so kexec-tools
can work smoothly.
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
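A minimal sketch of how such a relocation can be applied is shown below. It relies on the AArch64 encoding of the unsigned-offset LDR/STR forms, where imm12 occupies bits [21:10] and is scaled by the access size (16 bytes for a Q register). patch_ldst128_lo12() is a hypothetical helper, not the function added by the commit.

```c
#include <stdint.h>

/*
 * Hypothetical helper for R_AARCH64_LDST128_ABS_LO12_NC: take the low
 * 12 bits of the target address, scale them by the 16-byte access size,
 * and place the result in the imm12 field (bits [21:10]).
 */
static uint32_t patch_ldst128_lo12(uint32_t insn, uint64_t target)
{
    uint32_t imm12 = (uint32_t)((target & 0xfff) >> 4);

    insn &= ~(UINT32_C(0xfff) << 10);   /* clear the old offset */
    return insn | (imm12 << 10);
}
```

For the "ldr q1, [x1]" instruction at f88 in the dump above (0x3dc00021) and a target whose low 12 bits are 0xfa0, the patched word would be 0x3dc3e821.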
|
|
Use concat_cmdline() to concatenate the --append string and
the --reuse-cmdline string, otherwise only one of the two
options is valid.
This is similar to commit 8b42c99aa3bc ("Fix --reuse-cmdline
so it is usable.").
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Use dbgprintf() to print command_line, initrd and dtb
in arch_process_options() for debugging.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Since kernel commit 14c127c957c1 ('arm64: mm: Flip kernel VA space'),
the memory layout on arm64 has changed, and kexec-tools can no longer
get the right PAGE_OFFSET based on the _text symbol.
Prior to that, the kimage (_text) lies below PAGE_OFFSET with this layout:
0 -> VA_START : Userspace
VA_START -> VA_START + 256M : BPF JIT, Modules
VA_START + 256M -> PAGE_OFFSET - (~GB misc) : Vmalloc (KERNEL _text HERE)
PAGE_OFFSET -> ... : * Linear map *
And here we have:
VA_START = -1UL << VA_BITS
PAGE_OFFSET = -1UL << (VA_BITS - 1)
_text < -1UL << (VA_BITS - 1)
The kernel image lies somewhere between VA_START and PAGE_OFFSET, so we just
calculate VA_BITS by getting the highest unset bit of the _text symbol address,
and shift by one bit less than VA_BITS to get the page offset. This works as long
as KASLR doesn't put the kernel in too high a location (which is commented inline).
After that commit, the kernel layout has changed:
0 -> PAGE_OFFSET : Userspace
PAGE_OFFSET -> PAGE_END : * Linear map *
PAGE_END -> PAGE_END + 128M : bpf jit region
PAGE_END + 128M -> PAGE_END + 256M : modules
PAGE_END + 256M -> ... : vmalloc (KERNEL _text HERE)
Here we have:
PAGE_OFFSET = -1UL << VA_BITS
PAGE_END = -1UL << (VA_BITS - 1)
_text > -1UL << (VA_BITS - 1)
The kernel image now lies above PAGE_END, so we have to shift one more bit to
get VA_BITS, and shift by exactly VA_BITS for PAGE_OFFSET.
We can simply check whether "_text > -1UL << (VA_BITS - 1)" is true to judge
which layout is being used and shift the page offset accordingly.
Signed-off-by: Kairui Song <kasong@tencent.com>
(rebased and stripped by Pingfan )
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
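The layout check described in the commit above can be sketched as follows. This is a simplified illustration that assumes VA_BITS is already known (for example from vmcoreinfo) and lies between 1 and 63; arm64_page_offset() is a hypothetical name, not the kexec-tools function.

```c
#include <stdint.h>

/*
 * Hypothetical helper: pick PAGE_OFFSET for the old or the flipped
 * arm64 layout by comparing _text with -1UL << (VA_BITS - 1).
 */
static uint64_t arm64_page_offset(uint64_t text_addr, unsigned va_bits)
{
    uint64_t va_mid = UINT64_MAX << (va_bits - 1);

    if (text_addr > va_mid)
        return UINT64_MAX << va_bits;   /* flipped layout: linear map is low */
    return va_mid;                      /* old layout: linear map is high */
}
```

With VA_BITS = 48, a _text address of 0xffff800010080000 selects the flipped layout and gives 0xffff000000000000, while 0xffff000008080000 selects the old layout and gives 0xffff800000000000.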
|
|
phys_to_virt() calculates a virtual address. As an important factor,
page_offset is expected to be accurate.
Since the arm64 kernel exposes va_bits through vmcore, use it.
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
There are two funcs to get page_offset:
get_kernel_page_offset()
get_page_offset()
Since get_kernel_page_offset() does not observe the kernel formula,
remove it. Unify them in order to introduce 52-bit VA kernel support more
easily in the coming patch.
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
After kernel commit 7bc1a0f9e176 ("arm64: mm: use single quantity to
represent the PA to VA translation"), phys_offset can be negative when
running a 52-bit kernel on 48-bit hardware.
So change phys_offset from unsigned to signed.
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
--reuse-cmdline reads the command line of the currently
running kernel from /proc/cmdline and uses that for the
kernel that should be kexec'd.
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
KEXEC_ALL_OPTIONS can be used instead of defining the same
array several times. This makes the code easier to maintain when
new options are added.
Suggested-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Newer s390 kernels support a command line size longer than 896
bytes. Such kernels contain a new member in the parameter area,
which might be utilized by tools like kexec. Older kernels have
the location initialized to zero, so we check whether there's a
non-zero number present and use that. If there isn't, we fall back
to the legacy command line size of 896 bytes.
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
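The fallback logic described above amounts to the following sketch. The function and macro names are hypothetical; only the 896-byte legacy size and the "zero means not present" rule come from the commit message.

```c
#include <stdint.h>

#define LEGACY_COMMAND_LINE_SIZE 896    /* legacy s390 limit in bytes */

/*
 * Hypothetical helper: 'advertised' is the value read from the new
 * member of the parameter area. Older kernels leave it at zero.
 */
static uint64_t s390_cmdline_size(uint64_t advertised)
{
    return advertised ? advertised : LEGACY_COMMAND_LINE_SIZE;
}
```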
|
|
When crashkernel is reserved above 4G in memory, the kernel should
reserve some amount of low memory for swiotlb and some DMA buffers.
So there may be two crash kernel regions: one below 4G, the other
above 4G.
Currently, there is only one crash kernel region on arm64, and we pass the
"linux,usable-memory-range = <BASE SIZE>" property to the crash dump
kernel.
Now, we pass "linux,usable-memory-range = <BASE1 SIZE1 BASE2 SIZE2>"
to the crash dump kernel to support two crash kernel regions and to load the
crash kernel high. Make the low memory region the second range "BASE2 SIZE2"
to keep compatibility with existing user space and older kdump kernels.
Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Co-developed-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
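The ordering of the two ranges can be illustrated with a small encoder. This is a sketch under stated assumptions: each BASE and SIZE is stored as one 64-bit big-endian value (that is, 2 address cells and 2 size cells), and the struct and function names are hypothetical rather than taken from kexec-tools.

```c
#include <stddef.h>
#include <stdint.h>

struct mem_region {
    uint64_t base;
    uint64_t size;
};

/* Store v as 8 big-endian bytes, as devicetree property data expects. */
static void put_be64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++)
        p[i] = (uint8_t)(v >> (56 - 8 * i));
}

/*
 * Hypothetical encoder for <BASE1 SIZE1 BASE2 SIZE2>: the high region
 * goes first and the low region second. 'out' needs room for 32 bytes.
 * Returns the number of bytes written.
 */
static size_t encode_usable_memory_range(uint8_t *out,
                                         struct mem_region high,
                                         struct mem_region low)
{
    put_be64(out, high.base);
    put_be64(out + 8, high.size);
    put_be64(out + 16, low.base);
    put_be64(out + 24, low.size);
    return 32;
}
```

A consumer that only understands a single <BASE SIZE> pair would read the first 16 bytes and see just the high region.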
|
|
Starting with gcc 11.3, the C compiler will generate PLT-relative function
calls even if they are local and do not require it. Later on during linking,
the linker will replace all PLT-relative calls to local functions with
PC-relative ones. Unfortunately, the purgatory code of kexec/kdump is
not being linked as a regular executable or shared library would have been,
and therefore, all PLT-relative addresses remain in the generated purgatory
object code unresolved. This in turn lets kexec-tools fail with
"Unknown rela relocation: 0x14 0x73c0901c" for such relocation types.
Furthermore, the clang C compiler has always behaved like described above
and this commit should fix the purgatory code built with the latter.
Because the purgatory code is no regular executable or shared library,
contains only calls to local functions and has no PLT, all R_390_PLT32DBL
relocation entries can be resolved just like a R_390_PC32DBL one.
* https://refspecs.linuxfoundation.org/ELF/zSeries/lzsabi0_zSeries/x1633.html#AEN1699
Relocation entries of purgatory code generated with gcc 11.3
------------------------------------------------------------
$ readelf -r purgatory/purgatory.o
Relocation section '.rela.text' at offset 0x6e8 contains 27 entries:
Offset Info Type Sym. Value Sym. Name + Addend
00000000000c 000300000013 R_390_PC32DBL 0000000000000000 .data + 2
00000000001a 001000000014 R_390_PLT32DBL 0000000000000000 sha256_starts + 2
000000000030 001100000014 R_390_PLT32DBL 0000000000000000 sha256_update + 2
000000000046 001200000014 R_390_PLT32DBL 0000000000000000 sha256_finish + 2
000000000050 000300000013 R_390_PC32DBL 0000000000000000 .data + 102
00000000005a 001300000014 R_390_PLT32DBL 0000000000000000 memcmp + 2
...
000000000118 001600000014 R_390_PLT32DBL 0000000000000000 setup_arch + 2
00000000011e 000300000013 R_390_PC32DBL 0000000000000000 .data + 2
00000000012c 000f00000014 R_390_PLT32DBL 0000000000000000 verify_sha256_digest + 2
000000000142 001700000014 R_390_PLT32DBL 0000000000000000 post_verification[...] + 2
Relocation entries of purgatory code generated with gcc 11.2
------------------------------------------------------------
$ readelf -r purgatory/purgatory.o
Relocation section '.rela.text' at offset 0x6e8 contains 27 entries:
Offset Info Type Sym. Value Sym. Name + Addend
00000000000e 000300000013 R_390_PC32DBL 0000000000000000 .data + 2
00000000001c 001000000013 R_390_PC32DBL 0000000000000000 sha256_starts + 2
000000000036 001100000013 R_390_PC32DBL 0000000000000000 sha256_update + 2
000000000048 001200000013 R_390_PC32DBL 0000000000000000 sha256_finish + 2
000000000052 000300000013 R_390_PC32DBL 0000000000000000 .data + 102
00000000005c 001300000013 R_390_PC32DBL 0000000000000000 memcmp + 2
...
00000000011a 001600000013 R_390_PC32DBL 0000000000000000 setup_arch + 2
000000000120 000300000013 R_390_PC32DBL 0000000000000000 .data + 122
000000000130 000f00000013 R_390_PC32DBL 0000000000000000 verify_sha256_digest + 2
000000000146 001700000013 R_390_PC32DBL 0000000000000000 post_verification[...] + 2
Corresponding s390 kernel discussion:
* https://lore.kernel.org/linux-s390/20211208105801.188140-1-egorenar@linux.ibm.com/T/#u
Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reported-by: Tao Liu <ltao@redhat.com>
Suggested-by: Philipp Rudo <prudo@redhat.com>
Reviewed-by: Philipp Rudo <prudo@redhat.com>
[hca@linux.ibm.com: changed commit message as requested by Philipp Rudo]
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
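Because both relocation types end up being resolved the same way, the computation can be sketched once. The s390 ELF ABI defines R_390_PC32DBL as (S + A - P) >> 1; with no PLT, the PLT entry address used by R_390_PLT32DBL is simply the symbol address S. The helper name below is hypothetical, and the sketch assumes the difference is even and fits in 32 bits after halving.

```c
#include <stdint.h>

/*
 * Hypothetical helper: S is the symbol value, A the addend and P the
 * address of the field being relocated. The result is the signed
 * halfword offset stored in the 32-bit relocation field.
 */
static int32_t s390_pc32dbl(uint64_t S, int64_t A, uint64_t P)
{
    int64_t delta = (int64_t)(S + (uint64_t)A - P);

    return (int32_t)(delta / 2);    /* offsets are counted in halfwords */
}
```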
|
|
kexec-tools commit 61b8c79b0fb7 ("arm64/crashdump-arm64: deduce the
paddr of _text") tries to deduce the paddr of _text, but it turns out to
work only partially.
That commit is based on "The Image must be placed text_offset bytes from
a 2MB aligned base address anywhere in usable system RAM and called
there" in linux/Documentation/arm64/booting.rst, plus the text_offset field
being zero.
But in practice, some boot loaders do not obey the convention, and
still boot the kernel successfully.
Revisiting kernel commit e2a073dde921 ("arm64: omit [_text, _stext) from
permanent kernel mapping"), the kernel code size changes from (unsigned
long)__init_begin - (unsigned long)_text to (unsigned long)__init_begin
- (unsigned long)_stext.
It should be a better factor for deciding which label starts the
"Kernel code" in /proc/iomem.
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Pass the following properties to the crash dump kernel, to provide a
modern DT interface between kexec and the crash dump kernel:
- linux,elfcorehdr: ELF core header segment, similar to the
"elfcorehdr=" kernel parameter.
- linux,usable-memory-range: Usable memory reserved for the crash dump
kernel.
This makes the memory reservation explicit, so Linux no longer needs
to mask the program counter, and rely on the "mem=" kernel parameter
to obtain the start and size of usable memory.
For backwards compatibility, the "elfcorehdr=" and "mem=" kernel
parameters are still appended to the kernel command line.
Loosely based on the ARM64 version by Akashi Takahiro.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
mem_lower and mem_upper are measured in kilobytes.
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
fclose() should be called before the function exits.
Signed-off-by: Kai Song <songkai01@inspur.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
When the function exits abnormally, ph should be freed.
Signed-off-by: Kai Song <songkai01@inspur.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
In get_kernel_page_offset(), the local variable kv is unused; remove it.
Signed-off-by: Kai Song <songkai01@inspur.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Signed-off-by: Zhaofeng Li <hello@zhaofeng.li>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
In some cases, add_buffer will actually try to allocate the buffer
at 0x0, which may not be acceptable by some kernels. Let's avoid
the first 0x500 bytes so we don't screw up the IVT and BDA.
Signed-off-by: Zhaofeng Li <hello@zhaofeng.li>
Signed-off-by: Simon Horman <horms@verge.net.au>
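The idea of keeping buffers out of the first 0x500 bytes can be sketched as a clamp on the lowest acceptable address. On x86 the real-mode IVT occupies 0x000-0x3ff and the BDA 0x400-0x4ff. The names below are hypothetical and do not reflect how add_buffer is actually called.

```c
#include <stdint.h>

#define LOW_RESERVED_END UINT64_C(0x500)    /* end of IVT + BDA */

/* Hypothetical helper: raise the minimum buffer address above the BDA. */
static uint64_t clamp_buffer_min(uint64_t requested_min)
{
    return requested_min < LOW_RESERVED_END ? LOW_RESERVED_END
                                            : requested_min;
}
```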
|
|
This would segfault if mhi.rel_tag didn't exist.
Signed-off-by: Zhaofeng Li <hello@zhaofeng.li>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
tag_load_base_addr is dependent on rel_tag, and tag_framebuffer was
not accounted for.
Signed-off-by: Zhaofeng Li <hello@zhaofeng.li>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Signed-off-by: Zhaofeng Li <hello@zhaofeng.li>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Since kernel commit e2a073dde921 ("arm64: omit [_text, _stext) from
permanent kernel mapping"), the physical address of 'Kernel code' in
/proc/iomem is mapped from _stext instead of _text.
For compatibility, it is better to deduce the paddr of
_text even though it is unavailable through /proc/iomem. This can be
achieved by utilizing the fact that _text is aligned on 2MB.
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Cc: Simon Horman <horms@verge.net.au>
To: kexec@lists.infradead.org
Signed-off-by: Simon Horman <horms@verge.net.au>
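One way to use the 2MB alignment is to round the start of "Kernel code" down to the previous 2MB boundary. The sketch below assumes that _stext lies less than 2MB above _text; deduce_text_paddr() is a hypothetical name, not the function from the commit.

```c
#include <stdint.h>

#define SZ_2M (UINT64_C(2) << 20)

/*
 * Hypothetical helper: given the physical address of _stext (the start
 * of "Kernel code" in /proc/iomem), round down to a 2MB boundary to
 * recover the physical address of _text.
 */
static uint64_t deduce_text_paddr(uint64_t stext_paddr)
{
    return stext_paddr & ~(SZ_2M - 1);
}
```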
|
|
The ramdisk variable is defined in kexec/arch/ppc/kexec-ppc.c. This
other definition is not needed and breaks build with -fno-common.
Signed-off-by: Petr Tesarik <ptesarik@suse.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
If the passed zImage happens to have a DTB appended, then the magic 4 bytes
of the DTB are copied together with the kernel image. This leads to
failed kexec boots because the decompressor finds the aforementioned
DTB magic and falsely tries to replace the DTB passed in the register r2
with the non-existent appended one.
Signed-off-by: Alexander Egorenkov <egorenar-dev@posteo.net>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
During kexec there are two kernel versions at play. The version of
the running kernel and the version of the kernel that will be booted.
On powerpc it appears people have been using the version of the
running kernel to attempt to detect properties of the kernel to be
booted which is just wrong. As the linux kernel version that is being
detected is a no longer supported kernel just remove that buggy and
confused code.
On x86_64 the kernel_version is used to compute the starting virtual
address of the running kernel so a proper core dump may be generated.
Using the kernel_version stopped working a while ago when the starting
virtual address became randomized.
The old code was kept for the case where the kernel was not built with
randomization support, but there is nothing in reading /proc/kcore
that won't work to detect the starting virtual address even there.
In fact /proc/kcore must have the starting virtual address or a
debugger can not make sense of the running kernel.
So just make computing the starting virtual address on x86_64
unconditional, with a hard coded fallback just in case something goes
wrong.
Doing something with kernel_version() has become important as recent
stable kernels have seen the minor version grow to > 255. Just removing
kernel_version() looks like the best option.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
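Why a version component above 255 is a problem can be shown with the familiar one-byte-per-component packing. This is an illustration only: pack_version() is a hypothetical helper that assumes the usual (major << 16) + (minor << 8) + patch scheme.

```c
/*
 * Hypothetical helper: pack a version into one integer, giving each of
 * the last two components 8 bits. A component of 256 or more spills
 * into the next field.
 */
static long pack_version(long major, long minor, long patch)
{
    return (major << 16) + (minor << 8) + patch;
}
```

Under this packing, 4.9.256 and 4.10.0 produce the same number, so comparisons based on the packed value stop being reliable.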
|
|
We risk throwing an entire large chunk away if it is just slightly
unaligned which then causes the crash kernel to run out of RAM. Keep
them and shrink them to alignment.
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
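Shrinking a chunk to alignment instead of discarding it can be sketched like this. The helper is hypothetical; it assumes a half-open range [start, end) and a power-of-two alignment.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: round start up and end down to 'align' (which
 * must be a power of two). Returns false when nothing usable remains,
 * in which case the range is left unchanged.
 */
static bool shrink_to_alignment(uint64_t *start, uint64_t *end,
                                uint64_t align)
{
    uint64_t s = (*start + align - 1) & ~(align - 1);   /* round up */
    uint64_t e = *end & ~(align - 1);                   /* round down */

    if (s >= e)
        return false;
    *start = s;
    *end = e;
    return true;
}
```

A large chunk that starts at 0x100 with 1K alignment loses only its first 0x300 bytes rather than being thrown away entirely.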
|
|
The real mode ends at 0x400, not 0x100. The code intentionally excludes
the IVT as RAM, so use the correct address.
Also, 0x100 is not 1K aligned and will be rejected by add_memmap(). We
have observed problems that after a multiboot2 kexec, the next kexec
will throw away such unaligned chunks, losing memory for the next next
kernel. In some corner cases, such loss of memory can actually cause OOM
during boot.
Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
virtio-mem in Linux adds/removes individual memory blocks (e.g., 128 MB
each). Linux merges adjacent memory blocks added by virtio-mem devices, but
we can still end up with a very sparse memory layout when unplugging
memory in corner cases.
Let's increase the maximum number of crash memory ranges from ~2k to 32k.
32k should be sufficient for a very long time.
e_phnum field in the header is 16 bits wide, so we can fit a maximum of
~64k entries in there, shared with other entries (i.e., CPU). Therefore,
using up to 32k memory ranges is fine. (if we ever need more than ~64k,
we can switch to the sh_info field)
Move the temporary xen ranges off the stack, dynamically allocating
memory for them.
Note: We don't have to increase MAX_MEMORY_RANGES, because virtio-mem
added memory is driver managed and always detected and added by a
driver in the kexec'ed kernel; for ordinary kexec, we must not expose
these ranges in the firmware-provided memmap.
Cc: Simon Horman <horms@verge.net.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
No need to iterate over empty entries.
Cc: Simon Horman <horms@verge.net.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Traditionally, we had "System RAM" only on the top level of the
kernel resource tree (-> /proc/iomem). Nowadays, we can also have
"System RAM" on lower levels of the tree -- driver-managed device memory
that is always detected and added via drivers. Current examples are
memory added via dax/kmem -- ("System RAM (kmem)") and virtio-mem ("System
RAM (virtio_mem)"). Note that in some kernel versions "System RAM
(kmem)" was exposed as "System RAM", but similarly, on lower levels of
the resource tree.
Let's add anything that contains "System RAM" to the elf core header, so
it will be dumped for kexec_load(). Handling kexec_file_load() in the
kernel is similarly getting fixed [1].
Loading a kdump kernel via "kexec -p -c" ... will now result in the kdump
kernel also dumping dax/kmem and virtio-mem added System RAM.
Note: We only want to dump this memory, we don't want to add this memory to
the memmap of an ordinary kexec'ed kernel ("fast system reboot").
[1] https://lkml.kernel.org/r/20210322160200.19633-1-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
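Matching "anything that contains System RAM" can be sketched with a substring test on the resource name from /proc/iomem. is_system_ram() is a hypothetical helper and not necessarily how kexec-tools performs the check.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: accept plain "System RAM" as well as variants
 * such as "System RAM (kmem)" or "System RAM (virtio_mem)".
 */
static bool is_system_ram(const char *resource_name)
{
    return strstr(resource_name, "System RAM") != NULL;
}
```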
|