Age | Commit message (Collapse) | Author |
|
Currently the kexec_file_load() support for arm64 doesn't allow
handling zlib compressed (i.e. Image.gz) image.
Since most distributions use 'make zinstall' rule inside
'arch/arm64/boot/Makefile' to install the arm64
Image.gz compressed file inside the boot destination directory (for e.g.
/boot), currently we cannot use kexec_file_load() to load vmlinuz (or
Image.gz):
# file /boot/vmlinuz
/boot/vmlinuz: gzip compressed data, was "Image", <..snip..>, max
compression, from Unix, original size 21945120
Now, since via kexec_file_load() we pass the 'fd' of Image.gz
(compressed file) via the following command line ...
# kexec -s -l /boot/vmlinuz-`uname -r` --initrd=/boot/initramfs-`uname
-r`.img --reuse-cmdline
... kernel returns -EINVAL error value, as it is not able to locate
the magic number =0x644d5241, which is expected in the 64-byte header
of the decompressed kernel image.
We can fix this in user-space kexec-tools, which handles an
'Image.gz' being passed via kexec_file_load(), using an approach
as follows:
a). Copy the contents of Image.gz to a temporary file.
b). Decompress (gunzip-decompress) the contents inside the
temporary file.
c). Pass the 'fd' of the temporary file to the kernel space. So
basically the kernel space still gets a decompressed kernel
image to load via kexec-tools
I tested this patch for the following three use-cases:
1. Uncompressed Image file:
#kexec -s -l Image --initrd=/boot/initramfs-`uname -r`.img --reuse-cmdline
2. Signed Image file:
#kexec -s -l Image.signed --initrd=/boot/initramfs-`uname -r`.img --reuse-cmdline
3. zlib compressed Image.gz file:
#kexec -s -l /boot/vmlinuz-`uname -r` --initrd=/boot/initramfs-`uname -r`.img --reuse-cmdline
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Commit bf06cf2095e1 ("kexec/uImage: probe to identify a corrupted image"),
defined the 'uImage_probe_kernel()' function return values and
correspondingly ;uImage_arm64_probe()' returns the same (0 -> If the
image is valid 'type' image, -1 -> If the image is corrupted and
1 -> If the image is not a uImage).
This causes issues because, in later patches we introduce zImage
support for arm64, and since it is probed after uImage, the return
values from 'uImage_arm64_probe()' needs to be fixed to make sure
that kexec will not return with an invalid error code.
Now, 'uImage_arm64_probe()' returns the following values instead:
0 - valid uImage.
-1 - uImage is corrupted.
1 - image is not a uImage.
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Fixes: 22a2ed55132e ("x86: Support multiboot2 images")
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
xenctrl.h defines struct e820entry as:
if defined(__i386__) || defined(__x86_64__)
...
#define E820_RAM 1
...
struct e820entry {
uint64_t addr;
uint64_t size;
uint32_t type;
} __attribute__((packed));
...
#endif
$ dpkg-query -S /usr/include/xenctrl.h
libxen-dev:amd64: /usr/include/xenctrl.h
$ dpkg-query -W libxen-dev:amd64
libxen-dev:amd64 4.8.5+shim4.10.2+xsa282-1+deb9u11
./include/x86/x86-linux.h defines struct e820entry as:
#ifndef E820_RAM
struct e820entry {
uint64_t addr; /* start of memory segment */
uint64_t size; /* size of memory segment */
uint32_t type; /* type of memory segment */
#define E820_RAM 1
...
} __attribute__((packed));
#endif
Since cedeee0a3007 ("x86: Introduce helpers for getting RSDP address")
./kexec/arch/i386/kexec-x86-common.c includes
+#include "x86-linux-setup.h"
#include "../../kexec-xen.h"
When xenctrl.h is present the above results in:
$ gcc
...
In file included from kexec/arch/i386/../../kexec-xen.h:5:0,
from kexec/arch/i386/kexec-x86-common.c:43:
/usr/include/xenctrl.h:1271:8: error: redefinition of 'struct e820entry'
struct e820entry {
^~~~~~~~~
In file included from kexec/arch/i386/x86-linux-setup.h:3:0,
from kexec/arch/i386/kexec-x86-common.c:42:
./include/x86/x86-linux.h:16:8: note: originally defined here
struct e820entry {
^~~~~~~~~
...
$ gcc --version | head -1
gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
To militate this this problem re-order the includes so that
x86-linux.h is included after xenctrl.h and thus
struct e820entry will only be defined once due to it
being devined conditionally in x86-linux.h.
In practice the definitions are the same so it should
not matter which is chosen.
It also seems rather unpleasent to me to need to play
with include ordering. Perhaps a better solution in the longer
term would be to rename the local definition of struct e820entry.
Fixes: cedeee0a3007 ("x86: Introduce helpers for getting RSDP address")
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Add a new type `multiboot2-x86` that allows loading multiboot2 [1] images
within the relocation range specified in the image header. The image is
always placed at the lowest available address, regardless of the
preference information.
[1] https://www.gnu.org/software/grub/manual/multiboot2/multiboot.html
Signed-off-by: Varad Gautam <vrd@amazon.de>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Use the new introduce helper for getting RSDP, this ensures RSDP is
always accessible and avoid code duplication.
Signed-off-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Since kernel commit e6e094e053af75 ("x86/acpi, x86/boot: Take RSDP address
from boot params if available"), kernel accept an acpi_rsdp_addr param in
boot_params. So fill in this parameter unconditionally, ensure second
kernel always get the right RSDP address consistently, and boot well on
EFI system even with EFI service disabled. User no longer need to change
the kernel cmdline to workaround the missing RSDP issue.
For older version of kernels (Before 5.0), there won't be any change of
behavior.
Signed-off-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
On x86 RSDP is fundamental for booting the machine. When second kernel
is incapable of parsing the RSDP address (eg. kexec next kernel on an EFI
system with EFI service disabled), kexec should prepare the RSDP address
for second kernel.
Introduce helpers for getting RSDP from multiple sources, including boot
params and EFI firmware.
For legacy BIOS interface, there is no better way to find the RSDP address
rather than scanning the memory region and search for it, and this will
always be done by the kernel as a fallback, so this is no need to try to
get the RSDP address for that case.
Signed-off-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
The name in mount invocations like
mount -t debugfs debugfs /sys/kernel/debug
is nothing but convention and cannot be relied upon.
For example, https://www.kernel.org/doc/Documentation/filesystems/debugfs.txt
recommends making the name "none" instead:
mount -t debugfs none /sys/kernel/debug
and many existing systems use mounts named "none" or otherwise.
Using `mnt_type` instead of `mnt_fsname` allows kexec to work
on such systems.
This fixes another instance of `poweroff` not working on kexec'ed
kernels because the lack of correctly matched mount results in EFI
variables not being read and propagated.
Signed-off-by: Niklas Hambüchen <mail@nh2.me>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
In many situations, especially on read-only file systems
and initial ramdisks (intramfs/initrd), /etc/mtab does not exist.
Before this commit, kexec would fail to read mounts on such systems
in `find_mnt_by_fsname()`, such that `get_bootparam()` would not
`boot_params/data`, which would then lead to e.g. `setup_efi_data()`
not being called in `setup_efi_info()`.
As a result, kexec'ed kernels would not obtain EFI data,
subsequentially lack an `ACPI RSDP` entry, emitting:
ACPI BIOS Error (bug): A valid RSDP was not found (20180810/tbxfroot-210)
and thus fail to turn off the machine on poweroff, instead printing only:
reboot: System halted
This problem had to be worked around by passing `acpi_rsdp=` manually
before. This commit obviates this workaround.
See also:
* https://github.com/coreos/bugs/issues/167#issuecomment-487320879
* http://lists.infradead.org/pipermail/kexec/2012-October/006924.html
Signed-off-by: Niklas Hambüchen <mail@nh2.me>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
After commit 060eee58 "x86: use old screen_info if needed", kexec-tools
will force use old screen_info and vga type if failed to determine
current vga type. But it is not always a good idea.
Currently kernel hanging is inspected on some hyper-v VMs after this
commit, because hyperv_fb will mimic EFI (or VESA) VGA on first boot
up, but after the real driver is loaded, it will switch to new mode
and no longer compatible with EFI/VESA VGA. Keep setting
orig_video_isVGA to EFI/VESA VGA flag will get wrong driver loaded and
try to manipulate the framebuffer in a wrong way.
We can't ensure this won't happen on other framebuffer drivers, But
it's a helpful feature if the framebuffer drivers just work. So this
patch introduce a --reuse-video-type options to let user decide if the
old screen_info hould be used unconditional or not.
Signed-off-by: Kairui Song <kasong@redhat.com>
Reviewed-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
When copying the DTB from the current kernel, if the user didn't pass an
initrd on the command-line, make sure that the new DTB doesn't contain
initrd properties with stale addresses. Otherwise the next kernel will
try to unpack the initramfs from a location that contains junk, since
the initial initrd is long gone:
[ 49.370026] Initramfs unpacking failed: junk in compressed archive
This issue used to be hidden by a successful recovery, but since commit
ff1522bb7d98 ("initramfs: cleanup incomplete rootfs") in Linux, the
kernel removes the default /root mountpoint after failing to load an
initramfs, and cannot mount the rootfs passed on the command-line
anymore.
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
In a EFI system, the frame buffer address is 64bit, so currently
if the address is beyound 4G, kexec will set wrong address due to
truncate.
Linux kernel commit ae2ee627dc87 ('efifb: Add support for 64-bit
frame buffer addresses') added support for 64bit frame buffer
address, an 'ext_lfb_base' field is added as the upper 32-bits of
the frame buffer, and introduced a new capability flag
'VIDEO_TYPE_CAPABILITY_64BIT_BASE' to indicate if the extend field is
used.
This patch adopts this change, set proper extent address and capability
flag when the address is beyound 4G.
Signed-off-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
When the kernel requests video information, pass it the
framebuffer information in the multiboot header from
the linux framebuffer ioctl's.
With the arch specific --reset-vga or --consolve-vga
options, purgatory will reset the framebuffer so pass
information for standard ega text mode.
Signed-off-by: Friedemann Gerold <cinap_lenrek@felloff.net>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Use the appropriate types for ACPI reclaim and ACPI NVS
ranges in the multiboot memory map.
This allows the kernel to locate ACPI tables on UEFI
systems without having a explicit pointer to the RSD.
Signed-off-by: Friedemann Gerold <cinap_lenrek@felloff.net>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Add support for non-elf multiboot kernels (such as Plan 9)
by handling the MULTIBOOT_AOUT_KLUDGE bit.
When the bit is clear then we are dealing with an ELF file
and probe for ELF as before with elf_x86_probe().
When the bit is set then load_addr, load_end_addr, header_addr
and entry_addr from the multiboot header are used load the
memory image.
Signed-off-by: Friedemann Gerold <cinap_lenrek@felloff.net>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
With this patch, kexec_file_load() system call is supported.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Reviewed-by: Bhupesh Sharma <bhsharma@redhat.com>
Tested-by: Bhupesh Sharma <bhsharma@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
available)
On certain arm64 platforms, it has been noticed that due
to a hole at the start of physical ram exposed to kernel
(i.e. it doesn't start from address 0), the kernel still
calculates the 'memstart_addr' kernel variable as 0.
Whereas the SYSTEM_RAM or IOMEM_RESERVED range in '/proc/iomem'
would carry a first entry whose start address is non-zero
(as the physical ram exposed to the kernel starts from a
non-zero address).
In such cases, if we rely on '/proc/iomem' entries to
calculate the phys_offset, then we will have mismatch
between the user-space and kernel space 'PHYS_OFFSET'
value. The present 'kexec-tools' code does the same
in 'get_memory_ranges_iomem_cb()' function when it makes
a call to 'set_phys_offset()'. This can cause the vmcore
generated via 'kexec-tools' to miss the last few bytes as
the first '/proc/iomem' starts from a non-zero address.
Please see [0] for the original bug-report from Yanjiang Jin.
The same can be fixed in the following manner:
1. For newer kernel (>= 4.19, with commit 23c85094fe1895caefdd
["proc/kcore: add vmcoreinfo note to /proc/kcore"] available),
'kcore' contains a new PT_NOTE which carries the VMCOREINFO
information.
If the same is available, one should prefer the same to
retrieve 'PHYS_OFFSET' value exported by the kernel as this
is now the standard interface exposed by kernel for sharing
machine specific details with the user-land as per
the arm64 kernel maintainers (see [1]) .
2. For older kernels, we can try and determine the PHYS_OFFSET
value from PT_LOAD segments inside 'kcore' via some jugglery
of the correct virtual and physical address combinations.
As a fallback, we still support getting the PHYS_OFFSET values
from '/proc/iomem', to maintain backward compatibility.
Testing:
-------
- Tested on my apm-mustang and qualcomm amberwing board with upstream
kernel (4.20.0-rc7) for both KASLR and non-KASLR boot cases.
References:
-----------
[0] https://www.spinics.net/lists/kexec/msg20618.html
[1] https://www.mail-archive.com/kexec@lists.infradead.org/msg20300.html
Reported-by: Yanjiang Jin <yanjiang.jin@hxt-semitech.com>
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
'set_bootargs()'
This patch adds missing error handling check against the return value of
'set_bootargs()' in 'kexec-arm64.c'
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
For calculating the random 'kaslr-seed' value to be passed to the
secondary kernel (kexec or kdump), we invoke the 'getrandom' syscall
inside 'setup_2nd_dtb()' function.
Normally on most arm64 systems this syscall doesn't fail when the
initrd scriptware (which arms kdump service) invokes the same.
However, recently I noticed that on the 'hp-moonshot' arm64 boards,
we have an issue with the newer kernels which causes the same
to fail. As a result, the kdump service fails and we are not able
to use the kdump infrastructure just after boot. As expected, once the
random pool is sufficiently populated and we launch the kdump service
arming scripts again (manually), then the kdump service is properly
enabled.
Lets handle the same, by not error'ing out if 'getrandom' syscall fails.
Rather lets warn the user and proceed further by setting the
'kaslr-seed' value as 0 for the secondary kernel - which implies that it
boots in a 'nokaslr' mode.
Tested on my 'hp-moonshot' and 'qualcomm-amberwing' arm64 boards.
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
If the err_out label is reached, address of a stack variable is passed to
free(). Fix it.
Signed-off-by: Pingfan Liu <piliu@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
When kexec-tools load the kernel and initramfs for kdump, kexec-tools will
read /proc/iomem and recreate the e820 ranges for kdump kernel. But it fails
to parse the e820 reserved region, because the memcmp() is case sensitive
when comparing the string. In fact, it may be "Reserved" or "reserved" in
the /proc/iomem, so we have to fix these cases.
Signed-off-by: Lianbo Jiang <lijiang@redhat.com>
Reviewed-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
In response to a change in binutils, commit b21ebf2fb4c
(x86: Treat R_X86_64_PLT32 as R_X86_64_PC32) was applied to
the linux kernel during the 4.16 development cycle and has
since been backported to earlier stable kernel series. The
change results in the failure message in $SUBJECT when
rebooting via kexec.
Fix this by replicating the change in kexec.
Signed-off-by: Chris Clayton <chris2553@googlemail.com>
Acked-by: Baoquan He <bhe@redhat.com>
Tested-by: Bhupesh Sharma <bhsharma@redhat.com>
Acked-by: Bhupesh Sharma <bhsharma@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Currently, in arm64, kexec silently truncates kernel command line longer
than COMMAND_LINE_SIZE - 1. Error out in that case as some other
architectures already do that. The error message is copied from x86_64.
Suggested-by: Tom Kirchner <tjk@amazon.com>
Signed-off-by: Munehisa Kamata <kamatam@amazon.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Otherwise, we can hit the current 512 chars limit before hitting the
Linux kernel's one, where allows 2048 chars in arm64.
Signed-off-by: Munehisa Kamata <kamatam@amazon.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
This patch adds the support to supply 'kaslr-seed' to secondary kernel,
when we do a 'kexec warm reboot to another kernel' (although the
behaviour remains the same for the 'kdump' case as well) on arm64
platforms using the 'kexec_load' invocation method.
Lets consider the case where the primary kernel working on the arm64
platform supports kaslr (i.e 'CONFIG_RANDOMIZE_BASE' was set to y and
we have a compliant EFI firmware which supports EFI_RNG_PROTOCOL and
hence can pass a non-zero (valid) seed to the primary kernel).
Now the primary kernel reads the 'kaslr-seed' and wipes it to 0 and
uses the seed value to randomize for e.g. the module base address
offset.
In the case of 'kexec_load' (or even kdump for brevity),
we rely on the user-space kexec-tools to pass an appropriate dtb to the
secondary kernel and since 'kaslr-seed' is wiped to 0 by the primary
kernel, the secondary will essentially work with *nokaslr* as
'kaslr-seed' is set to 0 when it is passed to the secondary kernel.
This can be true even in case the secondary kernel had
'CONFIG_RANDOMIZE_BASE' and 'CONFIG_RANDOMIZE_MODULE_REGION_FULL' set to
y.
This patch addresses this issue by first checking if the device tree
provided by the firmware to the kernel supports the 'kaslr-seed'
property and verifies that it is really wiped to 0. If this condition is
met, it fixes up the 'kaslr-seed' property by using the getrandom()
syscall to get a suitable random number.
I verified this patch on my Qualcomm arm64 board and here are some test
results:
1. Ensure that the primary kernel is boot'ed with 'kaslr-seed'
dts property and it is really wiped to 0:
[root@qualcomm-amberwing]# dtc -I dtb -O dts /sys/firmware/fdt | grep -A 10 -i chosen
chosen {
kaslr-seed = <0x0 0x0>;
...
}
2. Now issue 'kexec_load' to load the secondary kernel (let's assume
that we are using the same kernel as the secondary kernel):
# kexec -l /boot/vmlinuz-`uname -r` --initrd=/boot/initramfs-`uname
-r`.img --reuse-cmdline -d
3. Issue 'kexec -e' to warm boot to the secondary:
# kexec -e
4. Now after the secondary boots, confirm that the load address of the
modules is randomized in every successive boot:
[root@qualcomm-amberwing]# cat /proc/modules
sunrpc 524288 1 - Live 0xffff0307db190000
vfat 262144 1 - Live 0xffff0307db110000
fat 262144 1 vfat, Live 0xffff0307db090000
crc32_ce 262144 0 - Live 0xffff0307d8c70000
...
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Since kernel 4.17-rc2 s390 supports the kexec_file_load system call. Add
the new system call to kexec-tools and provide the -s (--kexec-file-syscall)
option for s390 to support this new feature.
Signed-off-by: Philipp Rudo <prudo@linux.ibm.com>
Acked-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Fixes warnings like these when building kexec for powerpc (32 bit):
kexec-elf-rel-ppc64.c: warning: cast from pointer to integer of different size
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Fixes warnings like these when building kexec for powerpc (32 bit):
crashdump-ppc64.h: warning: large integer implicitly truncated to unsigned type
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
|
|
PPC64 kernel now supports kexec_file_load system call. Leverage it by
enabling that support here. Note that loading crash kernel with this
system call is not yet supported in the kernel and trying to load one
will fail with '-ENOTSUPP' error.
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Include the stack and malloc space in our calculation of the zImage
size, both of which must be avoided when locating the dtb.
Signed-off-by: Russell King <rmk@armlinux.org.uk>
|
|
Add further debugging of the kernel size
Signed-off-by: Russell King <rmk@armlinux.org.uk>
|
|
Signed-off-by: Russell King <rmk@armlinux.org.uk>
|
|
Add support to parse the new 'ibm,dynamic-memory-v2' property in the
'ibm,dynamic-reconfiguration-memory' node. This replaces the old
'ibm,dynamic-memory' property and is enabled in the kernel with a
patch series that starts with commit 0c38ed6f6f0b ("powerpc/pseries:
Enable support of ibm,dynamic-memory-v2"). All LMBs that share the same
flags and are adjacent are grouped together in the newer version of the
property making it compact to represent larger memory configurations.
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Mahesh Jagannath Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
With modern drm/kms graphic driver kexec-tools does not setup screen_info
correctly so one will only see screen output after those drm drivers
reinitializing after rebooting. Copying the old screen info from original
boot_params will help during my test, although it could not work for some
potential cases, but it is not worse than before. This has been used in
the kernel kexec_file_load.
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
When kexec is utilized in a Xen environment, it has an explicit
run-time dependency on libxenctrl.so. This dependency occurs
during the configure stage and when building kexec-tools.
When kexec is utilized in a non-Xen environment (either bare
metal or KVM), the configure and build of kexec-tools omits
any reference to libxenctrl.so.
Thus today it is not currently possible to configure and build
a *single* kexec that will work in *both* Xen and non-Xen
environments, unless the libxenctrl.so is *always* present.
For example, a kexec configured for Xen in a Xen environment:
# ldd build/sbin/kexec
linux-vdso.so.1 => (0x00007ffdeba5c000)
libxenctrl.so.4.4 => /usr/lib64/libxenctrl.so.4.4 (0x00000038d8000000)
libz.so.1 => /lib64/libz.so.1 (0x00000038d6c00000)
libc.so.6 => /lib64/libc.so.6 (0x00000038d6000000)
libdl.so.2 => /lib64/libdl.so.2 (0x00000038d6400000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00000038d6800000)
/lib64/ld-linux-x86-64.so.2 (0x000055e9f8c6c000)
# build/sbin/kexec -v
kexec-tools 2.0.16
However, the *same* kexec executable fails in a non-Xen environment:
# copy xen kexec to .
# ldd ./kexec
linux-vdso.so.1 => (0x00007fffa9da7000)
libxenctrl.so.4.4 => not found
liblzma.so.0 => /usr/lib64/liblzma.so.0 (0x0000003014e00000)
libz.so.1 => /lib64/libz.so.1 (0x000000300ea00000)
libc.so.6 => /lib64/libc.so.6 (0x000000300de00000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x000000300e200000)
/lib64/ld-linux-x86-64.so.2 (0x0000558cc786c000)
# ./kexec -v
./kexec: error while loading shared libraries:
libxenctrl.so.4.4: cannot open shared object file: No such file or directory
At Oracle we "workaround" this by having two kexec-tools packages,
one for Xen and another for non-Xen environments. At Oracle, the
desire is to offer a single kexec-tools package that works in either
environment. To achieve this, kexec-tools would either have to ship
with libxenctrl.so (which we have deemed as unacceptable), or we can
make kexec perform run-time linking against libxenctrl.so.
This patch is one possible way to alleviate the explicit run-time
dependency on libxenctrl.so. This implementation utilizes a set of
macros to wrap calls into libxenctrl.so so that the library can
instead be dlopen() and obtain the function via dlsym() and then
make the call. The advantage of this implementation is that it
requires few changes to the existing kexec-tools code. The dis-
advantage is that it uses macros to remap libxenctrl functions
and do work under the hood.
Another possible implementation worth considering is the approach
taken by libvmi. Reference the following file:
https://github.com/libvmi/libvmi/blob/master/libvmi/driver/xen/libxc_wrapper.h
The libxc_wrapper_t structure definition that starts at line ~33
has members that are function pointers into libxenctrl.so. This
structure is populated once and then later referenced/dereferenced
by the callers of libxenctrl.so members. The advantage of this
implementation is it is more explicit in managing the use of
libxenctrl.so and its versions, but the disadvantage is it would
require touching more of the kexec-tools code.
The following is a list libxenctrl members utilized by kexec:
Functions:
xc_interface_open
xc_kexec_get_range
xc_interface_close
xc_kexec_get_range
xc_interface_open
xc_get_max_cpus
xc_kexec_get_range
xc_version
xc_kexec_exec
xc_kexec_status
xc_kexec_unload
xc_hypercall_buffer_array_create
xc__hypercall_buffer_array_alloc
xc_hypercall_buffer_array_destroy
xc_kexec_load
xc_get_machine_memory_map
Data:
xc__hypercall_buffer_HYPERCALL_BUFFER_NULL
These were identified by configuring and building kexec-tools
with Xen support, but omitting the -lxenctrl from the LDFLAGS
in the Makefile for an x86_64 build.
The above libxenctrl members were referenced via these source
files.
kexec/crashdump-xen.c
kexec/kexec-xen.c
kexec/arch/i386/kexec-x86-common.c
kexec/arch/i386/crashdump-x86.c
This patch provides a wrapper around the calls to the above
functions in libxenctrl.so. Every libxenctrl call must pass a
xc_interface which it obtains from xc_interface_open().
So the existing code is already structured in a manner that
facilitates graceful dlopen()'ing of the libxenctrl.so and
the subsequent dlsym() of the required member.
The patch creates a wrapper function around xc_interface_open()
and xc_interface_close() to perform the dlopen() and dlclose().
For the remaining xc_ functions, this patch defines a macro
of the same name which performs the dlsym() and then invokes
the function. See the __xc_call() macro for details.
There was one data item in libxenctrl.so that presented a
unique problem, HYPERCALL_BUFFER_NULL. It was only utilized
once, as
set_xen_guest_handle(xen_segs[s].buf.h, HYPERCALL_BUFFER_NULL);
I tried a variety of techniques but could not find a general
macro-type solution without modifying xenctrl.h. So the
solution was to declare a local HYPERCALL_BUFFER_NULL, and
this appears to work. I admit I am not familiar with libxenctrl
to state if this is a satisfactory workaround, so feedback
here welcome. I can state that this allows kexec to load/unload/kexec
on Xen and non-Xen environments that I've tested without issue.
With this patch applied, kexec-tools can be built with Xen
support and yet there is no explicit run-time dependency on
libxenctrl.so. Thus it can also be deployed in non-Xen
environments where libxenctrl.so is not installed.
# ldd build/sbin/kexec
linux-vdso.so.1 => (0x00007fff7dbcd000)
liblzma.so.0 => /usr/lib64/liblzma.so.0 (0x00000038d9000000)
libz.so.1 => /lib64/libz.so.1 (0x00000038d6c00000)
libdl.so.2 => /lib64/libdl.so.2 (0x00000038d6400000)
libc.so.6 => /lib64/libc.so.6 (0x00000038d6000000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00000038d6800000)
/lib64/ld-linux-x86-64.so.2 (0x0000562dc0c14000)
# build/sbin/kexec -v
kexec-tools 2.0.16
This feature/ability is enabled with the following:
./configure --with-xen=dl
The previous --with-xen=no and --with-xen=yes still work as before.
Not specifying a --with-xen still defaults to --with-xen=yes.
As I've introduced a new build and run-time mode, I've done an
extensive matrix of both build-time and run-time checks of kexec
with this patch applied. The set of build-time scenarios are:
1: configure --with-xen=no and Xen support NOT present
2: configure --with-xen=no and Xen support IS present
3: configure --with-xen=yes and Xen support NOT present
4: configure --with-xen=yes and Xen support IS present
5: configure --with-xen=dl and Xen support NOT present
6: configure --with-xen=dl and Xen support IS present
Xen support present requires that configure can find both
xenctrl.h and libxenctrl.so.
Then for each of the six scenarios above, the corresponding kexec
binary was tested on a Xen system (Oracle's OVS dom0) and a
non-Xen system (Oracle Linux).
There are two build-time checks: did kexec build, and did
it contain libxenctrl.so? The presence of libxenctrl.so
in kexec was checked via ldd. The results were:
Scenario | Build | libxenctrl.so | Result
1 | pass | no | pass - see Note 1
2 | pass | no | pass - see Note 1
3 | pass | no | pass - see Note 2
4 | pass | yes | pass - see Note 3
5 | pass | no | pass - see Note 2
6 | pass | no | pass - see Note 4
Note 1: This passes since due to --with-xen=no, there will
be no Xen support in kexec and therefore no libxenctrl.so a
in the kexec.
Note 2: This passes since while --with-xen=yes, the configure
displays a message indicating that Xen support is disabled,
and allows kexec to build (this is the same behavior as prior
to this patch). And since Xen support is disabled, there is
no libxenctrl.so in the kexec.
Note 3: This passes since with --with-xen=yes and configure
locating the xenctrl.h and libxenctrl.so, support for Xen was
built into kexec. Ldd shows an explicit dependency on the library.
Note 4: This passes since with --with-xen=dl and configure
locating the xenctrl.h and libxencrl.so, support for Xen
was built into kexec. However, this uses the new technique
introduced by this patch and, as a result, ldd shows that the
libxenctrl.so is not a explicit run-time dependency for kexec
(rather libdl.so is now an explicit dependency). This is
precisely the goal of this patch!
The net effect is that there are now three "flavors" of a kexec
binary (prior to this patch there were two): a) kexec with no
support for Xen [scenarios 1, 2, 3, 5], b) kexec with support
for Xen and libxenctrl.so as an explicit dependency [scenario 4],
and c) kexec with support for Xen and libxenctrl.so is NOT an
explicit dependency [scenario 6].
The run-time checks are to take each of the six scenarios above
and run the corresponding kexec binary on both a Xen system and
a non-Xen system. The test for each kexec scenario was:
% service kdump stop
% vi /etc/init.d/kdump
change KEXEC= to /sbin/kexec-[123456]
% service kdump start
# If not FAILED, then below
% service kdump status
Kdump is operational
% rm -fr /var/crash/*
% echo c > /proc/sysrq-trigger
# after reboot verify vmcore generated
% ls -al /var/crash/<tab>
The results were:
Scenario | Xen environment | non-Xen environment
1 | fail - see Note 5 | pass
2 | fail - see Note 5 | pass
3 | fail - see Note 6 | pass
4 | pass | fail - see Note 7
5 | fail - see Note 6 | pass
6 | pass | pass
Note 5: Due to --with-xen=no, kexec lacks support for Xen and will
fail in the Xen environment. This behavior is the same as prior
to this patch.
Note 6: Due to the missing xenctrl.h and libxenctrl.so, kexec was
built without support for Xen, and thus will fail in the Xen
environment. This behavior is the same as prior to this patch.
Note 7: This kexec has the explicit dependency on libxenctrl.so
which prevents it from running in a non-Xen environment. This is
expected as this is the original issue for which this patch is
intended to address.
Note that for scenarios 1, 2, 3 and 5 kexec lacks support for Xen,
thus these versions are expected to "fail" in a Xen environment.
On the flip side, since a non-Xen environment does not need
libxenctrl.so, all but scenario 4 are expected to "pass" in a
non-Xen environment. The results match these expectations!
And, of course, importantly with this patch applied, it did not
have an adverse impact on kexec build or run-time.
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Read the new extension data which tells the boot agent about the
requirements for booting the kernel image, such as how much RAM
will be consumed by the kernel through decompression and booting.
This is necessary to control the placement of the DTB and
compressed RAM disk to avoid these objects being corrupted.
Tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Russell King <rmk@armlinux.org.uk>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
There is no difference in the way the initrd is handled between an
ATAG-based kernel and a DTB-based kernel. Therefore, this should be
handled identically in both cases.
Rearrange the code to achieve this.
Signed-off-by: Russell King <rmk@armlinux.org.uk>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
The OCTEON family of MIPS64 CPUs uses a PAGE_OFFSET of
0x8000000000000000ULL, which is differs from other CPUs.
Scan /proc/cpuinfo to see if the current system is "Octeon", if so,
patch the page_offset so that usable kdump core files are produced.
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Some kernel versions running on MIPS split the System RAM memory
regions reported in /proc/iomem. This may cause loading of the kexec
kernel to fail if it crosses one of the splits.
Fix by merging adjacent memory ranges that have the same type.
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
The kernel message buffers, as well as a lot of other useful data
reside in the bss section. Without this vmcore-dmesg cannot work, and
debugging with a core dump is much more difficult.
Try to add the /proc/iomem "Kernel bss" section to vmcore. If it is
not found, just do what we used to do and use "Kernel data" instead.
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
The 64-bit MIPS architecture doesn't have the same 2G limit the 32-bit
version has. Set MAXMEM and lowmem_limit to 0 for 64-bit MIPS so that
memory above 2G is usable in the kdump core files.
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Accelerator devices like GPU and FPGA cards contain onboard memory. This
onboard memory is represented as a memory only NUMA node, integrating it
with core memory subsystem. Since, the link through which these devices
are integrated to core memory goes down after a system crash and they are
meant for user workloads, avoid adding coherent device memory regions to
crash memory ranges. Without this change, makedumpfile tool tries to save
unaccessible coherent device memory regions, crashing the system.
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Tested-by: Pingfan Liu <piliu@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Since kernel commit a5980d064fe2 ("powerpc: Bump COMMAND_LINE_SIZE
to 2048"), powerpc bumped command line size to 2048 but the size
used here is still the default value of 512. Bump it to 2048 to
fix command line overflow errors observed when command line length
is above 512 bytes. Also, get rid of the multiple definitions of
COMMAND_LINE_SIZE macro in ppc architecture.
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Hang was observed, in purgatory, on a machine configured with
single LPAR. This was because one of the segments was loaded
outside the actual Real Memory Area (RMA) due to wrongly
deduced RMA top value.
Currently, top of real memory area, which is crucial for loading
kexec/kdump kernel, is obtained by iterating through mem nodes
and setting its value based on the base and size values of the
last mem node in the iteration. That can't always be correct as
the order of iteration may not be same and RMA base & size are
always based on the first memory property. Fix this by setting
RMA top value based on the base and size values of the memory
node that has the smallest base value (first memory property)
among all the memory nodes.
Also, correct the misnomers rmo_base and rmo_top to rma_base
and rma_top respectively.
While how RMA top is deduced was broken for sometime, the issue
may not have been seen so far, for couple of possible reasons:
1. Only one mem node was available.
2. First memory property has been the last node in
iteration when multiple mem nodes were present.
Fixes: 02f4088ffded ("kexec fix ppc64 device-tree mem node")
Reported-by: Ankit Kumar <ankit@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Geoff Levand <geoff@infradead.org>
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
This patch adds support to use binary image ie arch/arm64/boot/Image with
kdump.
Signed-off-by: Pratyush Anand <panand@redhat.com>
[takahiro.akashi@linaro.org: a bit reworked]
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Tested-by: David Woodhouse <dwmw@amazon.co.uk>
Tested-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
We pass the following properties to crash dump kernel:
linux,elfcorehdr: elf core header segment,
same as "elfcorehdr=" kernel parameter on other archs
linux,usable-memory-range: usable memory reserved for crash dump kernel
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Tested-by: David Woodhouse <dwmw@amazon.co.uk>
Tested-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
We make sure that all the other segments, initrd and device-tree blob,
also be loaded into the reserved memory of crash dump kernel.
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Tested-by: David Woodhouse <dwmw@amazon.co.uk>
Tested-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|