path: root/kexec/arch/i386
Age | Commit message | Author
2023-02-23 | x86: add devicetree support | Julian Winkler
Since the Linux kernel has dropped support for the Simple Firmware Interface (SFI), the only way to boot newer versions on the Intel MID platform is to use devicetree. Signed-off-by: Julian Winkler <julian.winkler1@web.de> Signed-off-by: Simon Horman <horms@kernel.org>
2022-07-19 | kexec-tools: Remove duplicate ultoa() definitions and redefine it | Tiezhu Yang
Many archs carry duplicate ultoa() definitions. Remove them, and redefine ultoa() in kexec/kexec.h to make it more readable. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Simon Horman <horms@kernel.org>
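As an illustration only, a single shared helper of this kind might look like the sketch below. The name `ultoa_sketch`, its return value and the buffer-size contract are assumptions made for the example; the actual definition in kexec/kexec.h may differ.

```c
#include <stddef.h>

/* Hypothetical shared helper: write the decimal form of val into str.
 * str must have room for at least 21 bytes (20 digits for a 64-bit
 * value plus the terminating NUL). Returns the number of digits. */
size_t ultoa_sketch(unsigned long val, char *str)
{
	char tmp[21];
	size_t n = 0, i;

	/* Produce the digits least-significant first. */
	do {
		tmp[n++] = (char)('0' + val % 10);
		val /= 10;
	} while (val != 0);

	/* Reverse them into the caller's buffer. */
	for (i = 0; i < n; i++)
		str[i] = tmp[n - 1 - i];
	str[n] = '\0';
	return n;
}
```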
2022-07-15 | i386: pass rng seed via setup_data | Jason A. Donenfeld
Linux ≥5.20 expects an RNG seed via setup_data as of the upstream commit in the link below. That commit adjusts kexec_file_load to pass SETUP_RNG_SEED. kexec-tools should follow suit, so add more or less the same code here. Link: https://git.kernel.org/tip/tip/c/68b8e9713c8 Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Simon Horman <horms@kernel.org>
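A minimal sketch of what passing a seed through setup_data involves, assuming the usual setup_data node layout (next pointer, type, length, payload). The struct name `setup_data_hdr` and the helper `build_rng_seed_node` are hypothetical, and the seed bytes are supplied by the caller; this is not the code from the commit.

```c
#include <stdint.h>
#include <string.h>

#define SETUP_RNG_SEED 9	/* value from the kernel's uapi bootparam.h */

/* Mirrors the header of the kernel's struct setup_data. */
struct setup_data_hdr {
	uint64_t next;	/* physical address of the next node, 0 if last */
	uint32_t type;
	uint32_t len;	/* length of the payload that follows */
} __attribute__((packed));

/* Hypothetical helper: build a SETUP_RNG_SEED node in buf and chain it
 * in front of the existing list. Returns the number of bytes written,
 * or 0 if buf is too small. */
size_t build_rng_seed_node(uint8_t *buf, size_t buf_len,
			   const uint8_t *seed, uint32_t seed_len,
			   uint64_t old_list_head)
{
	struct setup_data_hdr hdr;

	if (buf_len < sizeof(hdr) + seed_len)
		return 0;

	hdr.next = old_list_head;
	hdr.type = SETUP_RNG_SEED;
	hdr.len = seed_len;
	memcpy(buf, &hdr, sizeof(hdr));
	memcpy(buf + sizeof(hdr), seed, seed_len);
	return sizeof(hdr) + seed_len;
}
```

The caller would then load the node into the new kernel's memory and store its physical address in the setup_data field of the boot parameters.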
2021-10-20 | kexec-tools: multiboot2: Correct BASIC_MEMINFO memory units | Tu Dinh
mem_lower and mem_upper are measured in kilobytes. Signed-off-by: Simon Horman <horms@verge.net.au>
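For illustration, converting a byte count into the kilobyte units these fields expect is a shift by 10. The helper name `bytes_to_kib` is made up for this sketch.

```c
#include <stdint.h>

/* Convert a size in bytes to the kilobyte units used by the
 * mem_lower and mem_upper fields. */
uint32_t bytes_to_kib(uint64_t bytes)
{
	return (uint32_t)(bytes >> 10);
}
```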
2021-09-24 | i386: Remove unused local variable in get_kernel_page_offset() | Kai Song
In get_kernel_page_offset(), the local variable kv is unused, so remove it. Signed-off-by: Kai Song <songkai01@inspur.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-09-14 | multiboot2: Accept x86-64 images | Zhaofeng Li
Signed-off-by: Zhaofeng Li <hello@zhaofeng.li> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-09-14 | multiboot2: Avoid first 0x500 bytes | Zhaofeng Li
In some cases, add_buffer will actually try to allocate the buffer at 0x0, which may not be acceptable to some kernels. Let's avoid the first 0x500 bytes so we don't screw up the IVT and BDA. Signed-off-by: Zhaofeng Li <hello@zhaofeng.li> Signed-off-by: Simon Horman <horms@verge.net.au>
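The idea can be sketched as a clamp on the lowest address a buffer may be placed at. The constant and function names here are hypothetical; only the 0x500 boundary comes from the commit message.

```c
#include <stdint.h>

/* The real-mode IVT occupies 0x000-0x3FF and the BIOS data area
 * 0x400-0x4FF, so nothing should be placed below 0x500. */
#define LOW_MEM_RESERVED_END 0x500ULL

/* Raise a requested minimum address so it never falls inside the
 * reserved low region. */
uint64_t clamp_buffer_min(uint64_t requested_min)
{
	return requested_min < LOW_MEM_RESERVED_END ? LOW_MEM_RESERVED_END
						    : requested_min;
}
```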
2021-09-14 | multiboot2: Use rel_min and rel_max for buffer destinations | Zhaofeng Li
This would segfault if mhi.rel_tag didn't exist. Signed-off-by: Zhaofeng Li <hello@zhaofeng.li> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-09-14 | multiboot2: Correct MBI size calculation | Zhaofeng Li
tag_load_base_addr is dependent on rel_tag, and tag_framebuffer was not accounted for. Signed-off-by: Zhaofeng Li <hello@zhaofeng.li> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-09-14 | x86: Consolidate elf_x86_probe routines | Zhaofeng Li
Signed-off-by: Zhaofeng Li <hello@zhaofeng.li> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-04-17 | kexec: Remove the error prone kernel_version function | Eric W. Biederman
During kexec there are two kernel versions at play: the version of the running kernel and the version of the kernel that will be booted. On powerpc it appears people have been using the version of the running kernel to attempt to detect properties of the kernel to be booted, which is just wrong. As the Linux kernel version being detected is no longer supported, just remove that buggy and confused code. On x86_64 the kernel_version is used to compute the starting virtual address of the running kernel so a proper core dump may be generated. Using the kernel_version stopped working a while ago when the starting virtual address became randomized. The old code was kept for the case where the kernel was not built with randomization support, but nothing about reading /proc/kcore prevents it from detecting the starting virtual address even there. In fact /proc/kcore must have the starting virtual address or a debugger cannot make sense of the running kernel. So just make computing the starting virtual address on x86_64 unconditional, with a hard-coded fallback just in case something goes wrong. Doing something with kernel_version() has become important as recent stable kernels have seen the minor version go above 255. Just removing kernel_version() looks like the best option. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-04-07 | Shrink segments to fit alignment instead of throwing them away | Hongyan Xia
We risk throwing an entire large chunk away if it is just slightly unaligned which then causes the crash kernel to run out of RAM. Keep them and shrink them to alignment. Signed-off-by: Hongyan Xia <hongyxia@amazon.com> Signed-off-by: Simon Horman <horms@verge.net.au>
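A rough sketch of shrinking a range to an alignment rather than discarding it. It assumes a half-open range [start, end) and a power-of-two alignment; the function name is hypothetical and the real code may represent ranges differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Shrink [*start, *end) inward so both bounds are multiples of align
 * (a power of two). Returns false if nothing is left afterwards.
 * Assumes *start + align does not overflow. */
bool shrink_to_alignment(uint64_t *start, uint64_t *end, uint64_t align)
{
	uint64_t s = (*start + align - 1) & ~(align - 1);	/* round up */
	uint64_t e = *end & ~(align - 1);			/* round down */

	if (s >= e)
		return false;
	*start = s;
	*end = e;
	return true;
}
```

With a 1K alignment, a range starting at 0x100 loses only its first 0x300 bytes instead of being dropped whole.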
2021-04-07 | Fix where the real mode interrupt vector ends | Hongyan Xia
The real mode IVT ends at 0x400, not 0x100. The code intentionally excludes the IVT as RAM, so use the correct address. Also, 0x100 is not 1K aligned and will be rejected by add_memmap(). We have observed that after a multiboot2 kexec, the next kexec throws away such unaligned chunks, losing memory for the kernel after that. In some corner cases, this loss of memory can actually cause OOM during boot. Signed-off-by: Hongyan Xia <hongyxia@amazon.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-04-02 | crashdump/x86: increase CRASH_MAX_MEMORY_RANGES to 32k | David Hildenbrand
virtio-mem in Linux adds/removes individual memory blocks (e.g., 128 MB each). Linux merges adjacent memory blocks added by virtio-mem devices, but we can still end up with a very sparse memory layout when unplugging memory in corner cases. Let's increase the maximum number of crash memory ranges from ~2k to 32k. 32k should be sufficient for a very long time. e_phnum field in the header is 16 bits wide, so we can fit a maximum of ~64k entries in there, shared with other entries (i.e., CPU). Therefore, using up to 32k memory ranges is fine. (if we ever need more than ~64k, we can switch to the sh_info field) Move the temporary xen ranges off the stack, dynamically allocating memory for them. Note: We don't have to increase MAX_MEMORY_RANGES, because virtio-mem added memory is driver managed and always detected and added by a driver in the kexec'ed kernel; for ordinary kexec, we must not expose these ranges in the firmware-provided memmap. Cc: Simon Horman <horms@verge.net.au> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-04-02 | crashdump/x86: iterate only over actual crash memory ranges | David Hildenbrand
No need to iterate over empty entries. Cc: Simon Horman <horms@verge.net.au> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2021-04-02 | crashdump/x86: dump any kind of "System RAM" | David Hildenbrand
Traditionally, we had "System RAM" only on the top level of the kernel resource tree (-> /proc/iomem). Nowadays, we can also have "System RAM" on lower levels of the tree -- driver-managed device memory that is always detected and added via drivers. Current examples are memory added via dax/kmem -- ("System RAM (kmem)") and virtio-mem ("System RAM (virtio_mem)"). Note that in some kernel versions "System RAM (kmem)" was exposed as "System RAM", but similarly, on lower levels of the resource tree. Let's add anything that contains "System RAM" to the elf core header, so it will be dumped for kexec_load(). Handling kexec_file_load() in the kernel is similarly getting fixed [1]. Loading a kdump kernel via "kexec -p -c" ... will result in the kdump kernel also dumping dax/kmem and virtio-mem added System RAM now. Note: We only want to dump this memory, we don't want to add this memory to the memmap of an ordinary kexec'ed kernel ("fast system reboot"). [1] https://lkml.kernel.org/r/20210322160200.19633-1-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Simon Horman <horms@verge.net.au>
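One way to picture the "anything that contains System RAM" rule is a substring match on the name part of a /proc/iomem line, at any indentation level. The parser below is a standalone sketch with a hypothetical name, not the kexec-tools iomem callback.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Parse one /proc/iomem line of the form "start-end : name" and report
 * whether the name contains "System RAM". Leading indentation (lower
 * levels of the resource tree) is skipped. */
bool parse_system_ram(const char *line, unsigned long long *start,
		      unsigned long long *end)
{
	int consumed = 0;

	if (sscanf(line, " %llx-%llx : %n", start, end, &consumed) != 2 ||
	    consumed == 0)
		return false;

	/* Substring match, so "System RAM (kmem)" and
	 * "System RAM (virtio_mem)" qualify as well. */
	return strstr(line + consumed, "System RAM") != NULL;
}
```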
2021-04-02 | i386: fix build on pre 4.4 kernels | Federico Pellegrin
kexec build will fail on older kernels (pre 4.4) as the define VIDEO_CAPABILITY_64BIT_BASE was not present at that time. This patch adds it, as per linux/include/uapi/linux/screen_info.h, if not present. Signed-off-by: Federico Pellegrin <fede@evolware.org> Reviewed-by: Kairui Song <kasong@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-11-16 | i386: fix string formatting-related warnings | Ahelenia Ziemiańska
fixed the same way as in 70cca82 "kexec: Fix snprintf related compilation warnings" Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-11-16 | i386/kexec-mb2-x86.c: cast ints to uintptr_t before pointers to avoid warnings | Ahelenia Ziemiańska
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-09-29 | kexec-tools: Add some missing free() calls | Youling Tang
Add some missing free() calls. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-04-07 | kexec: Introduce --load-live-update for xen | Varad Gautam
Support loading a live update image for xen from kexec userspace. For a multiboot2 Elf on a xen setup, this will:
- load the Elf into KEXEC_RANGE_MA_XEN
- load purgatory and modules into KEXEC_RANGE_MA_LIVEUPDATE
- append the Elf cmdline with " liveupdate=<size>@<addr>"
v2: define xen related symbols outside of HAVE_LIBXENCTRL
Signed-off-by: Varad Gautam <vrd@amazon.de> Signed-off-by: Simon Horman <horms@verge.net.au>
2020-04-01 | kexec: support parsing the string "Reserved" to get the correct e820 reserved region | Lianbo Jiang
When loading the kernel and initramfs for kexec, kexec-tools gets the e820 reserved region from "/proc/iomem" in order to rebuild the e820 ranges for the kexec kernel, but the string may appear as "Reserved" in "/proc/iomem", which causes parsing to fail. For example:

    # cat /proc/iomem | grep -i reserved
    00000000-00000fff : Reserved
    7f338000-7f34dfff : Reserved
    7f3cd000-8fffffff : Reserved
    f17f0000-f17f1fff : Reserved
    fe000000-ffffffff : Reserved

Currently, kexec-tools cannot handle the above case because memcmp() is case sensitive when comparing the string. So fix this corner case and make sure that both "reserved" and "Reserved" in "/proc/iomem" are parsed appropriately. Signed-off-by: Lianbo Jiang <lijiang@redhat.com> Acked-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
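A small sketch of a case-insensitive comparison that accepts both spellings. It uses strncasecmp() from <strings.h>; whether the actual patch uses that call or compares against both strings is not stated in the message, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <strings.h>

/* True if an iomem entry name starts with "reserved" in any letter
 * case, so both "reserved" and "Reserved" match. */
bool is_reserved_entry(const char *name)
{
	return strncasecmp(name, "reserved", 8) == 0;
}
```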
2020-01-03 | kexec: build multiboot2 for i386 | Chris Packham
This addresses the following compilation issues when building for i386.

    kexec/arch/i386/kexec-x86.c:39:22: error: 'multiboot2_x86_probe' undeclared here (not in a function); did you mean 'multiboot_x86_probe'?
     { "multiboot2-x86", multiboot2_x86_probe, multiboot2_x86_load,
                         ^~~~~~~~~~~~~~~~~~~~
                         multiboot_x86_probe
    kexec/arch/i386/kexec-x86.c:39:44: error: 'multiboot2_x86_load' undeclared here (not in a function); did you mean 'multiboot_x86_load'?
     { "multiboot2-x86", multiboot2_x86_probe, multiboot2_x86_load,
                                               ^~~~~~~~~~~~~~~~~~~
                                               multiboot_x86_load
    kexec/arch/i386/kexec-x86.c:40:4: error: 'multiboot2_x86_usage' undeclared here (not in a function); did you mean 'multiboot_x86_usage'?
       multiboot2_x86_usage },
       ^~~~~~~~~~~~~~~~~~~~
       multiboot_x86_usage
    make: *** [Makefile:114: kexec/arch/i386/kexec-x86.o] Error 1
    make: *** Waiting for unfinished jobs....

Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-09-16 | i386/kexec-mb2-x86.c: Fix compilation warning | Bhupesh Sharma
This patch fixes the following compilation warning in 'i386/kexec-mb2-x86.c' regarding the variable 'result' which is set but not used:

    kexec/arch/i386/kexec-mb2-x86.c:402:6: warning: variable ‘result’ set but not used [-Wunused-but-set-variable]
      int result;
          ^~~~~~

Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-09-03 | x86: Fix PAGE_OFFSET for kernels since 4.20 | Donald Buczek
Linux kernel commit d52888aa2753 ("x86/mm: Move LDT remap out of KASLR region on 5-level paging") changed the base of the direct mapping from 0xffff880000000000 to 0xffff888000000000. This was merged into v4.20-rc2. Update to new address accordingly. Signed-off-by: Simon Horman <horms@verge.net.au>
2019-07-10 | x86: Include kexec-mb2-x86.c and multiboot2.h in distribution | Simon Horman
Fixes: 22a2ed55132e ("x86: Support multiboot2 images") Signed-off-by: Simon Horman <horms@verge.net.au>
2019-07-03 | x86: re-order includes to avoid duplicate struct e820entry | Simon Horman
xenctrl.h defines struct e820entry as:

    #if defined(__i386__) || defined(__x86_64__)
    ...
    #define E820_RAM 1
    ...
    struct e820entry {
        uint64_t addr;
        uint64_t size;
        uint32_t type;
    } __attribute__((packed));
    ...
    #endif

    $ dpkg-query -S /usr/include/xenctrl.h
    libxen-dev:amd64: /usr/include/xenctrl.h
    $ dpkg-query -W libxen-dev:amd64
    libxen-dev:amd64 4.8.5+shim4.10.2+xsa282-1+deb9u11

./include/x86/x86-linux.h defines struct e820entry as:

    #ifndef E820_RAM
    struct e820entry {
        uint64_t addr; /* start of memory segment */
        uint64_t size; /* size of memory segment */
        uint32_t type; /* type of memory segment */
    #define E820_RAM 1
    ...
    } __attribute__((packed));
    #endif

Since cedeee0a3007 ("x86: Introduce helpers for getting RSDP address") ./kexec/arch/i386/kexec-x86-common.c includes

    +#include "x86-linux-setup.h"
     #include "../../kexec-xen.h"

When xenctrl.h is present the above results in:

    $ gcc ...
    In file included from kexec/arch/i386/../../kexec-xen.h:5:0,
                     from kexec/arch/i386/kexec-x86-common.c:43:
    /usr/include/xenctrl.h:1271:8: error: redefinition of 'struct e820entry'
     struct e820entry {
            ^~~~~~~~~
    In file included from kexec/arch/i386/x86-linux-setup.h:3:0,
                     from kexec/arch/i386/kexec-x86-common.c:42:
    ./include/x86/x86-linux.h:16:8: note: originally defined here
     struct e820entry {
            ^~~~~~~~~
    ...
    $ gcc --version | head -1
    gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516

To mitigate this problem, re-order the includes so that x86-linux.h is included after xenctrl.h and thus struct e820entry will only be defined once, because it is defined conditionally in x86-linux.h. In practice the definitions are the same so it should not matter which is chosen. It also seems rather unpleasant to me to need to play with include ordering. Perhaps a better solution in the longer term would be to rename the local definition of struct e820entry. Fixes: cedeee0a3007 ("x86: Introduce helpers for getting RSDP address") Signed-off-by: Simon Horman <horms@verge.net.au>
2019-07-03 | x86: Support multiboot2 images | Varad Gautam
Add a new type `multiboot2-x86` that allows loading multiboot2 [1] images within the relocation range specified in the image header. The image is always placed at the lowest available address, regardless of the preference information. [1] https://www.gnu.org/software/grub/manual/multiboot2/multiboot.html Signed-off-by: Varad Gautam <vrd@amazon.de> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-05-31 | crashdump/x86: Use new introduce helper for getting RSDP | Kairui Song
Use the newly introduced helper for getting the RSDP. This ensures the RSDP is always accessible and avoids code duplication. Signed-off-by: Kairui Song <kasong@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-05-31 | x86: Always try to fill acpi_rsdp_addr in boot params | Kairui Song
Since kernel commit e6e094e053af75 ("x86/acpi, x86/boot: Take RSDP address from boot params if available"), the kernel accepts an acpi_rsdp_addr param in boot_params. So fill in this parameter unconditionally, to ensure the second kernel always gets the right RSDP address and boots well on EFI systems even with EFI services disabled. Users no longer need to change the kernel cmdline to work around the missing RSDP issue. For older kernels (before 5.0), there is no change of behavior. Signed-off-by: Kairui Song <kasong@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-05-31 | x86: Introduce helpers for getting RSDP address | Kairui Song
On x86, the RSDP is fundamental for booting the machine. When the second kernel is incapable of parsing the RSDP address (e.g. kexec of the next kernel on an EFI system with EFI services disabled), kexec should prepare the RSDP address for the second kernel. Introduce helpers for getting the RSDP from multiple sources, including boot params and EFI firmware. For the legacy BIOS interface, there is no better way to find the RSDP address than scanning the memory region and searching for it, and the kernel always does this as a fallback, so there is no need to try to get the RSDP address for that case. Signed-off-by: Kairui Song <kasong@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-05-15 | x86: Find mounts by FS type, not name | Niklas Hambüchen
The name in mount invocations like `mount -t debugfs debugfs /sys/kernel/debug` is nothing but convention and cannot be relied upon. For example, https://www.kernel.org/doc/Documentation/filesystems/debugfs.txt recommends making the name "none" instead: `mount -t debugfs none /sys/kernel/debug`, and many existing systems use mounts named "none" or otherwise. Using `mnt_type` instead of `mnt_fsname` allows kexec to work on such systems. This fixes another instance of `poweroff` not working on kexec'ed kernels because the lack of correctly matched mount results in EFI variables not being read and propagated. Signed-off-by: Niklas Hambüchen <mail@nh2.me> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-05-15 | x86: Check /proc/mounts before mtab for mounts | Niklas Hambüchen
In many situations, especially on read-only file systems and initial ramdisks (initramfs/initrd), /etc/mtab does not exist. Before this commit, kexec would fail to read mounts on such systems in `find_mnt_by_fsname()`, such that `get_bootparam()` would not find `boot_params/data`, which would then lead to e.g. `setup_efi_data()` not being called in `setup_efi_info()`. As a result, kexec'ed kernels would not obtain EFI data and subsequently lack an `ACPI RSDP` entry, emitting: ACPI BIOS Error (bug): A valid RSDP was not found (20180810/tbxfroot-210) and thus fail to turn off the machine on poweroff, instead printing only: reboot: System halted. Previously this problem had to be worked around by passing `acpi_rsdp=` manually. This commit obviates the workaround. See also: * https://github.com/coreos/bugs/issues/167#issuecomment-487320879 * http://lists.infradead.org/pipermail/kexec/2012-October/006924.html Signed-off-by: Niklas Hambüchen <mail@nh2.me> Signed-off-by: Simon Horman <horms@verge.net.au>
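The two mount-lookup changes (match on file system type rather than name, and try /proc/mounts before /etc/mtab) can be sketched with the standard getmntent() interface. The function names below are hypothetical and the code is an illustration, not the kexec-tools implementation.

```c
#include <mntent.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Find the first mount of file system type fstype in one mount table
 * and copy its mount point into dir. */
bool find_in_table(const char *table, const char *fstype,
		   char *dir, size_t dir_len)
{
	FILE *fp = setmntent(table, "r");
	struct mntent *mnt;
	bool found = false;

	if (!fp)
		return false;
	while ((mnt = getmntent(fp)) != NULL) {
		/* Compare mnt_type, not mnt_fsname: the name is only a
		 * convention and is often "none". */
		if (strcmp(mnt->mnt_type, fstype) == 0) {
			snprintf(dir, dir_len, "%s", mnt->mnt_dir);
			found = true;
			break;
		}
	}
	endmntent(fp);
	return found;
}

/* Prefer /proc/mounts, which exists even where /etc/mtab does not. */
bool find_mount_by_type(const char *fstype, char *dir, size_t dir_len)
{
	return find_in_table("/proc/mounts", fstype, dir, dir_len) ||
	       find_in_table("/etc/mtab", fstype, dir, dir_len);
}
```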
2019-03-06 | x86: Introduce a new option --reuse-video-type | Kairui Song
After commit 060eee58 "x86: use old screen_info if needed", kexec-tools forces use of the old screen_info and VGA type if it fails to determine the current VGA type. But that is not always a good idea. Kernel hangs have been observed on some Hyper-V VMs after this commit, because hyperv_fb mimics EFI (or VESA) VGA on first boot, but once the real driver is loaded it switches to a new mode that is no longer compatible with EFI/VESA VGA. Continuing to set orig_video_isVGA to the EFI/VESA VGA flag gets the wrong driver loaded, which then manipulates the framebuffer incorrectly. We can't ensure this won't happen with other framebuffer drivers, but it's a helpful feature when the framebuffer drivers just work. So this patch introduces a --reuse-video-type option to let the user decide whether the old screen_info should be used unconditionally. Signed-off-by: Kairui Song <kasong@redhat.com> Reviewed-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-01-28 | x86: Handle 64bit framebuffer memory address properly | Kairui Song
On an EFI system, the frame buffer address is 64-bit, so if the address is beyond 4G, kexec currently sets a wrong address due to truncation. Linux kernel commit ae2ee627dc87 ('efifb: Add support for 64-bit frame buffer addresses') added support for 64-bit frame buffer addresses: an 'ext_lfb_base' field holds the upper 32 bits of the frame buffer address, and a new capability flag 'VIDEO_CAPABILITY_64BIT_BASE' indicates whether the extended field is used. This patch adopts the change, setting the proper extended address and capability flag when the address is beyond 4G. Signed-off-by: Kairui Song <kasong@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
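A sketch of splitting a 64-bit frame buffer address across the two fields. The struct `fb_fields` is a cut-down stand-in for the kernel's screen_info, and `set_fb_base` is a hypothetical name; the fallback define mirrors the VIDEO_CAPABILITY_64BIT_BASE value in linux/include/uapi/linux/screen_info.h.

```c
#include <stdint.h>

#ifndef VIDEO_CAPABILITY_64BIT_BASE
#define VIDEO_CAPABILITY_64BIT_BASE (1 << 1)	/* frame buffer base is 64-bit */
#endif

/* Only the screen_info fields this sketch touches. */
struct fb_fields {
	uint32_t lfb_base;	/* lower 32 bits of the address */
	uint32_t ext_lfb_base;	/* upper 32 bits of the address */
	uint32_t capabilities;
};

/* Store a frame buffer address without truncating it, and flag the
 * extended field only when it is actually needed. */
void set_fb_base(struct fb_fields *fb, uint64_t base)
{
	fb->lfb_base = (uint32_t)base;
	fb->ext_lfb_base = (uint32_t)(base >> 32);
	if (fb->ext_lfb_base)
		fb->capabilities |= VIDEO_CAPABILITY_64BIT_BASE;
}
```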
2019-01-28 | multiboot-x86: pass framebuffer information when requested | Friedemann Gerold
When the kernel requests video information, pass it the framebuffer information in the multiboot header from the Linux framebuffer ioctls. With the arch-specific --reset-vga or --console-vga options, purgatory will reset the framebuffer, so pass information for standard EGA text mode. Signed-off-by: Friedemann Gerold <cinap_lenrek@felloff.net> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-01-28 | multiboot-x86: pass ACPI reserved memory information in memory map | Friedemann Gerold
Use the appropriate types for ACPI reclaim and ACPI NVS ranges in the multiboot memory map. This allows the kernel to locate ACPI tables on UEFI systems without having an explicit pointer to the RSD. Signed-off-by: Friedemann Gerold <cinap_lenrek@felloff.net> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-01-28 | multiboot-x86: support for non-elf kernels | Friedemann Gerold
Add support for non-elf multiboot kernels (such as Plan 9) by handling the MULTIBOOT_AOUT_KLUDGE bit. When the bit is clear, we are dealing with an ELF file and probe for ELF as before with elf_x86_probe(). When the bit is set, load_addr, load_end_addr, header_addr and entry_addr from the multiboot header are used to load the memory image. Signed-off-by: Friedemann Gerold <cinap_lenrek@felloff.net> Signed-off-by: Simon Horman <horms@verge.net.au>
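The decision described here can be sketched as a flag test on the multiboot header, plus the offset rule from the Multiboot specification for locating the start of the load image in the file. The struct and function names are assumptions made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define MULTIBOOT_AOUT_KLUDGE 0x00010000u	/* bit 16 of the header flags */

/* Multiboot (version 1) header fields used by this sketch. */
struct mb_header {
	uint32_t magic, flags, checksum;
	uint32_t header_addr, load_addr, load_end_addr;
	uint32_t bss_end_addr, entry_addr;
};

/* True if the address fields in the header describe the image layout;
 * false means the image should be probed as ELF instead. */
bool uses_header_addresses(const struct mb_header *h)
{
	return (h->flags & MULTIBOOT_AOUT_KLUDGE) != 0;
}

/* File offset at which loading starts: the offset at which the header
 * was found, minus (header_addr - load_addr). */
uint32_t load_file_offset(const struct mb_header *h,
			  uint32_t header_file_offset)
{
	return header_file_offset - (h->header_addr - h->load_addr);
}
```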
2018-10-29 | x86: fix BAD_FREE in get_efi_runtime_map() | Pingfan Liu
If the err_out label is reached, address of a stack variable is passed to free(). Fix it. Signed-off-by: Pingfan Liu <piliu@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-10-02 | kdump: fix an error that can not parse the e820 reserved region | Lianbo Jiang
When kexec-tools loads the kernel and initramfs for kdump, it reads /proc/iomem and recreates the e820 ranges for the kdump kernel. But it fails to parse the e820 reserved region, because memcmp() is case sensitive when comparing the string. In fact, the entry may be "Reserved" or "reserved" in /proc/iomem, so handle both cases. Signed-off-by: Lianbo Jiang <lijiang@redhat.com> Reviewed-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-01-29 | x86: use old screen_info if needed | Dave Young
With modern drm/kms graphics drivers, kexec-tools does not set up screen_info correctly, so one only sees screen output after those drm drivers reinitialize following the reboot. Copying the old screen info from the original boot_params helped in my tests; it may not work in some cases, but it is no worse than before. The same approach is used by the kernel's kexec_file_load. Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-01-24 | kexec-tools: Perform run-time linking of libxenctrl.so | Eric DeVolder
When kexec is utilized in a Xen environment, it has an explicit run-time dependency on libxenctrl.so. This dependency occurs during the configure stage and when building kexec-tools. When kexec is utilized in a non-Xen environment (either bare metal or KVM), the configure and build of kexec-tools omits any reference to libxenctrl.so. Thus today it is not currently possible to configure and build a *single* kexec that will work in *both* Xen and non-Xen environments, unless the libxenctrl.so is *always* present. For example, a kexec configured for Xen in a Xen environment: # ldd build/sbin/kexec linux-vdso.so.1 => (0x00007ffdeba5c000) libxenctrl.so.4.4 => /usr/lib64/libxenctrl.so.4.4 (0x00000038d8000000) libz.so.1 => /lib64/libz.so.1 (0x00000038d6c00000) libc.so.6 => /lib64/libc.so.6 (0x00000038d6000000) libdl.so.2 => /lib64/libdl.so.2 (0x00000038d6400000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00000038d6800000) /lib64/ld-linux-x86-64.so.2 (0x000055e9f8c6c000) # build/sbin/kexec -v kexec-tools 2.0.16 However, the *same* kexec executable fails in a non-Xen environment: # copy xen kexec to . # ldd ./kexec linux-vdso.so.1 => (0x00007fffa9da7000) libxenctrl.so.4.4 => not found liblzma.so.0 => /usr/lib64/liblzma.so.0 (0x0000003014e00000) libz.so.1 => /lib64/libz.so.1 (0x000000300ea00000) libc.so.6 => /lib64/libc.so.6 (0x000000300de00000) libpthread.so.0 => /lib64/libpthread.so.0 (0x000000300e200000) /lib64/ld-linux-x86-64.so.2 (0x0000558cc786c000) # ./kexec -v ./kexec: error while loading shared libraries: libxenctrl.so.4.4: cannot open shared object file: No such file or directory At Oracle we "workaround" this by having two kexec-tools packages, one for Xen and another for non-Xen environments. At Oracle, the desire is to offer a single kexec-tools package that works in either environment. 
To achieve this, kexec-tools would either have to ship with libxenctrl.so (which we have deemed as unacceptable), or we can make kexec perform run-time linking against libxenctrl.so. This patch is one possible way to alleviate the explicit run-time dependency on libxenctrl.so. This implementation utilizes a set of macros to wrap calls into libxenctrl.so so that the library can instead be dlopen() and obtain the function via dlsym() and then make the call. The advantage of this implementation is that it requires few changes to the existing kexec-tools code. The dis- advantage is that it uses macros to remap libxenctrl functions and do work under the hood. Another possible implementation worth considering is the approach taken by libvmi. Reference the following file: https://github.com/libvmi/libvmi/blob/master/libvmi/driver/xen/libxc_wrapper.h The libxc_wrapper_t structure definition that starts at line ~33 has members that are function pointers into libxenctrl.so. This structure is populated once and then later referenced/dereferenced by the callers of libxenctrl.so members. The advantage of this implementation is it is more explicit in managing the use of libxenctrl.so and its versions, but the disadvantage is it would require touching more of the kexec-tools code. The following is a list libxenctrl members utilized by kexec: Functions: xc_interface_open xc_kexec_get_range xc_interface_close xc_kexec_get_range xc_interface_open xc_get_max_cpus xc_kexec_get_range xc_version xc_kexec_exec xc_kexec_status xc_kexec_unload xc_hypercall_buffer_array_create xc__hypercall_buffer_array_alloc xc_hypercall_buffer_array_destroy xc_kexec_load xc_get_machine_memory_map Data: xc__hypercall_buffer_HYPERCALL_BUFFER_NULL These were identified by configuring and building kexec-tools with Xen support, but omitting the -lxenctrl from the LDFLAGS in the Makefile for an x86_64 build. The above libxenctrl members were referenced via these source files. 
kexec/crashdump-xen.c kexec/kexec-xen.c kexec/arch/i386/kexec-x86-common.c kexec/arch/i386/crashdump-x86.c This patch provides a wrapper around the calls to the above functions in libxenctrl.so. Every libxenctrl call must pass a xc_interface which it obtains from xc_interface_open(). So the existing code is already structured in a manner that facilitates graceful dlopen()'ing of the libxenctrl.so and the subsequent dlsym() of the required member. The patch creates a wrapper function around xc_interface_open() and xc_interface_close() to perform the dlopen() and dlclose(). For the remaining xc_ functions, this patch defines a macro of the same name which performs the dlsym() and then invokes the function. See the __xc_call() macro for details. There was one data item in libxenctrl.so that presented a unique problem, HYPERCALL_BUFFER_NULL. It was only utilized once, as set_xen_guest_handle(xen_segs[s].buf.h, HYPERCALL_BUFFER_NULL); I tried a variety of techniques but could not find a general macro-type solution without modifying xenctrl.h. So the solution was to declare a local HYPERCALL_BUFFER_NULL, and this appears to work. I admit I am not familiar with libxenctrl to state if this is a satisfactory workaround, so feedback here welcome. I can state that this allows kexec to load/unload/kexec on Xen and non-Xen environments that I've tested without issue. With this patch applied, kexec-tools can be built with Xen support and yet there is no explicit run-time dependency on libxenctrl.so. Thus it can also be deployed in non-Xen environments where libxenctrl.so is not installed. 
# ldd build/sbin/kexec linux-vdso.so.1 => (0x00007fff7dbcd000) liblzma.so.0 => /usr/lib64/liblzma.so.0 (0x00000038d9000000) libz.so.1 => /lib64/libz.so.1 (0x00000038d6c00000) libdl.so.2 => /lib64/libdl.so.2 (0x00000038d6400000) libc.so.6 => /lib64/libc.so.6 (0x00000038d6000000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00000038d6800000) /lib64/ld-linux-x86-64.so.2 (0x0000562dc0c14000) # build/sbin/kexec -v kexec-tools 2.0.16 This feature/ability is enabled with the following: ./configure --with-xen=dl The previous --with-xen=no and --with-xen=yes still work as before. Not specifying a --with-xen still defaults to --with-xen=yes. As I've introduced a new build and run-time mode, I've done an extensive matrix of both build-time and run-time checks of kexec with this patch applied. The set of build-time scenarios are: 1: configure --with-xen=no and Xen support NOT present 2: configure --with-xen=no and Xen support IS present 3: configure --with-xen=yes and Xen support NOT present 4: configure --with-xen=yes and Xen support IS present 5: configure --with-xen=dl and Xen support NOT present 6: configure --with-xen=dl and Xen support IS present Xen support present requires that configure can find both xenctrl.h and libxenctrl.so. Then for each of the six scenarios above, the corresponding kexec binary was tested on a Xen system (Oracle's OVS dom0) and a non-Xen system (Oracle Linux). There are two build-time checks: did kexec build, and did it contain libxenctrl.so? The presence of libxenctrl.so in kexec was checked via ldd. The results were: Scenario | Build | libxenctrl.so | Result 1 | pass | no | pass - see Note 1 2 | pass | no | pass - see Note 1 3 | pass | no | pass - see Note 2 4 | pass | yes | pass - see Note 3 5 | pass | no | pass - see Note 2 6 | pass | no | pass - see Note 4 Note 1: This passes since due to --with-xen=no, there will be no Xen support in kexec and therefore no libxenctrl.so a in the kexec. 
Note 2: This passes since while --with-xen=yes, the configure displays a message indicating that Xen support is disabled, and allows kexec to build (this is the same behavior as prior to this patch). And since Xen support is disabled, there is no libxenctrl.so in the kexec. Note 3: This passes since with --with-xen=yes and configure locating the xenctrl.h and libxenctrl.so, support for Xen was built into kexec. Ldd shows an explicit dependency on the library. Note 4: This passes since with --with-xen=dl and configure locating the xenctrl.h and libxencrl.so, support for Xen was built into kexec. However, this uses the new technique introduced by this patch and, as a result, ldd shows that the libxenctrl.so is not a explicit run-time dependency for kexec (rather libdl.so is now an explicit dependency). This is precisely the goal of this patch! The net effect is that there are now three "flavors" of a kexec binary (prior to this patch there were two): a) kexec with no support for Xen [scenarios 1, 2, 3, 5], b) kexec with support for Xen and libxenctrl.so as an explicit dependency [scenario 4], and c) kexec with support for Xen and libxenctrl.so is NOT an explicit dependency [scenario 6]. The run-time checks are to take each of the six scenarios above and run the corresponding kexec binary on both a Xen system and a non-Xen system. 
The test for each kexec scenario was: % service kdump stop % vi /etc/init.d/kdump change KEXEC= to /sbin/kexec-[123456] % service kdump start # If not FAILED, then below % service kdump status Kdump is operational % rm -fr /var/crash/* % echo c > /proc/sysrq-trigger # after reboot verify vmcore generated % ls -al /var/crash/<tab> The results were: Scenario | Xen environment | non-Xen environment 1 | fail - see Note 5 | pass 2 | fail - see Note 5 | pass 3 | fail - see Note 6 | pass 4 | pass | fail - see Note 7 5 | fail - see Note 6 | pass 6 | pass | pass Note 5: Due to --with-xen=no, kexec lacks support for Xen and will fail in the Xen environment. This behavior is the same as prior to this patch. Note 6: Due to the missing xenctrl.h and libxenctrl.so, kexec was built without support for Xen, and thus will fail in the Xen environment. This behavior is the same as prior to this patch. Note 7: This kexec has the explicit dependency on libxenctrl.so which prevents it from running in a non-Xen environment. This is expected as this is the original issue for which this patch is intended to address. Note that for scenarios 1, 2, 3 and 5 kexec lacks support for Xen, thus these versions are expected to "fail" in a Xen environment. On the flip side, since a non-Xen environment does not need libxenctrl.so, all but scenario 4 are expected to "pass" in a non-Xen environment. The results match these expectations! And, of course, importantly with this patch applied, it did not have an adverse impact on kexec build or run-time. Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com> Signed-off-by: Simon Horman <horms@verge.net.au>
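The run-time linking approach described in this commit (open the library lazily, look each symbol up with dlsym(), fail gracefully when the library is absent) can be illustrated with a generic sketch. The `lazy_lib_*` names are hypothetical and this is not the `__xc_call()` macro from the patch.

```c
#include <dlfcn.h>
#include <stddef.h>

static void *lib_handle;	/* shared handle, NULL until opened */

/* Open the library on first use. Returns 0 on success, -1 if the
 * library is not installed on this system. */
int lazy_lib_open(const char *soname)
{
	if (!lib_handle)
		lib_handle = dlopen(soname, RTLD_NOW | RTLD_GLOBAL);
	return lib_handle ? 0 : -1;
}

/* Look up one symbol; NULL if the library is not open or the symbol
 * is missing. The caller casts the result to the right function type. */
void *lazy_lib_sym(const char *name)
{
	return lib_handle ? dlsym(lib_handle, name) : NULL;
}

/* Drop the handle again, mirroring the close wrapper. */
void lazy_lib_close(void)
{
	if (lib_handle) {
		dlclose(lib_handle);
		lib_handle = NULL;
	}
}
```

A binary built this way depends on the dynamic loader rather than on the wrapped library, so it still starts on systems where the library is missing. On older glibc versions it may need to be linked with -ldl.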
2017-05-22kexec: generalize and rename get_kernel_stext_sym()Pratyush Anand
get_kernel_stext_sym() has been defined for both arm and i386. Other architectures might need some other kernel symbol address. Therefore, rewrite this function as a generic function to get any kernel symbol address. Moreover, kallsyms is not an arch-specific representation, so have a common function for all arches. Signed-off-by: Pratyush Anand <panand@redhat.com> [created symbols.c] Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Tested-by: David Woodhouse <dwmw@amazon.co.uk> Tested-by: Pratyush Anand <panand@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
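As a rough illustration of a generic symbol lookup, the sketch below matches one line in the kallsyms text layout ("<hex address> <type> <name>") against a requested symbol name. The helper name match_kallsyms_line() and its signature are hypothetical and are not taken from symbols.c; this is a minimal sketch, not the function the commit adds.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not the kexec-tools API): parse one line in the
 * "<hex address> <type> <name>" layout used by /proc/kallsyms and report
 * whether it describes the symbol called "name".
 *
 * Returns 1 and stores the address in *addr on a match, 0 otherwise.
 */
static int match_kallsyms_line(const char *line, const char *name,
			       unsigned long long *addr)
{
	char sym[128], type;
	unsigned long long val;

	assert(line != NULL && name != NULL && addr != NULL);

	/* A malformed line simply does not match. */
	if (sscanf(line, "%llx %c %127s", &val, &type, sym) != 3)
		return 0;
	if (strcmp(sym, name) != 0)
		return 0;

	*addr = val;
	return 1;
}
```

A caller would typically read /proc/kallsyms line by line with fgets() and stop at the first line for which the helper returns 1. Because the symbol name is a parameter, the same loop can serve any architecture.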
2017-03-24x86: Support large number of memory rangesXunlei Pang
We hit a problem on an SGI 64TB machine: the current kexec-tools failed to work because the number of memory ranges allowed (MAX_MEMORY_RANGES), which is defined as 1024, is less than the number of ranges on the machine. The kcore header is insufficient for the same reason. To solve this, this patch simply doubles "MAX_MEMORY_RANGES" and "KCORE_ELF_HEADERS_SIZE". Signed-off-by: Xunlei Pang <xlpang@redhat.com> Tested-by: Frank Ramsay <frank.ramsay@hpe.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2017-03-14x86/x86_64: Fix format warning with die()Pratyush Anand
Fedora koji uses gcc version 7.0.1-0.12.fc27, and it generates a build warning that is treated as an error:
kexec/arch/i386/kexec-elf-x86.c:299:3: error: format not a string literal and no format arguments [-Werror=format-security]
   die(error_msg);
   ^~~
cc1: some warnings being treated as errors
error_msg can itself contain a format specifier. In that case, since no further arguments are supplied for the format, the code will try to access a non-existent argument. Therefore, use the first argument as the format specifier and pass error_msg as the string to be printed. While doing that, also add the const qualifier to "char *error_msg". Signed-off-by: Pratyush Anand <panand@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
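The pattern behind this fix can be sketched as follows. Here report() is a hypothetical printf-style stand-in for die(); it formats into a caller-supplied buffer so the result can be inspected, which is an assumption made purely for illustration and is not how die() behaves.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for a printf-style reporter such as die().
 * It writes into buf instead of exiting so the output can be checked.
 */
static int report(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(buf != NULL && fmt != NULL);
	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return n;
}

/*
 * Safe pattern: the message is passed as data for "%s" and is never used
 * as the format string, so any '%' it contains is copied verbatim.
 */
static int report_message(char *buf, size_t len, const char *error_msg)
{
	return report(buf, len, "%s", error_msg);
}
```

With this shape, a message such as "bad header: 100%s done" is copied through unchanged, whereas passing it directly as the format would make the formatter look for a string argument that was never supplied.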
2017-03-14Don't use %L width specifier with integer valuesPhilip Prindeville
MUSL doesn't support %L except for floating-point arguments; therefore, %ll must be used instead with integer arguments. Signed-off-by: Philip Prindeville <philipp@redfish-solutions.com> Signed-off-by: Simon Horman <horms@verge.net.au>
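A minimal sketch of the portable form: in standard C the "L" length modifier is defined only for long double conversions, so a 64-bit integer is printed with "ll" and an unsigned long long argument. The helper name format_addr() is hypothetical and does not appear in kexec-tools.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a 64-bit address as zero-padded hex.
 * "ll" is the length modifier for (unsigned) long long; "L" applies
 * only to long double, which is why "%Lx" is not portable.
 */
static int format_addr(char *buf, size_t len, unsigned long long addr)
{
	assert(buf != NULL);
	return snprintf(buf, len, "0x%016llx", addr);
}
```

When the value has a fixed-width type such as uint64_t, casting it to unsigned long long at the call site keeps the argument type and the "%llx" conversion in agreement on every platform.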
2017-03-10Only print debug message when failed to search for kernel symbol from ↵Baoquan He
/proc/kallsyms The kernel symbol page_offset_base can be unavailable when the mm KASLR code is not compiled into the kernel, so it is inappropriate to print an error message when the search for page_offset_base in /proc/kallsyms fails. There currently seems to be no way to find out whether mm KASLR is compiled in. An alternative approach is to print only a debug message in get_kernel_sym when the search for an expected kernel symbol fails. Do that in this patch, as a simple fix. Signed-off-by: Baoquan He <bhe@redhat.com> Reviewed-by: Pratyush Anand <panand@redhat.com> Acked-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2016-12-08kexec-tools/x86: get_kernel_vaddr_and_size off-by-one fixDave Young
I got the error below while testing kexec -p: "Can't find kernel text map area from kcore" In this case the pt_load start address was the same as stext_sym. The check should really be saddr <= stext_sym so that the pt_load area that includes stext_sym is matched. This was not reported previously because the code falls back to using the hardcoded X86_64__START_KERNEL_map to match the pt_load areas again later, which sometimes succeeds because of kernel address randomization. With this change, according to my tests, the stext_sym check is guaranteed to fall into the right pt_load area if we get a correct stext_sym. Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
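The boundary condition can be sketched as a small range test. The helper name range_contains() is hypothetical, and treating the end address as exclusive is an assumption made for this illustration; the commit message itself states only the lower-bound change to saddr <= stext_sym.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: does the load segment [saddr, eaddr) contain sym?
 * The lower bound is inclusive, so a symbol located exactly at the start
 * of the segment matches. A strict "saddr < sym" comparison would miss
 * that case, which is the off-by-one described above.
 * The exclusive upper bound is an assumption of this sketch.
 */
static bool range_contains(unsigned long long saddr,
			   unsigned long long eaddr,
			   unsigned long long sym)
{
	assert(saddr <= eaddr);
	return saddr <= sym && sym < eaddr;
}
```

For a segment starting at 0x1000 and ending at 0x2000, the helper accepts a symbol at 0x1000 and rejects one at 0x2000 or at 0xfff.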
2016-10-07multiboot: Use the "reserved" type for non-ram zonesSylvain Munaut
It seems that Xen actually checks that some zones are marked 'reserved' and complains if they are not. This also matches what the BIOS uses at boot. Signed-off-by: Sylvain Munaut <s.munaut@whatever-company.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2016-10-07multiboot: Fix length computation for the memory zonesSylvain Munaut
Signed-off-by: Sylvain Munaut <s.munaut@whatever-company.com> Signed-off-by: Simon Horman <horms@verge.net.au>