path: root/kexec/arch
Age    Commit message    Author
2019-07-16kexec/arm64: Add support for handling zlib compressed (Image.gz) imageBhupesh Sharma
Currently the kexec_file_load() support for arm64 doesn't allow handling zlib compressed (i.e. Image.gz) image. Since most distributions use 'make zinstall' rule inside 'arch/arm64/boot/Makefile' to install the arm64 Image.gz compressed file inside the boot destination directory (for e.g. /boot), currently we cannot use kexec_file_load() to load vmlinuz (or Image.gz): # file /boot/vmlinuz /boot/vmlinuz: gzip compressed data, was "Image", <..snip..>, max compression, from Unix, original size 21945120 Now, since via kexec_file_load() we pass the 'fd' of Image.gz (compressed file) via the following command line ... # kexec -s -l /boot/vmlinuz-`uname -r` --initrd=/boot/initramfs-`uname -r`.img --reuse-cmdline ... kernel returns -EINVAL error value, as it is not able to locate the magic number =0x644d5241, which is expected in the 64-byte header of the decompressed kernel image. We can fix this in user-space kexec-tools, which handles an 'Image.gz' being passed via kexec_file_load(), using an approach as follows: a). Copy the contents of Image.gz to a temporary file. b). Decompress (gunzip-decompress) the contents inside the temporary file. c). Pass the 'fd' of the temporary file to the kernel space. So basically the kernel space still gets a decompressed kernel image to load via kexec-tools I tested this patch for the following three use-cases: 1. Uncompressed Image file: #kexec -s -l Image --initrd=/boot/initramfs-`uname -r`.img --reuse-cmdline 2. Signed Image file: #kexec -s -l Image.signed --initrd=/boot/initramfs-`uname -r`.img --reuse-cmdline 3. zlib compressed Image.gz file: #kexec -s -l /boot/vmlinuz-`uname -r` --initrd=/boot/initramfs-`uname -r`.img --reuse-cmdline Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
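The failure mode described above comes down to which bytes sit at the front of the file. The following is a minimal sketch, not the code from this patch: the helper names are hypothetical, the magic value 0x644d5241 is taken from the commit message, and the magic offset of 56 within the 64-byte header is assumed from the kernel's arm64 booting documentation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARM64_IMAGE_MAGIC   0x644d5241u /* "ARM\x64", value quoted in the commit message */
#define ARM64_HEADER_SIZE   64          /* header size quoted in the commit message */
#define ARM64_MAGIC_OFFSET  56          /* assumption: per the arm64 booting documentation */

/* Hypothetical helper: return 1 if buf starts with the gzip signature 0x1f 0x8b. */
static int looks_like_gzip(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    return len >= 2 && buf[0] == 0x1f && buf[1] == 0x8b;
}

/* Hypothetical helper: return 1 if buf holds a full header with the
 * little-endian arm64 Image magic at the assumed offset. */
static int has_arm64_image_magic(const unsigned char *buf, size_t len)
{
    uint32_t magic;

    assert(buf != NULL);
    if (len < ARM64_HEADER_SIZE)
        return 0;
    magic = (uint32_t)buf[ARM64_MAGIC_OFFSET]
          | (uint32_t)buf[ARM64_MAGIC_OFFSET + 1] << 8
          | (uint32_t)buf[ARM64_MAGIC_OFFSET + 2] << 16
          | (uint32_t)buf[ARM64_MAGIC_OFFSET + 3] << 24;
    return magic == ARM64_IMAGE_MAGIC;
}
```

A loader could use a check like `looks_like_gzip()` to decide whether to decompress into a temporary file before handing a descriptor to kexec_file_load(); the decompression itself is left out here.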
2019-07-16kexec-uImage-arm64.c: Fix return value of uImage_arm64_probe()Bhupesh Sharma
Commit bf06cf2095e1 ("kexec/uImage: probe to identify a corrupted image") defined the return values of 'uImage_probe_kernel()', and 'uImage_arm64_probe()' correspondingly returns the same values (0 -> the image is a valid 'type' image, -1 -> the image is corrupted, 1 -> the image is not a uImage). This causes issues because later patches introduce zImage support for arm64, and since zImage is probed after uImage, the return values from 'uImage_arm64_probe()' need to be fixed to make sure that kexec will not return with an invalid error code. Now, 'uImage_arm64_probe()' returns the following values instead: 0 - valid uImage. -1 - uImage is corrupted. 1 - image is not a uImage. Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-07-10x86: Include kexec-mb2-x86.c and multiboot2.h in distributionSimon Horman
Fixes: 22a2ed55132e ("x86: Support multiboot2 images") Signed-off-by: Simon Horman <horms@verge.net.au>
2019-07-03x86: re-order includes to avoid duplicate struct e820entrySimon Horman
xenctrl.h defines struct e820entry as:

    #if defined(__i386__) || defined(__x86_64__)
    ...
    #define E820_RAM 1
    ...
    struct e820entry {
        uint64_t addr;
        uint64_t size;
        uint32_t type;
    } __attribute__((packed));
    ...
    #endif

    $ dpkg-query -S /usr/include/xenctrl.h
    libxen-dev:amd64: /usr/include/xenctrl.h
    $ dpkg-query -W libxen-dev:amd64
    libxen-dev:amd64 4.8.5+shim4.10.2+xsa282-1+deb9u11

./include/x86/x86-linux.h defines struct e820entry as:

    #ifndef E820_RAM
    struct e820entry {
        uint64_t addr; /* start of memory segment */
        uint64_t size; /* size of memory segment */
        uint32_t type; /* type of memory segment */
    #define E820_RAM 1
    ...
    } __attribute__((packed));
    #endif

Since cedeee0a3007 ("x86: Introduce helpers for getting RSDP address") ./kexec/arch/i386/kexec-x86-common.c includes

    +#include "x86-linux-setup.h"
     #include "../../kexec-xen.h"

When xenctrl.h is present the above results in:

    $ gcc ...
    In file included from kexec/arch/i386/../../kexec-xen.h:5:0,
                     from kexec/arch/i386/kexec-x86-common.c:43:
    /usr/include/xenctrl.h:1271:8: error: redefinition of 'struct e820entry'
     struct e820entry {
            ^~~~~~~~~
    In file included from kexec/arch/i386/x86-linux-setup.h:3:0,
                     from kexec/arch/i386/kexec-x86-common.c:42:
    ./include/x86/x86-linux.h:16:8: note: originally defined here
     struct e820entry {
            ^~~~~~~~~
    ...
    $ gcc --version | head -1
    gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516

To mitigate this problem, re-order the includes so that x86-linux.h is included after xenctrl.h; struct e820entry is then defined only once, because x86-linux.h defines it conditionally. In practice the definitions are the same, so it should not matter which is chosen. It also seems rather unpleasant to me to need to play with include ordering. Perhaps a better solution in the longer term would be to rename the local definition of struct e820entry.

Fixes: cedeee0a3007 ("x86: Introduce helpers for getting RSDP address") Signed-off-by: Simon Horman <horms@verge.net.au>
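The effect of the include order can be reproduced in a single translation unit. This is an illustrative sketch only: both "headers" are inlined as stand-ins copied from the commit message, and `e820_end()` is a hypothetical helper added just to exercise the struct. It relies on the GCC/Clang `__attribute__((packed))` extension, as the original definitions do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for xenctrl.h, which comes first after the re-ordering. */
#define E820_RAM 1
struct e820entry {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
} __attribute__((packed));

/* Stand-in for x86-linux.h: the whole definition is skipped here because
 * E820_RAM is already defined, so the struct is only defined once. */
#ifndef E820_RAM
struct e820entry {
    uint64_t addr; /* start of memory segment */
    uint64_t size; /* size of memory segment */
    uint32_t type; /* type of memory segment */
#define E820_RAM 1
} __attribute__((packed));
#endif

/* Hypothetical helper: first address past the end of the segment. */
static uint64_t e820_end(const struct e820entry *e)
{
    assert(e != NULL);
    return e->addr + e->size;
}
```

Swapping the two stand-ins reproduces the redefinition error, since the first definition would then be unguarded by anything the second one checks.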
2019-07-03x86: Support multiboot2 imagesVarad Gautam
Add a new type `multiboot2-x86` that allows loading multiboot2 [1] images within the relocation range specified in the image header. The image is always placed at the lowest available address, regardless of the preference information. [1] https://www.gnu.org/software/grub/manual/multiboot2/multiboot.html Signed-off-by: Varad Gautam <vrd@amazon.de> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-05-31crashdump/x86: Use new introduce helper for getting RSDPKairui Song
Use the newly introduced helper for getting RSDP; this ensures RSDP is always accessible and avoids code duplication. Signed-off-by: Kairui Song <kasong@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-05-31x86: Always try to fill acpi_rsdp_addr in boot paramsKairui Song
Since kernel commit e6e094e053af75 ("x86/acpi, x86/boot: Take RSDP address from boot params if available"), the kernel accepts an acpi_rsdp_addr parameter in boot_params. Fill in this parameter unconditionally, so that the second kernel always gets the right RSDP address and boots correctly on EFI systems even with EFI services disabled. Users no longer need to change the kernel cmdline to work around the missing RSDP issue. For older kernels (before 5.0), there is no change in behavior. Signed-off-by: Kairui Song <kasong@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-05-31x86: Introduce helpers for getting RSDP addressKairui Song
On x86, the RSDP is fundamental for booting the machine. When the second kernel is incapable of parsing the RSDP address (e.g. kexec'ing the next kernel on an EFI system with EFI services disabled), kexec should prepare the RSDP address for the second kernel. Introduce helpers for getting the RSDP from multiple sources, including boot params and EFI firmware. For the legacy BIOS interface, there is no better way to find the RSDP address than scanning the memory region and searching for it, and the kernel always does this as a fallback, so there is no need to try to get the RSDP address in that case. Signed-off-by: Kairui Song <kasong@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-05-15x86: Find mounts by FS type, not nameNiklas Hambüchen
The name in mount invocations like mount -t debugfs debugfs /sys/kernel/debug is nothing but convention and cannot be relied upon. For example, https://www.kernel.org/doc/Documentation/filesystems/debugfs.txt recommends making the name "none" instead: mount -t debugfs none /sys/kernel/debug and many existing systems use mounts named "none" or otherwise. Using `mnt_type` instead of `mnt_fsname` allows kexec to work on such systems. This fixes another instance of `poweroff` not working on kexec'ed kernels because the lack of correctly matched mount results in EFI variables not being read and propagated. Signed-off-by: Niklas Hambüchen <mail@nh2.me> Signed-off-by: Simon Horman <horms@verge.net.au>
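The difference between the two fields is easy to see with glibc's `<mntent.h>` interface. The sketch below is not the function from the patch; `find_mnt_dir_by_type()` is a hypothetical name, and the mount table path is passed in by the caller so the lookup can be pointed at any file in mtab format.

```c
#include <assert.h>
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy the mount point of the first entry whose
 * filesystem *type* equals fstype into out. The device/name column
 * (mnt_fsname) is deliberately ignored, since it may be "none".
 * Returns 0 on success, -1 if no entry matches or the table is unreadable. */
static int find_mnt_dir_by_type(const char *table, const char *fstype,
                                char *out, size_t outlen)
{
    FILE *fp;
    struct mntent *mnt;
    int rc = -1;

    assert(table != NULL && fstype != NULL && out != NULL);
    fp = setmntent(table, "r");
    if (fp == NULL)
        return -1;
    while ((mnt = getmntent(fp)) != NULL) {
        if (strcmp(mnt->mnt_type, fstype) == 0 &&
            strlen(mnt->mnt_dir) < outlen) {
            strcpy(out, mnt->mnt_dir);
            rc = 0;
            break;
        }
    }
    endmntent(fp);
    return rc;
}
```

With a table line such as `none /sys/kernel/debug debugfs rw 0 0`, a match on `mnt_fsname` would miss the entry, while the match on `mnt_type` finds it.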
2019-05-15x86: Check /proc/mounts before mtab for mountsNiklas Hambüchen
In many situations, especially on read-only file systems and initial ramdisks (initramfs/initrd), /etc/mtab does not exist. Before this commit, kexec would fail to read mounts on such systems in `find_mnt_by_fsname()`, such that `get_bootparam()` would not read `boot_params/data`, which would then lead to e.g. `setup_efi_data()` not being called in `setup_efi_info()`. As a result, kexec'ed kernels would not obtain EFI data and would subsequently lack an `ACPI RSDP` entry, emitting: ACPI BIOS Error (bug): A valid RSDP was not found (20180810/tbxfroot-210) and thus fail to turn off the machine on poweroff, instead printing only: reboot: System halted Previously, this problem had to be worked around by passing `acpi_rsdp=` manually. This commit obviates this workaround. See also: * https://github.com/coreos/bugs/issues/167#issuecomment-487320879 * http://lists.infradead.org/pipermail/kexec/2012-October/006924.html Signed-off-by: Niklas Hambüchen <mail@nh2.me> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-03-06x86: Introduce a new option --reuse-video-typeKairui Song
After commit 060eee58 ("x86: use old screen_info if needed"), kexec-tools forces the use of the old screen_info and vga type if it fails to determine the current vga type. But this is not always a good idea. Kernel hangs have been observed on some Hyper-V VMs after this commit, because hyperv_fb mimics EFI (or VESA) VGA on first boot, but after the real driver is loaded it switches to a new mode that is no longer compatible with EFI/VESA VGA. Continuing to set orig_video_isVGA to the EFI/VESA VGA flag gets the wrong driver loaded, which then tries to manipulate the framebuffer in the wrong way. We can't ensure this won't happen with other framebuffer drivers, but it is a helpful feature when the framebuffer drivers just work. So this patch introduces a --reuse-video-type option to let the user decide whether the old screen_info should be used unconditionally. Signed-off-by: Kairui Song <kasong@redhat.com> Reviewed-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-02-05arm64: wipe old initrd addresses when patching the DTBJean-Philippe Brucker
When copying the DTB from the current kernel, if the user didn't pass an initrd on the command-line, make sure that the new DTB doesn't contain initrd properties with stale addresses. Otherwise the next kernel will try to unpack the initramfs from a location that contains junk, since the initial initrd is long gone: [ 49.370026] Initramfs unpacking failed: junk in compressed archive This issue used to be hidden by a successful recovery, but since commit ff1522bb7d98 ("initramfs: cleanup incomplete rootfs") in Linux, the kernel removes the default /root mountpoint after failing to load an initramfs, and cannot mount the rootfs passed on the command-line anymore. Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-01-28x86: Handle 64bit framebuffer memory address properlyKairui Song
On an EFI system, the frame buffer address is 64-bit, so currently, if the address is beyond 4G, kexec sets a wrong address due to truncation. Linux kernel commit ae2ee627dc87 ('efifb: Add support for 64-bit frame buffer addresses') added support for 64-bit frame buffer addresses: an 'ext_lfb_base' field was added to hold the upper 32 bits of the frame buffer address, and a new capability flag 'VIDEO_TYPE_CAPABILITY_64BIT_BASE' was introduced to indicate whether the extended field is used. This patch adopts this change and sets the proper extended address and capability flag when the address is beyond 4G. Signed-off-by: Kairui Song <kasong@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
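The address split can be sketched as follows. This is an illustration under stated assumptions rather than the patch itself: `struct fb_info_sketch` is a hypothetical reduction of screen_info to the three fields involved, and the flag name and its bit value `(1u << 1)` are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-in for the 64-bit-base capability flag; bit value assumed. */
#define SKETCH_CAP_64BIT_BASE (1u << 1)

/* Hypothetical subset of screen_info, only the fields discussed above. */
struct fb_info_sketch {
    uint32_t lfb_base;      /* lower 32 bits of the frame buffer address */
    uint32_t ext_lfb_base;  /* upper 32 bits of the frame buffer address */
    uint32_t capabilities;
};

/* Store a 64-bit address without truncation; flag it when the upper half is used. */
static void set_fb_base(struct fb_info_sketch *si, uint64_t addr)
{
    assert(si != NULL);
    si->lfb_base = (uint32_t)(addr & 0xffffffffu);
    si->ext_lfb_base = (uint32_t)(addr >> 32);
    if (si->ext_lfb_base != 0)
        si->capabilities |= SKETCH_CAP_64BIT_BASE;
}

/* Reassemble the full address the way a consumer of the fields would. */
static uint64_t get_fb_base(const struct fb_info_sketch *si)
{
    uint64_t base;

    assert(si != NULL);
    base = si->lfb_base;
    if (si->capabilities & SKETCH_CAP_64BIT_BASE)
        base |= (uint64_t)si->ext_lfb_base << 32;
    return base;
}
```

For an address such as 0x123456000, storing only `lfb_base` would leave 0x23456000, which is the truncation the commit describes.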
2019-01-28multiboot-x86: pass framebuffer information when requestedFriedemann Gerold
When the kernel requests video information, pass it the framebuffer information in the multiboot header, obtained from the Linux framebuffer ioctls. With the arch-specific --reset-vga or --console-vga options, purgatory will reset the framebuffer, so pass information for standard EGA text mode. Signed-off-by: Friedemann Gerold <cinap_lenrek@felloff.net> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-01-28multiboot-x86: pass ACPI reserved memory information in memory mapFriedemann Gerold
Use the appropriate types for ACPI reclaim and ACPI NVS ranges in the multiboot memory map. This allows the kernel to locate ACPI tables on UEFI systems without having an explicit pointer to the RSD. Signed-off-by: Friedemann Gerold <cinap_lenrek@felloff.net> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-01-28multiboot-x86: support for non-elf kernelsFriedemann Gerold
Add support for non-ELF multiboot kernels (such as Plan 9) by handling the MULTIBOOT_AOUT_KLUDGE bit. When the bit is clear, we are dealing with an ELF file and probe for ELF as before with elf_x86_probe(). When the bit is set, load_addr, load_end_addr, header_addr and entry_addr from the multiboot header are used to load the memory image. Signed-off-by: Friedemann Gerold <cinap_lenrek@felloff.net> Signed-off-by: Simon Horman <horms@verge.net.au>
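The decision on the MULTIBOOT_AOUT_KLUDGE bit can be sketched like this. The header layout and the bit value 0x00010000 follow the Multiboot specification; the struct, enum and function names are hypothetical and not taken from kexec-tools.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MULTIBOOT_AOUT_KLUDGE 0x00010000u /* bit 16 of the header flags */

/* Hypothetical mirror of the leading multiboot header fields. */
struct mb_header_sketch {
    uint32_t magic;
    uint32_t flags;
    uint32_t checksum;
    uint32_t header_addr;
    uint32_t load_addr;
    uint32_t load_end_addr;
    uint32_t bss_end_addr;
    uint32_t entry_addr;
};

enum mb_load_method { MB_LOAD_AS_ELF, MB_LOAD_BY_HEADER_ADDRS };

/* Bit clear: treat the file as ELF. Bit set: trust the header addresses. */
static enum mb_load_method mb_pick_load_method(const struct mb_header_sketch *h)
{
    assert(h != NULL);
    return (h->flags & MULTIBOOT_AOUT_KLUDGE) ? MB_LOAD_BY_HEADER_ADDRS
                                              : MB_LOAD_AS_ELF;
}

/* Size of the loaded region described by the header, valid only when the
 * bit is set and load_end_addr is non-zero (an assumption of this sketch). */
static uint32_t mb_load_size(const struct mb_header_sketch *h)
{
    assert(h != NULL);
    return h->load_end_addr - h->load_addr;
}
```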
2019-01-15arm64: add kexec_file_load supportAKASHI Takahiro
With this patch, kexec_file_load() system call is supported. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Reviewed-by: Bhupesh Sharma <bhsharma@redhat.com> Tested-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-01-09arm64: Add support to read PHYS_OFFSET from 'kcore' - pt_note or pt_load (if available)Bhupesh Sharma
On certain arm64 platforms, it has been noticed that due to a hole at the start of physical ram exposed to kernel (i.e. it doesn't start from address 0), the kernel still calculates the 'memstart_addr' kernel variable as 0. Whereas the SYSTEM_RAM or IOMEM_RESERVED range in '/proc/iomem' would carry a first entry whose start address is non-zero (as the physical ram exposed to the kernel starts from a non-zero address). In such cases, if we rely on '/proc/iomem' entries to calculate the phys_offset, then we will have a mismatch between the user-space and kernel-space 'PHYS_OFFSET' value. The present 'kexec-tools' code does the same in the 'get_memory_ranges_iomem_cb()' function when it makes a call to 'set_phys_offset()'. This can cause the vmcore generated via 'kexec-tools' to miss the last few bytes, as the first '/proc/iomem' entry starts from a non-zero address. Please see [0] for the original bug report from Yanjiang Jin. This can be fixed in the following manner: 1. For newer kernels (>= 4.19, with commit 23c85094fe1895caefdd ["proc/kcore: add vmcoreinfo note to /proc/kcore"] available), 'kcore' contains a new PT_NOTE which carries the VMCOREINFO information. If it is available, one should prefer it to retrieve the 'PHYS_OFFSET' value exported by the kernel, as this is now the standard interface exposed by the kernel for sharing machine-specific details with user-land as per the arm64 kernel maintainers (see [1]). 2. For older kernels, we can try to determine the PHYS_OFFSET value from the PT_LOAD segments inside 'kcore' by matching up the correct virtual and physical address combinations. As a fallback, we still support getting the PHYS_OFFSET value from '/proc/iomem', to maintain backward compatibility. Testing: Tested on my apm-mustang and qualcomm amberwing boards with upstream kernel (4.20.0-rc7) for both KASLR and non-KASLR boot cases. References: [0] https://www.spinics.net/lists/kexec/msg20618.html [1] https://www.mail-archive.com/kexec@lists.infradead.org/msg20300.html Reported-by: Yanjiang Jin <yanjiang.jin@hxt-semitech.com> Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2019-01-09kexec/kexec-arm64.c: Add error handling check against return value of 'set_bootargs()'Bhupesh Sharma
This patch adds the missing error handling check against the return value of 'set_bootargs()' in 'kexec-arm64.c'. Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-10-29arm64: If 'getrandom' syscall fails, don't error out - just warn and proceed.Bhupesh Sharma
To calculate the random 'kaslr-seed' value passed to the secondary kernel (kexec or kdump), we invoke the 'getrandom' syscall inside the 'setup_2nd_dtb()' function. Normally, on most arm64 systems, this syscall doesn't fail when the initrd scriptware (which arms the kdump service) invokes it. However, I recently noticed that on 'hp-moonshot' arm64 boards there is an issue with newer kernels that causes it to fail. As a result, the kdump service fails and we are not able to use the kdump infrastructure just after boot. As expected, once the random pool is sufficiently populated and we launch the kdump service arming scripts again (manually), the kdump service is properly enabled. Let's handle this by not erroring out if the 'getrandom' syscall fails. Instead, warn the user and proceed by setting the 'kaslr-seed' value to 0 for the secondary kernel, which implies that it boots in 'nokaslr' mode. Tested on my 'hp-moonshot' and 'qualcomm-amberwing' arm64 boards. Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
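The warn-and-proceed behaviour might look roughly like the sketch below. It is not the code in 'setup_2nd_dtb()': the function name is hypothetical, the glibc `getrandom()` wrapper from `<sys/random.h>` is used for brevity, and passing `GRND_NONBLOCK` (so the call fails instead of blocking while the pool is not yet initialised) is an assumption of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/random.h>
#include <sys/types.h>

/* Hypothetical helper: fill *seed with random bytes if possible.
 * Returns 1 when a random seed was obtained, 0 when it fell back to a
 * zero seed (which the secondary kernel treats like 'nokaslr'). */
static int pick_kaslr_seed(uint64_t *seed)
{
    ssize_t n;

    assert(seed != NULL);
    /* Assumption: non-blocking request, so an unready pool is an error. */
    n = getrandom(seed, sizeof(*seed), GRND_NONBLOCK);
    if (n != (ssize_t)sizeof(*seed)) {
        fprintf(stderr, "warning: getrandom failed, continuing with kaslr-seed = 0\n");
        *seed = 0;
        return 0;
    }
    return 1;
}
```

The caller never sees an error; it only learns whether the seed it is about to write into the DTB is random or zero.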
2018-10-29x86: fix BAD_FREE in get_efi_runtime_map()Pingfan Liu
If the err_out label is reached, address of a stack variable is passed to free(). Fix it. Signed-off-by: Pingfan Liu <piliu@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-10-02kdump: fix an error that can not parse the e820 reserved regionLianbo Jiang
When kexec-tools loads the kernel and initramfs for kdump, it reads /proc/iomem and recreates the e820 ranges for the kdump kernel. But it fails to parse the e820 reserved region, because memcmp() is case-sensitive when comparing the string. In fact, the entry may be "Reserved" or "reserved" in /proc/iomem, so both cases have to be handled. Signed-off-by: Lianbo Jiang <lijiang@redhat.com> Reviewed-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
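One way to accept both spellings is a case-insensitive prefix comparison. This is a sketch of the idea using POSIX `strncasecmp()`; the helper name is hypothetical and the actual patch may handle the two spellings differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical helper: return 1 if the /proc/iomem resource name starts
 * with "reserved" in any letter case, 0 otherwise. */
static int is_reserved_entry(const char *name)
{
    assert(name != NULL);
    return strncasecmp(name, "reserved", strlen("reserved")) == 0;
}
```

A plain `memcmp(name, "reserved", 8)` returns non-zero for "Reserved", which is the mismatch the commit describes.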
2018-08-24kexec: fix for "Unhandled rela relocation: R_X86_64_PLT32" errorChris Clayton
In response to a change in binutils, commit b21ebf2fb4c (x86: Treat R_X86_64_PLT32 as R_X86_64_PC32) was applied to the linux kernel during the 4.16 development cycle and has since been backported to earlier stable kernel series. The change results in the failure message in $SUBJECT when rebooting via kexec. Fix this by replicating the change in kexec. Signed-off-by: Chris Clayton <chris2553@googlemail.com> Acked-by: Baoquan He <bhe@redhat.com> Tested-by: Bhupesh Sharma <bhsharma@redhat.com> Acked-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
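Treating the two relocation types alike amounts to sharing one case in the relocation switch. The sketch below is illustrative only: the constants carry the x86-64 psABI values (2 and 4) under local, hypothetical names, the function is not the one in kexec-tools, and the overflow check a real loader performs is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_R_X86_64_PC32  2 /* psABI value of R_X86_64_PC32 */
#define SK_R_X86_64_PLT32 4 /* psABI value of R_X86_64_PLT32 */

/* Hypothetical helper: compute the 32-bit PC-relative value S + A - P.
 * Returns 0 on success, -1 for a relocation type this sketch does not handle. */
static int apply_rel32(unsigned int r_type, uint64_t sym, int64_t addend,
                       uint64_t place, uint32_t *out)
{
    assert(out != NULL);
    switch (r_type) {
    case SK_R_X86_64_PC32:
    case SK_R_X86_64_PLT32: /* handled exactly like PC32 */
        *out = (uint32_t)(sym + (uint64_t)addend - place);
        return 0;
    default:
        return -1; /* corresponds to the "Unhandled rela relocation" path */
    }
}
```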
2018-06-29arm64: error out if kernel command line is too longMunehisa Kamata
Currently, in arm64, kexec silently truncates kernel command line longer than COMMAND_LINE_SIZE - 1. Error out in that case as some other architectures already do that. The error message is copied from x86_64. Suggested-by: Tom Kirchner <tjk@amazon.com> Signed-off-by: Munehisa Kamata <kamatam@amazon.com> Signed-off-by: Simon Horman <horms@verge.net.au>
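The check itself is small. In this sketch the function name is hypothetical and the limit is passed in as a parameter standing in for COMMAND_LINE_SIZE; the error text is illustrative and not the message copied from x86_64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reject a command line that would not fit into a
 * buffer of `limit` bytes including the terminating NUL, instead of
 * silently truncating it. Returns 0 if it fits, -1 otherwise. */
static int check_cmdline_len(const char *cmdline, size_t limit)
{
    assert(cmdline != NULL && limit > 0);
    if (strlen(cmdline) > limit - 1) {
        fprintf(stderr, "command line too long (limit is %lu characters)\n",
                (unsigned long)(limit - 1));
        return -1;
    }
    return 0;
}
```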
2018-06-29arm64: increase command line size to 2048Munehisa Kamata
Otherwise, we can hit the current 512-character limit before hitting the Linux kernel's own limit, which allows 2048 characters on arm64. Signed-off-by: Munehisa Kamata <kamatam@amazon.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-06-19arm64: Add support to supply 'kaslr-seed' to secondary kernelBhupesh Sharma
This patch adds the support to supply 'kaslr-seed' to secondary kernel, when we do a 'kexec warm reboot to another kernel' (although the behaviour remains the same for the 'kdump' case as well) on arm64 platforms using the 'kexec_load' invocation method. Lets consider the case where the primary kernel working on the arm64 platform supports kaslr (i.e 'CONFIG_RANDOMIZE_BASE' was set to y and we have a compliant EFI firmware which supports EFI_RNG_PROTOCOL and hence can pass a non-zero (valid) seed to the primary kernel). Now the primary kernel reads the 'kaslr-seed' and wipes it to 0 and uses the seed value to randomize for e.g. the module base address offset. In the case of 'kexec_load' (or even kdump for brevity), we rely on the user-space kexec-tools to pass an appropriate dtb to the secondary kernel and since 'kaslr-seed' is wiped to 0 by the primary kernel, the secondary will essentially work with *nokaslr* as 'kaslr-seed' is set to 0 when it is passed to the secondary kernel. This can be true even in case the secondary kernel had 'CONFIG_RANDOMIZE_BASE' and 'CONFIG_RANDOMIZE_MODULE_REGION_FULL' set to y. This patch addresses this issue by first checking if the device tree provided by the firmware to the kernel supports the 'kaslr-seed' property and verifies that it is really wiped to 0. If this condition is met, it fixes up the 'kaslr-seed' property by using the getrandom() syscall to get a suitable random number. I verified this patch on my Qualcomm arm64 board and here are some test results: 1. Ensure that the primary kernel is boot'ed with 'kaslr-seed' dts property and it is really wiped to 0: [root@qualcomm-amberwing]# dtc -I dtb -O dts /sys/firmware/fdt | grep -A 10 -i chosen chosen { kaslr-seed = <0x0 0x0>; ... } 2. Now issue 'kexec_load' to load the secondary kernel (let's assume that we are using the same kernel as the secondary kernel): # kexec -l /boot/vmlinuz-`uname -r` --initrd=/boot/initramfs-`uname -r`.img --reuse-cmdline -d 3. 
Issue 'kexec -e' to warm boot to the secondary: # kexec -e 4. Now after the secondary boots, confirm that the load address of the modules is randomized in every successive boot: [root@qualcomm-amberwing]# cat /proc/modules sunrpc 524288 1 - Live 0xffff0307db190000 vfat 262144 1 - Live 0xffff0307db110000 fat 262144 1 vfat, Live 0xffff0307db090000 crc32_ce 262144 0 - Live 0xffff0307d8c70000 ... Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-05-22kexec/s390: Add support for kexec_file_loadPhilipp Rudo
Since kernel 4.17-rc2 s390 supports the kexec_file_load system call. Add the new system call to kexec-tools and provide the -s (--kexec-file-syscall) option for s390 to support this new feature. Signed-off-by: Philipp Rudo <prudo@linux.ibm.com> Acked-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-04-19kexec-elf-rel-ppc64: Fix cast from pointer warningGeoff Levand
Fixes warnings like these when building kexec for powerpc (32 bit): kexec-elf-rel-ppc64.c: warning: cast from pointer to integer of different size Signed-off-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-04-19crashdump-ppc64: Fix integer truncation warningGeoff Levand
Fixes warnings like these when building kexec for powerpc (32 bit): crashdump-ppc64.h: warning: large integer implicitly truncated to unsigned type Signed-off-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-04-19Merge branch 'master' of git://git.armlinux.org.uk/~rmk/kexec-toolsSimon Horman
2018-03-30kexec/ppc64: leverage kexec_file_load supportHari Bathini
PPC64 kernel now supports kexec_file_load system call. Leverage it by enabling that support here. Note that loading crash kernel with this system call is not yet supported in the kernel and trying to load one will fail with '-ENOTSUPP' error. Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Reviewed-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-03-20ARM: Include stack and malloc space in zImage sizeRussell King
Include the stack and malloc space in our calculation of the zImage size, both of which must be avoided when locating the dtb. Signed-off-by: Russell King <rmk@armlinux.org.uk>
2018-03-20ARM: add further debugRussell King
Add further debugging of the kernel size Signed-off-by: Russell King <rmk@armlinux.org.uk>
2018-03-20ARM: read kernel size from zImageRussell King
Signed-off-by: Russell King <rmk@armlinux.org.uk>
2018-02-22kexec/ppc64: add support to parse ibm, dynamic-memory-v2 propertyHari Bathini
Add support to parse the new 'ibm,dynamic-memory-v2' property in the 'ibm,dynamic-reconfiguration-memory' node. This replaces the old 'ibm,dynamic-memory' property and is enabled in the kernel with a patch series that starts with commit 0c38ed6f6f0b ("powerpc/pseries: Enable support of ibm,dynamic-memory-v2"). All LMBs that share the same flags and are adjacent are grouped together in the newer version of the property making it compact to represent larger memory configurations. Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Mahesh Jagannath Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-01-29x86: use old screen_info if neededDave Young
With modern drm/kms graphics drivers, kexec-tools does not set up screen_info correctly, so one only sees screen output after those drm drivers have reinitialized following the reboot. Copying the old screen info from the original boot_params helped in my tests; although it may not work in some cases, it is no worse than before. The same approach is used by the kernel's kexec_file_load. Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-01-24kexec-tools: Perform run-time linking of libxenctrl.soEric DeVolder
When kexec is utilized in a Xen environment, it has an explicit run-time dependency on libxenctrl.so. This dependency occurs during the configure stage and when building kexec-tools. When kexec is utilized in a non-Xen environment (either bare metal or KVM), the configure and build of kexec-tools omits any reference to libxenctrl.so. Thus today it is not currently possible to configure and build a *single* kexec that will work in *both* Xen and non-Xen environments, unless the libxenctrl.so is *always* present. For example, a kexec configured for Xen in a Xen environment: # ldd build/sbin/kexec linux-vdso.so.1 => (0x00007ffdeba5c000) libxenctrl.so.4.4 => /usr/lib64/libxenctrl.so.4.4 (0x00000038d8000000) libz.so.1 => /lib64/libz.so.1 (0x00000038d6c00000) libc.so.6 => /lib64/libc.so.6 (0x00000038d6000000) libdl.so.2 => /lib64/libdl.so.2 (0x00000038d6400000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00000038d6800000) /lib64/ld-linux-x86-64.so.2 (0x000055e9f8c6c000) # build/sbin/kexec -v kexec-tools 2.0.16 However, the *same* kexec executable fails in a non-Xen environment: # copy xen kexec to . # ldd ./kexec linux-vdso.so.1 => (0x00007fffa9da7000) libxenctrl.so.4.4 => not found liblzma.so.0 => /usr/lib64/liblzma.so.0 (0x0000003014e00000) libz.so.1 => /lib64/libz.so.1 (0x000000300ea00000) libc.so.6 => /lib64/libc.so.6 (0x000000300de00000) libpthread.so.0 => /lib64/libpthread.so.0 (0x000000300e200000) /lib64/ld-linux-x86-64.so.2 (0x0000558cc786c000) # ./kexec -v ./kexec: error while loading shared libraries: libxenctrl.so.4.4: cannot open shared object file: No such file or directory At Oracle we "workaround" this by having two kexec-tools packages, one for Xen and another for non-Xen environments. At Oracle, the desire is to offer a single kexec-tools package that works in either environment. 
To achieve this, kexec-tools would either have to ship with libxenctrl.so (which we have deemed as unacceptable), or we can make kexec perform run-time linking against libxenctrl.so. This patch is one possible way to alleviate the explicit run-time dependency on libxenctrl.so. This implementation utilizes a set of macros to wrap calls into libxenctrl.so so that the library can instead be dlopen() and obtain the function via dlsym() and then make the call. The advantage of this implementation is that it requires few changes to the existing kexec-tools code. The dis- advantage is that it uses macros to remap libxenctrl functions and do work under the hood. Another possible implementation worth considering is the approach taken by libvmi. Reference the following file: https://github.com/libvmi/libvmi/blob/master/libvmi/driver/xen/libxc_wrapper.h The libxc_wrapper_t structure definition that starts at line ~33 has members that are function pointers into libxenctrl.so. This structure is populated once and then later referenced/dereferenced by the callers of libxenctrl.so members. The advantage of this implementation is it is more explicit in managing the use of libxenctrl.so and its versions, but the disadvantage is it would require touching more of the kexec-tools code. The following is a list libxenctrl members utilized by kexec: Functions: xc_interface_open xc_kexec_get_range xc_interface_close xc_kexec_get_range xc_interface_open xc_get_max_cpus xc_kexec_get_range xc_version xc_kexec_exec xc_kexec_status xc_kexec_unload xc_hypercall_buffer_array_create xc__hypercall_buffer_array_alloc xc_hypercall_buffer_array_destroy xc_kexec_load xc_get_machine_memory_map Data: xc__hypercall_buffer_HYPERCALL_BUFFER_NULL These were identified by configuring and building kexec-tools with Xen support, but omitting the -lxenctrl from the LDFLAGS in the Makefile for an x86_64 build. The above libxenctrl members were referenced via these source files. 
kexec/crashdump-xen.c kexec/kexec-xen.c kexec/arch/i386/kexec-x86-common.c kexec/arch/i386/crashdump-x86.c This patch provides a wrapper around the calls to the above functions in libxenctrl.so. Every libxenctrl call must pass a xc_interface which it obtains from xc_interface_open(). So the existing code is already structured in a manner that facilitates graceful dlopen()'ing of the libxenctrl.so and the subsequent dlsym() of the required member. The patch creates a wrapper function around xc_interface_open() and xc_interface_close() to perform the dlopen() and dlclose(). For the remaining xc_ functions, this patch defines a macro of the same name which performs the dlsym() and then invokes the function. See the __xc_call() macro for details. There was one data item in libxenctrl.so that presented a unique problem, HYPERCALL_BUFFER_NULL. It was only utilized once, as set_xen_guest_handle(xen_segs[s].buf.h, HYPERCALL_BUFFER_NULL); I tried a variety of techniques but could not find a general macro-type solution without modifying xenctrl.h. So the solution was to declare a local HYPERCALL_BUFFER_NULL, and this appears to work. I admit I am not familiar with libxenctrl to state if this is a satisfactory workaround, so feedback here welcome. I can state that this allows kexec to load/unload/kexec on Xen and non-Xen environments that I've tested without issue. With this patch applied, kexec-tools can be built with Xen support and yet there is no explicit run-time dependency on libxenctrl.so. Thus it can also be deployed in non-Xen environments where libxenctrl.so is not installed. 
# ldd build/sbin/kexec linux-vdso.so.1 => (0x00007fff7dbcd000) liblzma.so.0 => /usr/lib64/liblzma.so.0 (0x00000038d9000000) libz.so.1 => /lib64/libz.so.1 (0x00000038d6c00000) libdl.so.2 => /lib64/libdl.so.2 (0x00000038d6400000) libc.so.6 => /lib64/libc.so.6 (0x00000038d6000000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00000038d6800000) /lib64/ld-linux-x86-64.so.2 (0x0000562dc0c14000) # build/sbin/kexec -v kexec-tools 2.0.16 This feature/ability is enabled with the following: ./configure --with-xen=dl The previous --with-xen=no and --with-xen=yes still work as before. Not specifying a --with-xen still defaults to --with-xen=yes. As I've introduced a new build and run-time mode, I've done an extensive matrix of both build-time and run-time checks of kexec with this patch applied. The set of build-time scenarios are: 1: configure --with-xen=no and Xen support NOT present 2: configure --with-xen=no and Xen support IS present 3: configure --with-xen=yes and Xen support NOT present 4: configure --with-xen=yes and Xen support IS present 5: configure --with-xen=dl and Xen support NOT present 6: configure --with-xen=dl and Xen support IS present Xen support present requires that configure can find both xenctrl.h and libxenctrl.so. Then for each of the six scenarios above, the corresponding kexec binary was tested on a Xen system (Oracle's OVS dom0) and a non-Xen system (Oracle Linux). There are two build-time checks: did kexec build, and did it contain libxenctrl.so? The presence of libxenctrl.so in kexec was checked via ldd. The results were: Scenario | Build | libxenctrl.so | Result 1 | pass | no | pass - see Note 1 2 | pass | no | pass - see Note 1 3 | pass | no | pass - see Note 2 4 | pass | yes | pass - see Note 3 5 | pass | no | pass - see Note 2 6 | pass | no | pass - see Note 4 Note 1: This passes since due to --with-xen=no, there will be no Xen support in kexec and therefore no libxenctrl.so a in the kexec. 
Note 2: This passes because, although --with-xen=yes was given, configure displays a message indicating that Xen support is disabled and allows kexec to build (this is the same behavior as prior to this patch). And since Xen support is disabled, there is no libxenctrl.so in the kexec binary.

Note 3: This passes because, with --with-xen=yes and configure locating xenctrl.h and libxenctrl.so, support for Xen was built into kexec. ldd shows an explicit dependency on the library.

Note 4: This passes because, with --with-xen=dl and configure locating xenctrl.h and libxenctrl.so, support for Xen was built into kexec. However, this uses the new technique introduced by this patch and, as a result, ldd shows that libxenctrl.so is not an explicit run-time dependency for kexec (rather, libdl.so is now an explicit dependency). This is precisely the goal of this patch!

The net effect is that there are now three "flavors" of a kexec binary (prior to this patch there were two):

a) kexec with no support for Xen [scenarios 1, 2, 3, 5],
b) kexec with support for Xen and libxenctrl.so as an explicit dependency [scenario 4], and
c) kexec with support for Xen where libxenctrl.so is NOT an explicit dependency [scenario 6].

The run-time checks take each of the six scenarios above and run the corresponding kexec binary on both a Xen system and a non-Xen system.
The test for each kexec scenario was:

% service kdump stop
% vi /etc/init.d/kdump
  change KEXEC= to /sbin/kexec-[123456]
% service kdump start
# If not FAILED, then below
% service kdump status
Kdump is operational
% rm -fr /var/crash/*
% echo c > /proc/sysrq-trigger
# after reboot verify vmcore generated
% ls -al /var/crash/<tab>

The results were:

Scenario | Xen environment   | non-Xen environment
1        | fail - see Note 5 | pass
2        | fail - see Note 5 | pass
3        | fail - see Note 6 | pass
4        | pass              | fail - see Note 7
5        | fail - see Note 6 | pass
6        | pass              | pass

Note 5: Due to --with-xen=no, kexec lacks support for Xen and will fail in the Xen environment. This behavior is the same as prior to this patch.

Note 6: Due to the missing xenctrl.h and libxenctrl.so, kexec was built without support for Xen, and thus will fail in the Xen environment. This behavior is the same as prior to this patch.

Note 7: This kexec has the explicit dependency on libxenctrl.so, which prevents it from running in a non-Xen environment. This is expected, as it is the original issue that this patch is intended to address.

Note that for scenarios 1, 2, 3 and 5 kexec lacks support for Xen, thus these versions are expected to "fail" in a Xen environment. On the flip side, since a non-Xen environment does not need libxenctrl.so, all but scenario 4 are expected to "pass" in a non-Xen environment.

The results match these expectations! Importantly, applying this patch had no adverse impact on kexec build or run-time behavior.

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
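The dlopen()/dlsym() wrapping described in the libxenctrl commit above can be sketched roughly as follows. This is a minimal illustration, not the patch's code: lib_open(), lib_close(), lib_symbol(), LIB_CALL() and hypothetical_get_version are all made-up names, and the actual __xc_call() macro may be shaped quite differently.

```c
#include <dlfcn.h>
#include <stddef.h>

/* All names here are hypothetical; they only illustrate the technique. */
static void *lib_handle;

/* Open the shared library on first use. Returns 0 on success, or -1 when
 * the library is not installed, so the caller can degrade gracefully. */
int lib_open(const char *soname)
{
	if (!lib_handle)
		lib_handle = dlopen(soname, RTLD_NOW | RTLD_GLOBAL);
	return lib_handle ? 0 : -1;
}

void lib_close(void)
{
	if (lib_handle) {
		dlclose(lib_handle);
		lib_handle = NULL;
	}
}

/* Resolve a symbol by name; NULL if the library is not open or lacks it. */
void *lib_symbol(const char *name)
{
	return lib_handle ? dlsym(lib_handle, name) : NULL;
}

/* Look the function up at call time and invoke it, or yield `fallback`
 * when it cannot be resolved. */
#define LIB_CALL(fn_type, name, fallback, ...) \
	(lib_symbol(name) ? ((fn_type)lib_symbol(name))(__VA_ARGS__) : (fallback))

/* Example wrapper for an imaginary library function int f(int). */
typedef int (*version_fn)(int);

int lib_get_version(int arg)
{
	return LIB_CALL(version_fn, "hypothetical_get_version", -1, arg);
}
```

Because the library is only opened at run time, a binary built this way would list libdl rather than the wrapped library in its ldd output, which matches the behavior reported for scenario 6.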
2017-11-01ARM: read kernel size from zImageRussell King
Read the new extension data which tells the boot agent about the requirements for booting the kernel image, such as how much RAM will be consumed by the kernel through decompression and booting. This is necessary to control the placement of the DTB and compressed RAM disk to avoid these objects being corrupted. Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Russell King <rmk@armlinux.org.uk> Signed-off-by: Simon Horman <horms@verge.net.au>
2017-11-01ARM: cleanup initrd and dtb handingRussell King
There is no difference in the way the initrd is handled between an ATAG-based kernel and a DTB-based kernel. Therefore, this should be handled identically in both cases. Rearrange the code to achieve this. Signed-off-by: Russell King <rmk@armlinux.org.uk> Signed-off-by: Simon Horman <horms@verge.net.au>
2017-10-18kexec-tools: mips: Use proper page_offset for OCTEON CPUs.David Daney
The OCTEON family of MIPS64 CPUs uses a PAGE_OFFSET of 0x8000000000000000ULL, which differs from other CPUs. Scan /proc/cpuinfo to see if the current system is "Octeon"; if so, patch the page_offset so that usable kdump core files are produced. Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Simon Horman <horms@verge.net.au>
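A rough sketch of that detection, under stated assumptions: the function names are invented, the match is a plain substring search for "Octeon" (the commit does not say how the scan matches), and the non-OCTEON default is passed in by the caller rather than guessed here.

```c
#include <stdio.h>
#include <string.h>

/* PAGE_OFFSET value quoted in the commit message for OCTEON CPUs. */
#define OCTEON_PAGE_OFFSET 0x8000000000000000ULL

/* Return 1 if any line of the stream contains "Octeon", else 0.
 * Substring matching is an assumption made for this sketch. */
int cpuinfo_mentions_octeon(FILE *f)
{
	char line[512];

	while (fgets(line, sizeof(line), f))
		if (strstr(line, "Octeon"))
			return 1;
	return 0;
}

/* Pick the page offset: the OCTEON value when the cpuinfo file says so,
 * otherwise (or when the file cannot be read) the caller's default. */
unsigned long long pick_page_offset(const char *cpuinfo_path,
				    unsigned long long default_offset)
{
	FILE *f = fopen(cpuinfo_path, "r");
	int octeon;

	if (!f)
		return default_offset;
	octeon = cpuinfo_mentions_octeon(f);
	fclose(f);
	return octeon ? OCTEON_PAGE_OFFSET : default_offset;
}
```

A caller would pass "/proc/cpuinfo" as the path; taking the path as a parameter just keeps the sketch easy to exercise against a sample file.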
2017-10-18kexec-tools: mips: Merge adjacent memory ranges.David Daney
Some kernel versions running on MIPS split the System RAM memory regions reported in /proc/iomem. This may cause loading of the kexec kernel to fail if it crosses one of the splits. Fix by merging adjacent memory ranges that have the same type. Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Simon Horman <horms@verge.net.au>
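The merge step can be sketched as below. This is only an illustration: struct mem_range and merge_adjacent() are names made up here, and the sketch assumes the ranges are already sorted by start address and use inclusive end addresses, as /proc/iomem prints them.

```c
#include <stddef.h>

/* Illustrative range type; `end` is inclusive (an assumption). */
struct mem_range {
	unsigned long long start;
	unsigned long long end;
	unsigned type;
};

/* Merge neighbouring ranges of the same type in place.
 * Expects `r` sorted by start; returns the new number of ranges. */
size_t merge_adjacent(struct mem_range *r, size_t n)
{
	size_t out = 0, i;

	if (n == 0)
		return 0;

	for (i = 1; i < n; i++) {
		if (r[i].type == r[out].type && r[i].start == r[out].end + 1)
			r[out].end = r[i].end;	/* extend the previous range */
		else
			r[++out] = r[i];	/* keep as a separate range */
	}
	return out + 1;
}
```

With the splits folded away, a segment that would have straddled two reported regions fits inside a single merged range.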
2017-10-16kexec-tools: mips: Try to include bss in kernel vmcore file.David Daney
The kernel message buffers, as well as a lot of other useful data, reside in the bss section. Without this, vmcore-dmesg cannot work, and debugging with a core dump is much more difficult. Try to add the /proc/iomem "Kernel bss" section to vmcore. If it is not found, just do what we used to do and use "Kernel data" instead. Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2017-10-16kexec-tools: mips: Don't set lowmem_limit to 2G for 64-bit systems.David Daney
The 64-bit MIPS architecture doesn't have the same 2G limit the 32-bit version has. Set MAXMEM and lowmem_limit to 0 for 64-bit MIPS so that memory above 2G is usable in the kdump core files. Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2017-08-30kexec-tools: ppc64: fix leak while checking for coherent device memoryHari Bathini
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2017-08-28kexec-tools: ppc64: avoid adding coherent memory regions to crash memory rangesHari Bathini
Accelerator devices like GPU and FPGA cards contain onboard memory. This onboard memory is represented as a memory-only NUMA node, integrating it with the core memory subsystem. Since the link through which these devices are integrated into core memory goes down after a system crash, and since they are meant for user workloads, avoid adding coherent device memory regions to crash memory ranges. Without this change, the makedumpfile tool tries to save inaccessible coherent device memory regions, crashing the system. Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Tested-by: Pingfan Liu <piliu@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2017-08-10kexec-tools: powerpc: fix command line overflow errorHari Bathini
Since kernel commit a5980d064fe2 ("powerpc: Bump COMMAND_LINE_SIZE to 2048"), powerpc bumped command line size to 2048 but the size used here is still the default value of 512. Bump it to 2048 to fix command line overflow errors observed when command line length is above 512 bytes. Also, get rid of the multiple definitions of COMMAND_LINE_SIZE macro in ppc architecture. Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Signed-off-by: Simon Horman <horms@verge.net.au>
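The kind of bounds check affected by this limit can be sketched as follows. cmdline_append() is a hypothetical helper written for illustration, not a function from kexec-tools; only the 2048 value comes from the commit message.

```c
#include <stddef.h>
#include <string.h>

/* Limit named in the commit message; a single definition, as the commit
 * suggests, avoids copies drifting apart. */
#define COMMAND_LINE_SIZE 2048

/* Append `arg` to the NUL-terminated command line in `buf`, separated by
 * a space. Returns 0 on success, or -1 (leaving `buf` unchanged) if the
 * result, including the terminating NUL, would not fit. */
int cmdline_append(char buf[COMMAND_LINE_SIZE], const char *arg)
{
	size_t used = strlen(buf);
	size_t need = strlen(arg);
	size_t sep = used ? 1 : 0;

	if (used + sep + need + 1 > COMMAND_LINE_SIZE)
		return -1;
	if (sep)
		buf[used++] = ' ';
	memcpy(buf + used, arg, need + 1);	/* copies the NUL too */
	return 0;
}
```

With a 512-byte limit, the same check would reject command lines that the kernel itself accepts, which is the overflow error the commit describes.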
2017-08-04kexec-tools: ppc64: fix how RMA top is deducedHari Bathini
A hang was observed in purgatory on a machine configured with a single LPAR. This was because one of the segments was loaded outside the actual Real Memory Area (RMA) due to a wrongly deduced RMA top value. Currently, the top of the real memory area, which is crucial for loading the kexec/kdump kernel, is obtained by iterating through mem nodes and setting its value based on the base and size values of the last mem node in the iteration. That can't always be correct, as the order of iteration may not be the same and RMA base & size are always based on the first memory property. Fix this by setting the RMA top value based on the base and size values of the memory node that has the smallest base value (first memory property) among all the memory nodes. Also, correct the misnomers rmo_base and rmo_top to rma_base and rma_top respectively. While how RMA top is deduced was broken for some time, the issue may not have been seen so far for a couple of possible reasons: 1. Only one mem node was available. 2. The first memory property has been the last node in the iteration when multiple mem nodes were present. Fixes: 02f4088ffded ("kexec fix ppc64 device-tree mem node") Reported-by: Ankit Kumar <ankit@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Geoff Levand <geoff@infradead.org> Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com> Signed-off-by: Simon Horman <horms@verge.net.au>
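The corrected selection rule can be sketched like this. struct mem_node and rma_top_of() are illustrative names, and the sketch assumes each node contributes a single base/size pair taken from its first memory property.

```c
#include <stddef.h>

/* Illustrative view of one memory node: base and size of its first
 * memory property (an assumption made for this sketch). */
struct mem_node {
	unsigned long long base;
	unsigned long long size;
};

/* Return base + size of the node with the smallest base, independent of
 * the order in which the nodes were iterated. Returns 0 for no nodes. */
unsigned long long rma_top_of(const struct mem_node *nodes, size_t n)
{
	size_t i, lowest = 0;

	if (n == 0)
		return 0;

	for (i = 1; i < n; i++)
		if (nodes[i].base < nodes[lowest].base)
			lowest = i;

	return nodes[lowest].base + nodes[lowest].size;
}
```

Taking the last node visited instead would give the same answer only when there is a single node or when the lowest-based node happens to come last, which are the two cases the commit message lists.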
2017-05-22arm64: kdump: Add support for binary image filesPratyush Anand
This patch adds support for using a binary image, i.e. arch/arm64/boot/Image, with kdump. Signed-off-by: Pratyush Anand <panand@redhat.com> [takahiro.akashi@linaro.org: a bit reworked] Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Tested-by: David Woodhouse <dwmw@amazon.co.uk> Tested-by: Pratyush Anand <panand@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2017-05-22arm64: kdump: add DT properties to crash dump kernel's dtbAKASHI Takahiro
We pass the following properties to crash dump kernel:

linux,elfcorehdr: elf core header segment, same as "elfcorehdr=" kernel parameter on other archs
linux,usable-memory-range: usable memory reserved for crash dump kernel

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Tested-by: David Woodhouse <dwmw@amazon.co.uk> Tested-by: Pratyush Anand <panand@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
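As a rough illustration of what such a property value looks like on the wire, the sketch below packs an address/size pair as big-endian cells, which is how device tree property values are stored. It assumes two 32-bit cells each for address and size and an address-then-size layout; pack_addr_size() is a hypothetical helper, and real code would normally hand the resulting bytes to libfdt rather than use them directly.

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 64-bit value as 8 big-endian bytes (two 32-bit cells). */
static void put_be64(uint8_t *p, uint64_t v)
{
	int i;

	for (i = 0; i < 8; i++)
		p[i] = (uint8_t)(v >> (56 - 8 * i));
}

/* Pack <address size> into `out`, assuming 2 address cells and 2 size
 * cells (an assumption; the real cell counts come from the device tree).
 * Returns the number of bytes written. */
size_t pack_addr_size(uint8_t out[16], uint64_t addr, uint64_t size)
{
	put_be64(out, addr);
	put_be64(out + 8, size);
	return 16;
}
```

The same 16-byte layout would serve either property in this sketch, with the address and size describing the ELF core header segment in one case and the reserved crash kernel memory in the other.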
2017-05-22arm64: kdump: set up other segmentsAKASHI Takahiro
We make sure that all the other segments, initrd and device-tree blob, are also loaded into the reserved memory of crash dump kernel. Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Tested-by: David Woodhouse <dwmw@amazon.co.uk> Tested-by: Pratyush Anand <panand@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>