path: root/kexec/arch/i386
Age | Commit message | Author
2018-01-29 | x86: use old screen_info if needed | Dave Young
With modern drm/kms graphics drivers, kexec-tools does not set up screen_info correctly, so screen output only appears once the drm driver has reinitialized after the reboot. Copying the old screen_info from the original boot_params helped in my tests. It may not work in every case, but it is no worse than before. The same approach is already used by the kernel's kexec_file_load. Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2018-01-24 | kexec-tools: Perform run-time linking of libxenctrl.so | Eric DeVolder
When kexec is utilized in a Xen environment, it has an explicit run-time dependency on libxenctrl.so. This dependency occurs during the configure stage and when building kexec-tools.

When kexec is utilized in a non-Xen environment (either bare metal or KVM), the configure and build of kexec-tools omits any reference to libxenctrl.so.

Thus today it is not currently possible to configure and build a *single* kexec that will work in *both* Xen and non-Xen environments, unless the libxenctrl.so is *always* present.

For example, a kexec configured for Xen in a Xen environment:

    # ldd build/sbin/kexec
        linux-vdso.so.1 => (0x00007ffdeba5c000)
        libxenctrl.so.4.4 => /usr/lib64/libxenctrl.so.4.4 (0x00000038d8000000)
        libz.so.1 => /lib64/libz.so.1 (0x00000038d6c00000)
        libc.so.6 => /lib64/libc.so.6 (0x00000038d6000000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00000038d6400000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00000038d6800000)
        /lib64/ld-linux-x86-64.so.2 (0x000055e9f8c6c000)
    # build/sbin/kexec -v
    kexec-tools 2.0.16

However, the *same* kexec executable fails in a non-Xen environment:

    # copy xen kexec to .
    # ldd ./kexec
        linux-vdso.so.1 => (0x00007fffa9da7000)
        libxenctrl.so.4.4 => not found
        liblzma.so.0 => /usr/lib64/liblzma.so.0 (0x0000003014e00000)
        libz.so.1 => /lib64/libz.so.1 (0x000000300ea00000)
        libc.so.6 => /lib64/libc.so.6 (0x000000300de00000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x000000300e200000)
        /lib64/ld-linux-x86-64.so.2 (0x0000558cc786c000)
    # ./kexec -v
    ./kexec: error while loading shared libraries: libxenctrl.so.4.4: cannot open shared object file: No such file or directory

At Oracle we "workaround" this by having two kexec-tools packages, one for Xen and another for non-Xen environments. At Oracle, the desire is to offer a single kexec-tools package that works in either environment.
To achieve this, kexec-tools would either have to ship with libxenctrl.so (which we have deemed as unacceptable), or we can make kexec perform run-time linking against libxenctrl.so.

This patch is one possible way to alleviate the explicit run-time dependency on libxenctrl.so. This implementation utilizes a set of macros to wrap calls into libxenctrl.so so that the library can instead be opened with dlopen(), the function obtained via dlsym(), and the call then made. The advantage of this implementation is that it requires few changes to the existing kexec-tools code. The disadvantage is that it uses macros to remap libxenctrl functions and do work under the hood.

Another possible implementation worth considering is the approach taken by libvmi. Reference the following file:

    https://github.com/libvmi/libvmi/blob/master/libvmi/driver/xen/libxc_wrapper.h

The libxc_wrapper_t structure definition that starts at line ~33 has members that are function pointers into libxenctrl.so. This structure is populated once and then later referenced/dereferenced by the callers of libxenctrl.so members. The advantage of this implementation is that it is more explicit in managing the use of libxenctrl.so and its versions, but the disadvantage is that it would require touching more of the kexec-tools code.

The following is a list of libxenctrl members utilized by kexec:

    Functions:
        xc_interface_open
        xc_kexec_get_range
        xc_interface_close
        xc_kexec_get_range
        xc_interface_open
        xc_get_max_cpus
        xc_kexec_get_range
        xc_version
        xc_kexec_exec
        xc_kexec_status
        xc_kexec_unload
        xc_hypercall_buffer_array_create
        xc__hypercall_buffer_array_alloc
        xc_hypercall_buffer_array_destroy
        xc_kexec_load
        xc_get_machine_memory_map

    Data:
        xc__hypercall_buffer_HYPERCALL_BUFFER_NULL

These were identified by configuring and building kexec-tools with Xen support, but omitting the -lxenctrl from the LDFLAGS in the Makefile for an x86_64 build. The above libxenctrl members were referenced via these source files.
    kexec/crashdump-xen.c
    kexec/kexec-xen.c
    kexec/arch/i386/kexec-x86-common.c
    kexec/arch/i386/crashdump-x86.c

This patch provides a wrapper around the calls to the above functions in libxenctrl.so. Every libxenctrl call must pass a xc_interface which it obtains from xc_interface_open(). So the existing code is already structured in a manner that facilitates graceful dlopen()'ing of the libxenctrl.so and the subsequent dlsym() of the required member.

The patch creates a wrapper function around xc_interface_open() and xc_interface_close() to perform the dlopen() and dlclose(). For the remaining xc_ functions, this patch defines a macro of the same name which performs the dlsym() and then invokes the function. See the __xc_call() macro for details.

There was one data item in libxenctrl.so that presented a unique problem, HYPERCALL_BUFFER_NULL. It was only utilized once, as

    set_xen_guest_handle(xen_segs[s].buf.h, HYPERCALL_BUFFER_NULL);

I tried a variety of techniques but could not find a general macro-type solution without modifying xenctrl.h. So the solution was to declare a local HYPERCALL_BUFFER_NULL, and this appears to work. I admit I am not familiar enough with libxenctrl to state whether this is a satisfactory workaround, so feedback here is welcome. I can state that this allows kexec to load/unload/kexec on Xen and non-Xen environments that I've tested without issue.

With this patch applied, kexec-tools can be built with Xen support and yet there is no explicit run-time dependency on libxenctrl.so. Thus it can also be deployed in non-Xen environments where libxenctrl.so is not installed.
    # ldd build/sbin/kexec
        linux-vdso.so.1 => (0x00007fff7dbcd000)
        liblzma.so.0 => /usr/lib64/liblzma.so.0 (0x00000038d9000000)
        libz.so.1 => /lib64/libz.so.1 (0x00000038d6c00000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00000038d6400000)
        libc.so.6 => /lib64/libc.so.6 (0x00000038d6000000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00000038d6800000)
        /lib64/ld-linux-x86-64.so.2 (0x0000562dc0c14000)
    # build/sbin/kexec -v
    kexec-tools 2.0.16

This feature/ability is enabled with the following:

    ./configure --with-xen=dl

The previous --with-xen=no and --with-xen=yes still work as before. Not specifying a --with-xen still defaults to --with-xen=yes.

As I've introduced a new build and run-time mode, I've done an extensive matrix of both build-time and run-time checks of kexec with this patch applied. The build-time scenarios are:

    1: configure --with-xen=no and Xen support NOT present
    2: configure --with-xen=no and Xen support IS present
    3: configure --with-xen=yes and Xen support NOT present
    4: configure --with-xen=yes and Xen support IS present
    5: configure --with-xen=dl and Xen support NOT present
    6: configure --with-xen=dl and Xen support IS present

Xen support present requires that configure can find both xenctrl.h and libxenctrl.so.

Then for each of the six scenarios above, the corresponding kexec binary was tested on a Xen system (Oracle's OVS dom0) and a non-Xen system (Oracle Linux).

There are two build-time checks: did kexec build, and did it contain libxenctrl.so? The presence of libxenctrl.so in kexec was checked via ldd. The results were:

    Scenario | Build | libxenctrl.so | Result
    1        | pass  | no            | pass - see Note 1
    2        | pass  | no            | pass - see Note 1
    3        | pass  | no            | pass - see Note 2
    4        | pass  | yes           | pass - see Note 3
    5        | pass  | no            | pass - see Note 2
    6        | pass  | no            | pass - see Note 4

Note 1: This passes since, due to --with-xen=no, there will be no Xen support in kexec and therefore no libxenctrl.so in the kexec.
Note 2: This passes since, while --with-xen=yes is given, configure displays a message indicating that Xen support is disabled and allows kexec to build (this is the same behavior as prior to this patch). And since Xen support is disabled, there is no libxenctrl.so in the kexec.

Note 3: This passes since with --with-xen=yes and configure locating the xenctrl.h and libxenctrl.so, support for Xen was built into kexec. Ldd shows an explicit dependency on the library.

Note 4: This passes since with --with-xen=dl and configure locating the xenctrl.h and libxenctrl.so, support for Xen was built into kexec. However, this uses the new technique introduced by this patch and, as a result, ldd shows that the libxenctrl.so is not an explicit run-time dependency for kexec (rather libdl.so is now an explicit dependency). This is precisely the goal of this patch!

The net effect is that there are now three "flavors" of a kexec binary (prior to this patch there were two):

    a) kexec with no support for Xen [scenarios 1, 2, 3, 5],
    b) kexec with support for Xen and libxenctrl.so as an explicit dependency [scenario 4], and
    c) kexec with support for Xen where libxenctrl.so is NOT an explicit dependency [scenario 6].

The run-time checks are to take each of the six scenarios above and run the corresponding kexec binary on both a Xen system and a non-Xen system.
The test for each kexec scenario was:

    % service kdump stop
    % vi /etc/init.d/kdump
      change KEXEC= to /sbin/kexec-[123456]
    % service kdump start
    # If not FAILED, then below
    % service kdump status
    Kdump is operational
    % rm -fr /var/crash/*
    % echo c > /proc/sysrq-trigger
    # after reboot verify vmcore generated
    % ls -al /var/crash/<tab>

The results were:

    Scenario | Xen environment   | non-Xen environment
    1        | fail - see Note 5 | pass
    2        | fail - see Note 5 | pass
    3        | fail - see Note 6 | pass
    4        | pass              | fail - see Note 7
    5        | fail - see Note 6 | pass
    6        | pass              | pass

Note 5: Due to --with-xen=no, kexec lacks support for Xen and will fail in the Xen environment. This behavior is the same as prior to this patch.

Note 6: Due to the missing xenctrl.h and libxenctrl.so, kexec was built without support for Xen, and thus will fail in the Xen environment. This behavior is the same as prior to this patch.

Note 7: This kexec has the explicit dependency on libxenctrl.so which prevents it from running in a non-Xen environment. This is expected, as this is the original issue this patch is intended to address.

Note that for scenarios 1, 2, 3 and 5 kexec lacks support for Xen, thus these versions are expected to "fail" in a Xen environment. On the flip side, since a non-Xen environment does not need libxenctrl.so, all but scenario 4 are expected to "pass" in a non-Xen environment. The results match these expectations! And, importantly, applying this patch had no adverse impact on kexec build or run-time.

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com> Signed-off-by: Simon Horman <horms@verge.net.au>
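The run-time linking idea in this commit message can be illustrated with a minimal sketch. It is not the patch's __xc_call() macro: lazy_symbol() is a hypothetical helper that opens a library on first use with dlopen() and resolves one symbol with dlsym(), returning NULL instead of failing at program start when the library is absent.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper (not from kexec-tools): open `library` the first time
 * it is needed, cache the handle in *handle, and look up `symbol` in it.
 * Returns NULL when the library or the symbol is unavailable, so a caller
 * can report "no Xen support here" instead of failing to start. */
static void *lazy_symbol(void **handle, const char *library, const char *symbol)
{
    if (*handle == NULL)
        *handle = dlopen(library, RTLD_NOW);
    if (*handle == NULL)
        return NULL;    /* library not installed on this system */
    return dlsym(*handle, symbol);
}
```

A caller would convert the returned pointer to the expected function type before invoking it. On older glibc versions the program has to be linked with -ldl, which matches the libdl.so.2 entry in the ldd output above.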
2017-05-22 | kexec: generalize and rename get_kernel_stext_sym() | Pratyush Anand
get_kernel_stext_sym() has been defined for both arm and i386. Other architectures might need some other kernel symbol address. Therefore rewrite this function as a generic function that can get any kernel symbol address. Moreover, kallsyms is not an arch-specific representation, so provide one common function for all arches. Signed-off-by: Pratyush Anand <panand@redhat.com> [created symbols.c] Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Tested-by: David Woodhouse <dwmw@amazon.co.uk> Tested-by: Pratyush Anand <panand@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2017-03-24 | x86: Support large number of memory ranges | Xunlei Pang
We hit a problem on one SGI 64TB machine: the current kexec-tools failed to work because the number of ranges allowed (MAX_MEMORY_RANGES), which is defined as 1024, is less than the number of ranges on the machine. The kcore header is too small for the same reason. To solve this, this patch simply doubles "MAX_MEMORY_RANGES" and "KCORE_ELF_HEADERS_SIZE". Signed-off-by: Xunlei Pang <xlpang@redhat.com> Tested-by: Frank Ramsay <frank.ramsay@hpe.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2017-03-14 | x86/x86_64: Fix format warning with die() | Pratyush Anand
Fedora koji uses gcc version 7.0.1-0.12.fc27, and it generates a build error:

    kexec/arch/i386/kexec-elf-x86.c:299:3: error: format not a string literal and no format arguments [-Werror=format-security]
       die(error_msg);
       ^~~
    cc1: some warnings being treated as errors

error_msg can itself contain a format specifier. In such cases, since no further arguments are passed for the format, the code will try to access a nonexistent argument. Therefore, use the first argument as a string format specifier and pass error_msg as the string to be printed. While doing that, also add a const qualifier before "char *error_msg". Signed-off-by: Pratyush Anand <panand@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
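The format-security problem described here can be shown with a small sketch. format_msg() is a hypothetical stand-in for a printf-style function such as die(); it writes into a buffer so the effect can be inspected. The point is only that a message of unknown content is passed as an argument to a literal "%s" format, never as the format itself.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical printf-style helper standing in for die(); it formats into
 * a caller-supplied buffer instead of terminating the program. */
static int format_msg(char *buf, size_t len, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, len, fmt, ap);
    va_end(ap);
    return n;
}

/* Safe pattern: error_msg may contain '%' characters, so it is passed as
 * data. Calling format_msg(buf, len, error_msg) instead would make the
 * formatter read arguments that were never supplied. */
static int report_error(char *buf, size_t len, const char *error_msg)
{
    return format_msg(buf, len, "%s", error_msg);
}
```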
2017-03-14 | Don't use %L width specifier with integer values | Philip Prindeville
MUSL doesn't support %L except for floating-point arguments; therefore, %ll must be used instead with integer arguments. Signed-off-by: Philip Prindeville <philipp@redfish-solutions.com> Signed-off-by: Simon Horman <horms@verge.net.au>
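As a small illustration of the portable form, the sketch below formats two unsigned long long values with the ll length modifier. format_range() is a hypothetical helper, not a function from kexec-tools.

```c
#include <stdio.h>

/* Hypothetical helper: print a memory range using the standard "ll" length
 * modifier, which the C standard defines for integer conversions. "L" is
 * only defined for floating-point conversions. */
static int format_range(char *buf, size_t len,
                        unsigned long long start, unsigned long long end)
{
    return snprintf(buf, len, "%016llx-%016llx", start, end);
}
```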
2017-03-10 | Only print debug message when failed to search for kernel symbol from /proc/kallsyms | Baoquan He
The kernel symbol page_offset_base may be unavailable when the mm KASLR code is not compiled into the kernel. It is inappropriate to print an error message when the search for page_offset_base in /proc/kallsyms fails. There currently seems to be no way to find out whether mm KASLR is compiled in. An alternative approach is to print only a debug message in get_kernel_sym when the search for an expected kernel symbol fails. This patch does that as a simple fix. Signed-off-by: Baoquan He <bhe@redhat.com> Reviewed-by: Pratyush Anand <panand@redhat.com> Acked-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2016-12-08 | kexec-tools/x86: get_kernel_vaddr_and_size off-by-one fix | Dave Young
I got the error below while testing kexec -p: "Can't find kernel text map area from kcore" In this case the pt_load start address was the same as stext_sym. The check should really be saddr <= stext_sym so that the pt_load area that includes stext_sym is matched. This was not reported previously because the code later falls back to matching the pt_load areas against the hardcoded X86_64__START_KERNEL_map, which sometimes succeeds because of kernel address randomization. With this change, according to my test, the stext_sym check is guaranteed to fall into the right pt_load area if we get the correct stext_sym. Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
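The off-by-one can be made concrete with a minimal sketch of a half-open range check. segment_contains() is a hypothetical helper, and the parameter names only mirror the saddr and stext_sym names used in the message.

```c
#include <stdbool.h>

/* Hypothetical helper: does the segment [saddr, saddr + size) contain sym?
 * The lower bound must be inclusive (<=); with a strict "<" a symbol that
 * sits exactly at the start of the segment is missed. */
static bool segment_contains(unsigned long long saddr,
                             unsigned long long size,
                             unsigned long long sym)
{
    unsigned long long eaddr = saddr + size;

    return saddr <= sym && sym < eaddr;
}
```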
2016-10-07 | multiboot: Use the "reserved" type for non-ram zones | Sylvain Munaut
Seems that Xen actually checks for some zones to be 'reserved' and complains if they are not. This also matches what the bios uses at boot. Signed-off-by: Sylvain Munaut <s.munaut@whatever-company.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2016-10-07 | multiboot: Fix length computation for the memory zones | Sylvain Munaut
Signed-off-by: Sylvain Munaut <s.munaut@whatever-company.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2016-09-29 | kexec/arch/i386: Add support for KASLR memory randomization | Thomas Garnier
Multiple changes were made to KASLR (right now in linux-next). One of them randomizes the virtual addresses of the physical mapping, vmalloc and vmemmap memory sections. This breaks kdump's ability to read physical memory. This change detects whether KASLR memory randomization is in use by checking whether the page_offset_base variable exists. It then searches for the correct PAGE_OFFSET value by looking at the loaded memory sections and finding the lowest address aligned on PUD (the randomization level). Related commits on linux-next: - 0483e1fa6e09d4948272680f691dccb1edb9677f: Base for randomization - 021182e52fe01c1f7b126f97fd6ba048dc4234fd: Enable for PAGE_OFFSET Signed-off-by: Thomas Garnier <thgarnie@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
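The search for the lowest PUD-aligned address can be sketched as follows. This is an illustration, not the code from the commit: lowest_pud_aligned() is a hypothetical helper, and PUD_SHIFT is assumed to be 30 (1 GiB), the value used by x86_64 page tables.

```c
#include <stddef.h>

/* Assumption: x86_64 page tables, where one PUD entry covers 1 GiB. */
#define PUD_SHIFT 30
#define PUD_SIZE  (1ULL << PUD_SHIFT)

/* Hypothetical helper: among the virtual start addresses of the loaded
 * memory sections, return the lowest one that is PUD aligned, or
 * `fallback` if none is. */
static unsigned long long lowest_pud_aligned(const unsigned long long *vaddrs,
                                             size_t n,
                                             unsigned long long fallback)
{
    unsigned long long best = 0;
    int found = 0;

    for (size_t i = 0; i < n; i++) {
        if (vaddrs[i] & (PUD_SIZE - 1))
            continue;               /* not aligned on a PUD boundary */
        if (!found || vaddrs[i] < best) {
            best = vaddrs[i];
            found = 1;
        }
    }
    return found ? best : fallback;
}
```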
2016-03-24 | Pass struct mem_sym into machine_apply_elf_rel() | Anton Blanchard
On PowerPC64 ABIv2 we need to look at the symbol to determine if it has a local entry point. Pass struct mem_sym into machine_apply_elf_rel() so we can. Signed-off-by: Anton Blanchard <anton@samba.org> Tested-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2015-12-01 | Revert "crashdump/x86: Add option to get crash kernel region size" | Simon Horman
This reverts commit 8a1aa35a1077b42bc2a2afb05d24b637e1edf2a1.
2015-11-30 | crashdump/x86: Add option to get crash kernel region size | Daniel Kiper
Crash kernel region size is available via sysfs on Linux running on bare metal. However, this does not work when Linux runs as Xen dom0. In this case the Xen crash kernel region size should be established using the __HYPERVISOR_kexec_op hypercall (Linux kernel kexec functionality does not make a lot of sense in Xen dom0). Sadly, hypercalls are not easily accessible from shell scripts or similar tools. Potentially we could check the "xl dmesg" output for the crashkernel option, but this is not nice. So, let's add this functionality, for Linux running on bare metal and as Xen dom0, to kexec-tools. This way kdump scripts can establish the crash kernel region size in one way regardless of platform. The whole burden of platform detection lies on kexec-tools. The figure (and unit) displayed by this new kexec-tools functionality is the same as the one taken from /sys/kernel/kexec_crash_size. This functionality is available on the x86 platform only. If the idea is acceptable, then I can prepare patches for other platforms (if it is possible and makes sense) and repost them as a fully fledged patch series. Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2015-11-27 | x86: Make sure E820_PM[AE]M are defined if needed | Simon Horman
It appears that (older?) revisions of xenctrl.h define all of the E820_* values used in kexec-x86-common.c except E820_PRAM and E820_PMEM. This results in a build failure when building against libxenctrl. Avoid this problem by providing local definitions of those values. It seems reasonable to do so in kexec-x86-common.c, as currently that is the only source file that uses the values in question. Fixes: 56a12abc1df1 ("kexec: fix mmap return code handling") Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com> Cc: Petr Tesarik <ptesarik@suse.com> Cc: Baoquan He <bhe@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
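Providing local fallback definitions usually takes the form of #ifndef guards. The sketch below assumes the persistent-memory types are E820_PMEM (7) and E820_PRAM (12), the values used by the Linux x86 e820 UAPI header; e820_is_persistent() is a hypothetical helper added only to show the constants in use.

```c
/* Fallback definitions: only take effect when an included header (for
 * example an older xenctrl.h) did not already define these constants.
 * Values assumed from the Linux x86 e820 UAPI header. */
#ifndef E820_PMEM
#define E820_PMEM 7
#endif
#ifndef E820_PRAM
#define E820_PRAM 12
#endif

/* Hypothetical helper: is this E820 type one of the persistent-memory types? */
static int e820_is_persistent(int type)
{
    return type == E820_PMEM || type == E820_PRAM;
}
```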
2015-10-06 | kexec-tools: fix build error with glibc 2.19 and earlier version | Dave Young
kexec-tools build fails on my laptop with RHEL7.1 installed:

    gcc -g -O2 -fno-strict-aliasing -Wall -Wstrict-prototypes -I./include -I./util_lib/include -Iinclude/ -I./kexec/arch/x86_64/include -c -MD -o kexec/arch/i386/kexec-x86-common.o kexec/arch/i386/kexec-x86-common.c
    In file included from kexec/arch/i386/kexec-x86-common.c:36:0:
    kexec/arch/i386/../../kexec.h:19:2: error: #error BYTE_ORDER not defined
     #error BYTE_ORDER not defined
      ^
    kexec/arch/i386/../../kexec.h:23:2: error: #error LITTLE_ENDIAN not defined
     #error LITTLE_ENDIAN not defined
      ^
    kexec/arch/i386/../../kexec.h:27:2: error: #error BIG_ENDIAN not defined
     #error BIG_ENDIAN not defined
      ^
    In file included from kexec/arch/i386/kexec-x86-common.c:37:0:
    kexec/arch/i386/../../kexec-syscall.h: In function ‘kexec_load’:
    kexec/arch/i386/../../kexec-syscall.h:74:2: warning: implicit declaration of function ‘syscall’ [-Wimplicit-function-declaration]
      return (long) syscall(__NR_kexec_load, entry, nr_segments, segments, flags);
      ^
    make: *** [kexec/arch/i386/kexec-x86-common.o] Error 1

The build error was introduced by the commit below:

    commit c9c21cc107dcc9b6053e39ead1069e03717513f9
    Author: Baoquan He <bhe@redhat.com>
    Date: Thu Aug 6 19:10:55 2015 +0800

        kexec: use _DEFAULT_SOURCE instead to remove compiling warning

        Now compiling will print warning like below. Change code as it suggested.

        # warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE"
          ^

See manpage: http://man7.org/linux/man-pages/man7/feature_test_macros.7.html

_BSD_SOURCE has been deprecated since glibc 2.20. To allow code that requires _BSD_SOURCE in glibc 2.19 and earlier and _DEFAULT_SOURCE in glibc 2.20 and later to compile without warnings, define both _BSD_SOURCE and _DEFAULT_SOURCE. Thus fix it by adding back _BSD_SOURCE along with _DEFAULT_SOURCE. Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
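The fix amounts to defining both feature test macros before the first system header is included. A minimal sketch, assuming a glibc-style <endian.h> that provides BYTE_ORDER and LITTLE_ENDIAN under these macros; host_is_little_endian() is a hypothetical helper used only to show that the macros are visible.

```c
/* Define both macros before any system header: _BSD_SOURCE for glibc 2.19
 * and earlier, _DEFAULT_SOURCE for glibc 2.20 and later. The guards avoid
 * redefinition if the build system already passes them with -D. */
#ifndef _BSD_SOURCE
#define _BSD_SOURCE
#endif
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif

#include <endian.h>

/* Hypothetical helper: returns 1 on a little-endian host, 0 otherwise. */
static int host_is_little_endian(void)
{
    return BYTE_ORDER == LITTLE_ENDIAN;
}
```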
2015-10-06 | Load crash kernel high on x86 | Petr Tesarik
There may be more than one crash kernel region on x86. Currently, kexec-tools picks the largest one. If the high reservation is smaller than the low one, it will try to load the panic kernel low. However, the kexec syscall checks that the target address is within crashk_res boundaries, so attempts to load the crash kernel low result in -EADDRNOTAVAIL, and kexec prints out this error message:

    kexec_load failed: Cannot assign requested address

Looking at the logic in arch/x86/kernel/setup.c, there are only two possible layouts:

    1. crashk_res is below 4G, and there is only one region,
    2. crashk_res is above 4G, and crashk_low_res is below 4G

In either case, kexec-tools must pick the highest region.

Changelog:
    * v3: rename function to get_crash_kernel_load_range
    * v2: remove unnecessary local variables

Signed-off-by: Petr Tesarik <ptesarik@suse.com> Signed-off-by: Simon Horman <horms@verge.net.au>
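Picking the highest region rather than the largest one can be sketched like this. The struct and function names are hypothetical and are not the get_crash_kernel_load_range() implementation from the patch.

```c
#include <stddef.h>

/* Hypothetical region type with inclusive bounds. */
struct mem_region {
    unsigned long long start;
    unsigned long long end;
};

/* Return the region with the highest start address, regardless of its
 * size, or NULL if the list is empty. */
static const struct mem_region *highest_region(const struct mem_region *regions,
                                               size_t n)
{
    const struct mem_region *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (best == NULL || regions[i].start > best->start)
            best = &regions[i];
    }
    return best;
}
```

With a 256 MiB reservation below 4G and a 128 MiB reservation above 4G, this selects the smaller, higher one, which is the case the message describes.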
2015-09-02 | Add persistent memory support | Baoquan He
The kernel added the E820_PRAM and E820_PMEM types for NVDIMM memory devices. Support them in kexec too. Reported-by: Toshi Kani <toshi.kani@hp.com> Tested-by: Toshi Kani <toshi.kani@hp.com> Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2015-08-13 | kexec: use _DEFAULT_SOURCE instead to remove compiling warning | Baoquan He
Compiling now prints the warning below. Change the code as it suggests.

    # warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE"
      ^

Signed-off-by: Baoquan He <bhe@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2015-03-19 | x86: Remove unnecessary res variable from efi_map_added | Simon Horman
gcc 4.9.1 tells me this variable is set but unused Signed-off-by: Simon Horman <horms@verge.net.au>
2015-02-25 | kexec: iomem: fix callbacks params for sh and x86 archs | Roman Pen
Commit 4362bfac changes params for kexec_iomem_for_each_line from 'unsigned long' to 'unsigned long long'. This patch fixes forgotten changes for sh and x86 archs. Bug causes incorrect parsing of memory ranges. Signed-off-by: Roman Pen <r.peniaev@gmail.com> Cc: kexec@lists.infradead.org Signed-off-by: Simon Horman <horms@verge.net.au>
2015-02-25 | i386: elf: Fix -Wunused-but-set-variable compilation warning | Ameya Palande
kexec/arch/i386/kexec-elf-x86.c:97:6: warning: variable ‘modified_cmdline_len’ set but not used [-Wunused-but-set-variable] int modified_cmdline_len; Signed-off-by: Ameya Palande <2ameya@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2015-02-25 | i386: bzImage: Fix -Wunused-but-set-variable compilation warning | Ameya Palande
kexec/arch/i386/kexec-bzImage.c:111:8: warning: variable ‘kernel_version’ set but not used [-Wunused-but-set-variable] Signed-off-by: Ameya Palande <2ameya@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2015-02-12 | multiboot: fix crash on NULL kernel command line | Ameya Palande
If "--command-line" option is not specified, then kexec segfaults while dereferencing NULL command line string pointer. While we are at it, also fix indentation and use '{' and '}' consistently. Signed-off-by: Ameya Palande <2ameya@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2014-05-30 | kdump: pass acpi_rsdp to 2nd kernel if kernel does not export efi runtime maps | Dave Young
If the kernel does not export efi runtime maps, it means that 1:1 mapping does not work or that the user explicitly booted with efi=old_map. In this case the efi setup code will fall back to a noefi boot, but for the kdump case we still need to pass the extra acpi_rsdp cmdline. Thus add a check in the kdump path. Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2014-05-28 | kexec-tools: usage text fix | Dave Young
There's one more '-' in arch_usage, thus s/pass--memmap-cmdline/pass-memmap-cmdline Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2014-05-28 | kexec-tools: add noefi arch cmdline option | Dave Young
Kernels booted with efi=old_map, and some quirked machines like SGI UV, use the old ioremap instead of 1:1 mapping. But kexec efi support depends on the 1:1 mapping, so in those cases we need to switch to the old way. There's a kernel patch for exporting the efi flags so we can check the memory mapping method. But the user may also want to explicitly disable efi boot for unknown reasons. So add a new arch option '--noefi' for this case. Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2014-05-22 | x86, cleanup: remove cmdline_add_memmap_acpi | WANG Chao
In kdump path, now we store all the 2nd kernel memory ranges in memmap_p. We could use just cmdline_add_memmap() to add all types of memory ranges to 2nd kernel cmdline. So clean up here, merge cmdline_add_memmap_acpi() into cmdline_add_memmap(). Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2014-05-11 | cleanup duplicate code | WANG Chao
I accidentally added one duplicate line. Now remove it. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2014-05-11 | condition check fix | WANG Chao
In commit 91f5b9c ("kdump: pass e820 reserved region to 2nd kernel via e820 table or setup_data"), I made a wrong condition check. We should only add cmdline for a memory range if --pass-memmap-cmdline and the range type isn't RANGE_RESERVED. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2014-04-29 | kdump: pass e820 reserved region to 2nd kernel via e820 table or setup_data | WANG Chao
e820 reserved regions can be useful in the 2nd kernel. For example, PCI mmconf (extended mode) requires a reserved region, otherwise it falls back to legacy mode. The following log is from Cliff Wickman <cpw@sgi.com>:

    PCI: MMCONFIG for domain 1003 [bus 3f-3f] at [mem 0xff0ff00000-0xff0fffffff] (base 0xff0c000000)
    [Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x80cfffff] not reserved in ACPI motherboard resources
    PCI: not using MMCONFIG

PCI devices on segment 1 (>0) can't fall back to legacy mode, thus kernel probing fails and the device can't be found. We didn't pass reserved regions because there could be too many of them, eating up our very limited kernel command line space in the memmap=exactmap case. However, now we use the e820 map and setup_data to pass the memory map to the 2nd kernel, so the number of reserved regions should not be a problem any more. Signed-off-by: WANG Chao <chaowang@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2014-04-23 | x86: Pass memory range via E820 for kdump | WANG Chao
The command line size is restricted by the kernel, and sometimes memmap=exactmap has too many memory ranges to pass on the cmdline. Also, memmap=exactmap and kASLR don't work together. A better approach for passing the memory ranges the crash kernel boots into is to fill them into the E820 map. boot_params only has 128 slots for the E820 map, so when the number of memory map entries exceeds 128, use setup_data to pass the rest as an extended E820 memory map. kexec boot can also benefit from setup_data in case the E820 memory map exceeds 128 entries. This new approach now becomes the default instead of memmap=exactmap. saved_max_pfn users can specify --pass-memmap-cmdline to use the exactmap approach. Signed-off-by: WANG Chao <chaowang@redhat.com> Tested-by: Linn Crosetto <linn@hp.com> Reviewed-by: Linn Crosetto <linn@hp.com> Acked-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
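The 128-slot limit can be illustrated with a small arithmetic sketch: the first 128 ranges go into boot_params and any remainder is passed via setup_data. The macro name, struct and function below are hypothetical and do not come from kexec-tools.

```c
/* Hypothetical name for the number of E820 slots in boot_params (128,
 * as stated in the commit message). */
#define E820_BOOT_PARAMS_SLOTS 128

/* How many ranges fit in boot_params and how many overflow to setup_data. */
struct e820_split {
    int in_boot_params;
    int in_setup_data;
};

static struct e820_split split_e820(int nr_ranges)
{
    struct e820_split s;

    s.in_boot_params = nr_ranges < E820_BOOT_PARAMS_SLOTS
                           ? nr_ranges : E820_BOOT_PARAMS_SLOTS;
    s.in_setup_data = nr_ranges - s.in_boot_params;
    return s;
}
```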
2014-04-23 | x86: add --pass-memmap-cmdline option | WANG Chao
--pass-memmap-cmdline is used to pass the memmap=exactmap cmdline to the 2nd kernel. Later we will use this option to disable the E820 memmap passing method and use the old exactmap method instead. Signed-off-by: WANG Chao <chaowang@redhat.com> Tested-by: Linn Crosetto <linn@hp.com> Acked-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2014-04-23 | x86, cleanup: kexec memory range .end to be inclusive | WANG Chao
Later kexec and kdump memory range will be mapped to E820entry. But currently kexec memory range .end field is exclusive while crash memory range is inclusive. Given the fact that the exported proc iomem and sysfs memmap are both inclusive, change kexec memory range .end to be inclusive. Later the unified memory range of both kexec and kdump can use the same E820 filling code. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Dave Young <dyoung@redhat.com> Tested-by: Linn Crosetto <linn@hp.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2014-04-23 | x86, cleanup: Store crash memory ranges kexec_info | WANG Chao
Add two new members to kexec_info structure:

    struct memory_range *crash_range
    int nr_crash_ranges;

crash_range contains the memory ranges used to boot 2nd kernel. nr_crash_ranges contains the count of the crash memory ranges. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Dave Young <dyoung@redhat.com> Tested-by: Linn Crosetto <linn@hp.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2014-04-23 | x86, cleanup: use dbgprint_mem_range for memory range debugging | WANG Chao
Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Dave Young <dyoung@redhat.com> Tested-by: Linn Crosetto <linn@hp.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2014-04-23 | x86, cleanup: increase CRASH_MAX_MEMMAP_NR up to 1024 | WANG Chao
CRASH_MAX_MEMMAP_NR is used as the upper bound of memmap_p. Originally memmap_p was used to store RANGE_RAM only. But now we have changed it to store all the types of memory ranges for the 2nd kernel, which include RANGE_RAM, RANGE_ACPI, RANGE_ACPI_NVS (and RANGE_RESERVED in the future). Currently CRASH_MAX_MEMMAP_NR is defined as (KEXEC_MAX_SEGMENTS + 2), which is not enough for memmap_p. It must be increased to a much higher value. I think 1024 is good enough for storing all memory ranges for the 2nd kernel. So this patch increases CRASH_MAX_MEMMAP_NR to 1024. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Dave Young <dyoung@redhat.com> Tested-by: Linn Crosetto <linn@hp.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2014-04-23 | x86, cleanup: add other types of memory range for 2nd kernel boot to memmap_p | WANG Chao
In load_crashdump_segments(), memmap_p[] is used to hold the RANGE_RAM memory ranges for booting the 2nd kernel. Now add the RANGE_ACPI and RANGE_ACPI_NVS types to memmap_p, so that later we can pass all the types of memory range to the 2nd kernel. All of these memory range types are stored in memmap_p for later reference. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Dave Young <dyoung@redhat.com> Tested-by: Linn Crosetto <linn@hp.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2014-04-23 | x86, cleanup: add_memmap() only do alignment check on RANGE_RAM | WANG Chao
Besides RANGE_RAM, add_memmap() will also add memory ranges of type RANGE_ACPI and RANGE_ACPI_NVS (and RANGE_RESERVED in the future) to memmap_p. Among these types of memory range, only RANGE_RAM needs to be aligned to a certain alignment. RANGE_ACPI, RANGE_ACPI_NVS and RANGE_RESERVED don't have to be aligned. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Dave Young <dyoung@redhat.com> Tested-by: Linn Crosetto <linn@hp.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2014-04-23x86, cleanup: add extra arguments to add_memmap() and delete_memmap()WANG Chao
This change will be used later: add_memmap(.., int *nr_memmap, .., int type); delete_memmap(.., int *nr_memmap, ..); memmap_p[] is statically allocated with a fixed number of entries. It will be used later when mapping these memory ranges to the e820 map. It's convenient to keep track of the number of memmap_p entries (nr_memmap) in add_memmap() and delete_memmap(), because these two functions already do the counting. The original add_memmap() can only add memory ranges of type RANGE_RAM. To add other types of memory range, add another argument that indicates the type. Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Dave Young <dyoung@redhat.com> Tested-by: Linn Crosetto <linn@hp.com> Signed-off-by: Simon Horman <horms@verge.net.au>
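The idea of letting the add and delete helpers maintain the entry count can be sketched as follows. This is a simplified illustration rather than the real add_memmap()/delete_memmap(): the names `sketch_add`, `sketch_delete`, `struct sketch_range` and `SKETCH_MAX_MEMMAP` are hypothetical, and deletion here is by index, which is an assumption made to keep the example short.

```c
#include <stdint.h>

/* Hypothetical capacity of the statically sized map. */
#define SKETCH_MAX_MEMMAP 8

struct sketch_range {
    uint64_t start;
    uint64_t end;
    int type;
};

/* Append a typed range and bump the caller's counter; -1 if the map is full. */
int sketch_add(struct sketch_range *map, int *nr,
               uint64_t start, uint64_t end, int type)
{
    if (*nr >= SKETCH_MAX_MEMMAP)
        return -1;
    map[*nr] = (struct sketch_range){ start, end, type };
    (*nr)++;
    return 0;
}

/* Remove the entry at idx, keep the remaining order, decrement the counter. */
int sketch_delete(struct sketch_range *map, int *nr, int idx)
{
    if (idx < 0 || idx >= *nr)
        return -1;
    for (int i = idx; i < *nr - 1; i++)
        map[i] = map[i + 1];
    (*nr)--;
    return 0;
}
```

Because the counter is updated in the same place the array is modified, callers never have to recount entries before converting the map to another format.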
2014-04-14x86, cleanup: Add a function add_setup_data()WANG Chao
add_setup_data() is used to add an instance to the singly linked list of setup_data structures. Signed-off-by: WANG Chao <chaowang@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
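A minimal sketch of adding a node to such a list is shown below. The struct layout follows the Linux x86 boot protocol's `struct setup_data`. The helper name `sketch_add_setup_data`, its signature and the choice to insert at the head of the list are assumptions for illustration, not the function from this commit.

```c
#include <stdint.h>

/* Layout as described by the Linux x86 boot protocol. */
struct setup_data {
    uint64_t next;   /* physical address of the next node, 0 ends the list */
    uint32_t type;
    uint32_t len;    /* length of data[] in bytes */
    uint8_t  data[];
};

/*
 * Hypothetical helper: push the node `sd`, which will live at physical
 * address `sd_addr` in the new kernel, onto the front of the list whose
 * head pointer is `*head`.
 */
void sketch_add_setup_data(uint64_t *head, struct setup_data *sd,
                           uint64_t sd_addr)
{
    sd->next = *head;   /* the new node points at the old head */
    *head = sd_addr;    /* the list now starts at the new node */
}
```

Inserting at the head keeps the operation O(1) and needs no traversal of nodes that may already have been copied elsewhere.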
2014-04-14x86, cleanup: fix indentWANG Chao
Signed-off-by: WANG Chao <chaowang@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2014-03-28x86, kaslr: add alternative way to locate kernel text mapping areaWANG Chao
When kASLR is enabled (CONFIG_RANDOMIZE_BASE=y), the kernel text mapping base is randomized. The maximum base offset of this randomization is configured at compile time through CONFIG_RANDOMIZE_BASE_MAX_OFFSET (1G by default). Currently kexec-tools uses the hard-coded macros X86_64__START_KERNEL_map (0xffffffff80000000) and X86_64_KERNEL_TEXT_SIZE (512M) to determine the kernel text mapping from kcore's PT_LOAD. With kASLR, the mapping changes to the following: ffffffff80000000 - (ffffffff80000000+CONFIG_RANDOMIZE_BASE_MAX_OFFSET) As Vivek suggested, we can get the address of the _stext kernel symbol from /proc/kallsyms and search for the kcore PT_LOAD segment that contains _stext. That segment represents the kernel mapping area. Try this method first to find the kernel text mapping. If it fails for whatever reason, fall back to the old way. Suggested-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: WANG Chao <chaowang@redhat.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
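The lookup described here has two steps: read the address of _stext from a /proc/kallsyms-style line, then pick the PT_LOAD segment that contains it. A minimal sketch of both steps follows. The names `sketch_parse_kallsyms_line`, `sketch_find_segment` and `struct sketch_segment` are hypothetical, and segments are modelled as plain (vaddr, memsz) pairs rather than real ELF program headers.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for a PT_LOAD program header. */
struct sketch_segment {
    uint64_t vaddr;
    uint64_t memsz;
};

/*
 * Parse one "address type name" line as found in /proc/kallsyms.
 * Returns 1 and stores the address if the line names `sym`, else 0.
 */
int sketch_parse_kallsyms_line(const char *line, const char *sym,
                               uint64_t *addr)
{
    unsigned long long value;
    char type;
    char name[128];

    if (sscanf(line, "%llx %c %127s", &value, &type, name) != 3)
        return 0;
    if (strcmp(name, sym) != 0)
        return 0;
    *addr = value;
    return 1;
}

/* Return the index of the segment that contains addr, or -1 if none does. */
int sketch_find_segment(const struct sketch_segment *segs, int n,
                        uint64_t addr)
{
    for (int i = 0; i < n; i++) {
        if (addr >= segs[i].vaddr && addr - segs[i].vaddr < segs[i].memsz)
            return i;
    }
    return -1;
}
```

A caller would treat a return of -1 as the signal to fall back to the fixed base and size. Note that /proc/kallsyms may report zero addresses to unprivileged readers, which is one case where such a fallback would be needed.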
2014-03-28i386: fix erroneous memory descriptor messageTony Jones
On non-EFI systems, the efi_info section of boot_params is zero-filled, resulting in an erroneous message from kexec regarding the "efi memory descriptor" version. Caused by commit: e1ffc9e9a0769e1f54185003102e9bec428b84e8 "Passing efi related data via setup_data"
0000700 0000 0000 0000 0000 0000 0000 0000 0000
0000720 0000 0000 0000 0000 0000 0000 0000 0000
0000740
efi memory descriptor version 0 is not supported!
Signed-off-by: Tony Jones <tonyj@suse.de> Acked-by: Dave Young <dyoung@redhat.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2014-03-25kexec-tools: handle 64bit efi memmap address correctlyDave Young
When crashkernel=xM,high is used, crashkernel memory is allocated from the top down, so the usable memory for the kdump kernel can lie above 4G. The efi memmap address is stored as two 32-bit values, efi_memmap and efi_memmap_hi. Previously only efi_memmap was passed, so for a high memory address the 2nd kernel panics as follows:
[ 0.000000] efi: EFI v2.31 by American Megatrends
[ 0.000000] efi: ACPI 2.0=0xdb752000 SMBIOS=0xdbab4b98 ACPI=0xdb752000 MPS=0xf4bd0
[ 0.000000] efi: mem00: type=4294967295, attr=0xffffffffffffffff, range=[0xffffffffffffffff-0xffffffffffffefff) (72057594037927935)
[ 0.000000] efi: mem01: type=4294967295, attr=0xffffffffffffffff, range=[0xffffffffffffffff-0xffffffffffffefff) (72057594037927935)
[ 0.000000] efi: mem02: type=4294967295, attr=0xffffffffffffffff, range=[0xffffffffffffffff-0xffffffffffffefff) (72057594037927935)
[ 0.000000] efi: mem03: type=4294967295, attr=0xffffffffffffffff, range=[0xffffffffffffffff-0xffffffffffffefff) (72057594037927935)
[ 0.000000] efi: mem04: type=4294967295, attr=0xffffffffffffffff, range=[0xffffffffffffffff-0xffffffffffffefff) (72057594037927935)
[ 0.000000] SMBIOS 2.7 present.
[snip]
[ 0.082451] BUG: unable to handle kernel paging request at ffffa3d0f0000000
[ 0.089467] IP: [<ffffffff810513d1>] native_set_pte+0x1/0x10
[ 0.095157] PGD 0
[ 0.097197] Oops: 0002 [#1] SMP
[ 0.100466] Modules linked in:
[ 0.103554] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.0-rc7 #157
[ 0.110001] Hardware name: Hewlett-Packard HP Z420 Workstation/1589, BIOS J61 v03.15 05/09/2013
[ 0.118697] task: ffffffff818e1460 ti: ffffffff818ce000 task.ti: ffffffff818ce000
[ 0.126181] RIP: 0010:[<ffffffff810513d1>] [<ffffffff810513d1>] native_set_pte+0x1/0x10
[ 0.134296] RSP: 0000:ffffffff818cfc80 EFLAGS: 00010287
[ 0.139609] RAX: 0000000000000000 RBX: ffffa3d0f0000000 RCX: 00003ffffffff000
[ 0.146744] RDX: ffff880000000000 RSI: 0000000000000000 RDI: ffffa3d0f0000000
[ 0.153879] RBP: ffffffff818cfcb8 R08: ffffea0010745d20 R09: 0000000000000000
[ 0.161013] R10: ffff88041f731fc0 R11: 000000000000001e R12: 0000000000200000
[ 0.168148] R13: 0000000000000000 R14: 0000000000400000 R15: ffff880000000008
[ 0.175288] FS: 0000000000000000(0000) GS:ffff88041f200000(0000) knlGS:0000000000000000
[ 0.183377] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.189125] CR2: ffffa3d0f0000000 CR3: 000000041e8da000 CR4: 00000000000406b0
[ 0.196264] Stack:
[ 0.198283] ffffffff818cfcb8 ffffffff810561d7 ffff880000000008 0000000000400000
[ 0.205746] ffff880000001000 00000000000001ff ffff88041e8de000 ffffffff818cfd00
[ 0.213210] ffffffff8105644e 0000000000200000 0000000040000000 00000000ffffffff
[ 0.220676] Call Trace:
[ 0.223130] [<ffffffff810561d7>] ? unmap_pte_range+0x77/0x110
[ 0.228966] [<ffffffff8105644e>] unmap_pmd_range+0xde/0x210
[ 0.234630] [<ffffffff81056c6b>] __cpa_process_fault+0x48b/0x5e0
[ 0.240730] [<ffffffff81057276>] __change_page_attr_set_clr+0x4b6/0xb10
[ 0.247437] [<ffffffff810557c7>] ? __ioremap_caller+0x277/0x360
[ 0.253454] [<ffffffff810589f1>] kernel_map_pages_in_pgd+0x71/0xa0
[ 0.259736] [<ffffffff81a53361>] __map_region+0x45/0x63
[ 0.265051] [<ffffffff81a535cc>] efi_map_region_fixed+0xd/0xf
[ 0.270886] [<ffffffff81a52f19>] efi_enter_virtual_mode+0x5a/0x3d9
[ 0.277162] [<ffffffff81a77516>] ? acpi_enable_subsystem+0x37/0x90
[ 0.283440] [<ffffffff81a36eb9>] start_kernel+0x386/0x41c
[ 0.288931] [<ffffffff81a3693c>] ? repair_env_string+0x5c/0x5c
[ 0.294852] [<ffffffff81a36120>] ? early_idt_handlers+0x120/0x120
[ 0.301035] [<ffffffff81a365ee>] x86_64_start_reservations+0x2a/0x2c
[ 0.307479] [<ffffffff81a3672e>] x86_64_start_kernel+0x13e/0x14d
[ 0.313572] Code: 66 2e 0f 1f 84 00 00 00 00 00 48 8b 46 18 55 48 89 e5 48 89 47 04 5d c3 66 90 55 48 89 e5 0f 01 f8 5d c3 0f 1f 8
[ 0.333545] RIP [<ffffffff810513d1>] native_set_pte+0x1/0x10
[ 0.339312] RSP <ffffffff818cfc80>
[ 0.342807] CR2: ffffa3d0f0000000
[ 0.346141] ---[ end trace 86088f739725b8c6 ]---
[ 0.350760] Kernel panic - not syncing: Fatal exception
Fix this by passing both efi_memmap and efi_memmap_hi to the 2nd kernel. Reported-by: Linn Crosetto <linn@hp.com> Signed-off-by: Dave Young <dyoung@redhat.com> Tested-by: Linn Crosetto <linn@hp.com> Signed-off-by: Simon Horman <horms@verge.net.au>
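The arithmetic behind the fix is splitting a 64-bit physical address into a low and a high 32-bit half, and joining them again on the receiving side. A minimal sketch follows. The helper names `sketch_split_addr` and `sketch_join_addr` are hypothetical, and the two output fields only stand in for efi_memmap and efi_memmap_hi.

```c
#include <stdint.h>

/* Split a 64-bit address into the low and high 32-bit halves. */
void sketch_split_addr(uint64_t addr, uint32_t *lo, uint32_t *hi)
{
    *lo = (uint32_t)(addr & 0xffffffffULL);
    *hi = (uint32_t)(addr >> 32);
}

/* Rebuild the 64-bit address from its two halves. */
uint64_t sketch_join_addr(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}
```

If only the low half is passed on, any address at or above 4G is silently truncated, which matches the kind of garbage memmap entries shown in the log.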
2014-03-20cleanup: add dbgprint_mem_range functionWANG Chao
dbgprint_mem_range prints the given memory range when debug mode is enabled. Signed-off-by: WANG Chao <chaowang@redhat.com> Tested-by: Linn Crosetto <linn@hp.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2014-02-06i386: fix redefinition error for e820entryTony Jones
At least on our systems, xenctrl.h defines (unguarded) struct e820entry. Move the (guarded) definition in include/x86/x86-linux.h to below. Signed-off-by: Tony Jones <tonyj@suse.de> Signed-off-by: Simon Horman <horms@verge.net.au>
2014-02-06i386: fix build failure (bzImage_support_efi_boot)Tony Jones
Commit 9c200a85de2245a850546fded96a1977b84ad24d referenced 'bzImage_support_efi_boot' without a matching 32-bit definition. Signed-off-by: Tony Jones <tonyj@suse.de> Signed-off-by: Simon Horman <horms@verge.net.au>
2014-01-21Passing efi related data via setup_dataDave Young
To support efi runtime, several efi physical addresses (fw_vendor, runtime, config tables, smbios) and the whole runtime mapping info need to be available in the kexec kernel. Thus introduce the setup_data struct for passing this data. Collect the variables from /sys/firmware/efi/systab and /sys/firmware/efi/runtime-map. Signed-off-by: Dave Young <dyoung@redhat.com> Tested-by: Toshi Kani <toshi.kani@hp.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2014-01-21Add efi_info in x86 setup headerDave Young
To support efi runtime in the kexec kernel we need to fill the efi_info struct in setup_header. The info is read from the kernel-exported boot_params data in debugfs. Signed-off-by: Dave Young <dyoung@redhat.com> Tested-by: Toshi Kani <toshi.kani@hp.com> Signed-off-by: Simon Horman <horms@verge.net.au>