With modern drm/kms graphics drivers, kexec-tools does not set up
screen_info correctly, so screen output only appears once those drm
drivers have reinitialized after the reboot. Copying the old screen_info
from the original boot_params helped in my tests. It may not work in
every case, but it is no worse than before. The kernel's
kexec_file_load already uses this approach.
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
When kexec is utilized in a Xen environment, it has an explicit
run-time dependency on libxenctrl.so. This dependency occurs
during the configure stage and when building kexec-tools.
When kexec is utilized in a non-Xen environment (either bare
metal or KVM), the configure and build of kexec-tools omits
any reference to libxenctrl.so.
Thus it is currently not possible to configure and build
a *single* kexec that will work in *both* Xen and non-Xen
environments, unless libxenctrl.so is *always* present.
For example, a kexec configured for Xen in a Xen environment:
# ldd build/sbin/kexec
linux-vdso.so.1 => (0x00007ffdeba5c000)
libxenctrl.so.4.4 => /usr/lib64/libxenctrl.so.4.4 (0x00000038d8000000)
libz.so.1 => /lib64/libz.so.1 (0x00000038d6c00000)
libc.so.6 => /lib64/libc.so.6 (0x00000038d6000000)
libdl.so.2 => /lib64/libdl.so.2 (0x00000038d6400000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00000038d6800000)
/lib64/ld-linux-x86-64.so.2 (0x000055e9f8c6c000)
# build/sbin/kexec -v
kexec-tools 2.0.16
However, the *same* kexec executable fails in a non-Xen environment:
# copy xen kexec to .
# ldd ./kexec
linux-vdso.so.1 => (0x00007fffa9da7000)
libxenctrl.so.4.4 => not found
liblzma.so.0 => /usr/lib64/liblzma.so.0 (0x0000003014e00000)
libz.so.1 => /lib64/libz.so.1 (0x000000300ea00000)
libc.so.6 => /lib64/libc.so.6 (0x000000300de00000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x000000300e200000)
/lib64/ld-linux-x86-64.so.2 (0x0000558cc786c000)
# ./kexec -v
./kexec: error while loading shared libraries:
libxenctrl.so.4.4: cannot open shared object file: No such file or directory
At Oracle we work around this by having two kexec-tools packages,
one for Xen and another for non-Xen environments. The desire is
to offer a single kexec-tools package that works in either
environment. To achieve this, kexec-tools would either have to ship
with libxenctrl.so (which we have deemed unacceptable), or we can
make kexec perform run-time linking against libxenctrl.so.
This patch is one possible way to remove the explicit run-time
dependency on libxenctrl.so. This implementation uses a set of
macros to wrap calls into libxenctrl.so so that the library is
instead opened with dlopen(), the function is obtained via dlsym(),
and the call is then made. The advantage of this implementation is
that it requires few changes to the existing kexec-tools code. The
disadvantage is that it uses macros to remap libxenctrl functions
and does work under the hood.
Another possible implementation worth considering is the approach
taken by libvmi. Reference the following file:
https://github.com/libvmi/libvmi/blob/master/libvmi/driver/xen/libxc_wrapper.h
The libxc_wrapper_t structure definition that starts at line ~33
has members that are function pointers into libxenctrl.so. This
structure is populated once and then later referenced/dereferenced
by the callers of libxenctrl.so members. The advantage of this
implementation is it is more explicit in managing the use of
libxenctrl.so and its versions, but the disadvantage is it would
require touching more of the kexec-tools code.
The following is a list of libxenctrl members utilized by kexec:
Functions:
xc_interface_open
xc_kexec_get_range
xc_interface_close
xc_kexec_get_range
xc_interface_open
xc_get_max_cpus
xc_kexec_get_range
xc_version
xc_kexec_exec
xc_kexec_status
xc_kexec_unload
xc_hypercall_buffer_array_create
xc__hypercall_buffer_array_alloc
xc_hypercall_buffer_array_destroy
xc_kexec_load
xc_get_machine_memory_map
Data:
xc__hypercall_buffer_HYPERCALL_BUFFER_NULL
These were identified by configuring and building kexec-tools
with Xen support, but omitting the -lxenctrl from the LDFLAGS
in the Makefile for an x86_64 build.
The above libxenctrl members were referenced via these source
files.
kexec/crashdump-xen.c
kexec/kexec-xen.c
kexec/arch/i386/kexec-x86-common.c
kexec/arch/i386/crashdump-x86.c
This patch provides a wrapper around the calls to the above
functions in libxenctrl.so. Every libxenctrl call must pass a
xc_interface which it obtains from xc_interface_open().
So the existing code is already structured in a manner that
facilitates graceful dlopen()'ing of the libxenctrl.so and
the subsequent dlsym() of the required member.
The patch creates a wrapper function around xc_interface_open()
and xc_interface_close() to perform the dlopen() and dlclose().
For the remaining xc_ functions, this patch defines a macro
of the same name which performs the dlsym() and then invokes
the function. See the __xc_call() macro for details.
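As a rough illustration of the dlopen()/dlsym() pattern described above,
the sketch below resolves a symbol at call time and falls back to a
default when the library or symbol is unavailable. It is not the
__xc_call() macro from this patch: all names here (dl_wrapper_open,
dl_wrapper_close, dl_call_int) are hypothetical, and the fixed
int-to-int signature is an assumption made to keep the example small.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Handle for the lazily loaded library; NULL until dl_wrapper_open() succeeds. */
static void *dl_handle;

/* Hypothetical stand-in for the xc_interface_open() wrapper: load the library.
 * Passing NULL as soname asks dlopen() for the main program's own handle. */
static int dl_wrapper_open(const char *soname)
{
    if (!dl_handle)
        dl_handle = dlopen(soname, RTLD_NOW);
    return dl_handle ? 0 : -1;
}

/* Hypothetical stand-in for the xc_interface_close() wrapper. */
static void dl_wrapper_close(void)
{
    if (dl_handle) {
        dlclose(dl_handle);
        dl_handle = NULL;
    }
}

typedef int (*int_unary_fn)(int);

/* Resolve `symbol` at call time and invoke it. Returns `fallback` if the
 * library is not loaded or the symbol cannot be found. */
static int dl_call_int(const char *symbol, int arg, int fallback)
{
    int_unary_fn fn = NULL;

    if (!dl_handle)
        return fallback;
    /* Conversion idiom for turning dlsym()'s void * into a function pointer. */
    *(void **)(&fn) = dlsym(dl_handle, symbol);
    return fn ? fn(arg) : fallback;
}
```

On older glibc versions this needs to be linked with -ldl, which matches
the libdl.so dependency discussed in this message.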
There was one data item in libxenctrl.so that presented a
unique problem, HYPERCALL_BUFFER_NULL. It was only utilized
once, as
set_xen_guest_handle(xen_segs[s].buf.h, HYPERCALL_BUFFER_NULL);
I tried a variety of techniques but could not find a general
macro-type solution without modifying xenctrl.h. So the
solution was to declare a local HYPERCALL_BUFFER_NULL, and
this appears to work. I am not familiar enough with libxenctrl
to say whether this is a satisfactory workaround, so feedback
here is welcome. I can state that this allows kexec to load/unload/kexec
on the Xen and non-Xen environments that I've tested without issue.
With this patch applied, kexec-tools can be built with Xen
support and yet there is no explicit run-time dependency on
libxenctrl.so. Thus it can also be deployed in non-Xen
environments where libxenctrl.so is not installed.
# ldd build/sbin/kexec
linux-vdso.so.1 => (0x00007fff7dbcd000)
liblzma.so.0 => /usr/lib64/liblzma.so.0 (0x00000038d9000000)
libz.so.1 => /lib64/libz.so.1 (0x00000038d6c00000)
libdl.so.2 => /lib64/libdl.so.2 (0x00000038d6400000)
libc.so.6 => /lib64/libc.so.6 (0x00000038d6000000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00000038d6800000)
/lib64/ld-linux-x86-64.so.2 (0x0000562dc0c14000)
# build/sbin/kexec -v
kexec-tools 2.0.16
This feature/ability is enabled with the following:
./configure --with-xen=dl
The previous --with-xen=no and --with-xen=yes still work as before.
Not specifying a --with-xen still defaults to --with-xen=yes.
As I've introduced a new build and run-time mode, I've done an
extensive matrix of both build-time and run-time checks of kexec
with this patch applied. The set of build-time scenarios are:
1: configure --with-xen=no and Xen support NOT present
2: configure --with-xen=no and Xen support IS present
3: configure --with-xen=yes and Xen support NOT present
4: configure --with-xen=yes and Xen support IS present
5: configure --with-xen=dl and Xen support NOT present
6: configure --with-xen=dl and Xen support IS present
Xen support present requires that configure can find both
xenctrl.h and libxenctrl.so.
Then for each of the six scenarios above, the corresponding kexec
binary was tested on a Xen system (Oracle's OVS dom0) and a
non-Xen system (Oracle Linux).
There are two build-time checks: did kexec build, and did
it contain libxenctrl.so? The presence of libxenctrl.so
in kexec was checked via ldd. The results were:
Scenario | Build | libxenctrl.so | Result
1 | pass | no | pass - see Note 1
2 | pass | no | pass - see Note 1
3 | pass | no | pass - see Note 2
4 | pass | yes | pass - see Note 3
5 | pass | no | pass - see Note 2
6 | pass | no | pass - see Note 4
Note 1: This passes because, due to --with-xen=no, there is
no Xen support in kexec and therefore no libxenctrl.so
in the kexec binary.
Note 2: This passes since while --with-xen=yes, the configure
displays a message indicating that Xen support is disabled,
and allows kexec to build (this is the same behavior as prior
to this patch). And since Xen support is disabled, there is
no libxenctrl.so in the kexec.
Note 3: This passes since with --with-xen=yes and configure
locating the xenctrl.h and libxenctrl.so, support for Xen was
built into kexec. Ldd shows an explicit dependency on the library.
Note 4: This passes since with --with-xen=dl and configure
locating the xenctrl.h and libxenctrl.so, support for Xen
was built into kexec. However, this uses the new technique
introduced by this patch and, as a result, ldd shows that
libxenctrl.so is not an explicit run-time dependency for kexec
(rather libdl.so is now an explicit dependency). This is
precisely the goal of this patch!
The net effect is that there are now three "flavors" of a kexec
binary (prior to this patch there were two): a) kexec with no
support for Xen [scenarios 1, 2, 3, 5], b) kexec with support
for Xen and libxenctrl.so as an explicit dependency [scenario 4],
and c) kexec with support for Xen and libxenctrl.so is NOT an
explicit dependency [scenario 6].
The run-time checks are to take each of the six scenarios above
and run the corresponding kexec binary on both a Xen system and
a non-Xen system. The test for each kexec scenario was:
% service kdump stop
% vi /etc/init.d/kdump
change KEXEC= to /sbin/kexec-[123456]
% service kdump start
# If not FAILED, then below
% service kdump status
Kdump is operational
% rm -fr /var/crash/*
% echo c > /proc/sysrq-trigger
# after reboot verify vmcore generated
% ls -al /var/crash/<tab>
The results were:
Scenario | Xen environment | non-Xen environment
1 | fail - see Note 5 | pass
2 | fail - see Note 5 | pass
3 | fail - see Note 6 | pass
4 | pass | fail - see Note 7
5 | fail - see Note 6 | pass
6 | pass | pass
Note 5: Due to --with-xen=no, kexec lacks support for Xen and will
fail in the Xen environment. This behavior is the same as prior
to this patch.
Note 6: Due to the missing xenctrl.h and libxenctrl.so, kexec was
built without support for Xen, and thus will fail in the Xen
environment. This behavior is the same as prior to this patch.
Note 7: This kexec has the explicit dependency on libxenctrl.so
which prevents it from running in a non-Xen environment. This is
expected, as this is the original issue that this patch is
intended to address.
Note that for scenarios 1, 2, 3 and 5 kexec lacks support for Xen,
thus these versions are expected to "fail" in a Xen environment.
On the flip side, since a non-Xen environment does not need
libxenctrl.so, all but scenario 4 are expected to "pass" in a
non-Xen environment. The results match these expectations!
Importantly, applying this patch had no adverse impact on
kexec build or run-time behavior.
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
get_kernel_stext_sym() has been defined for both arm and i386. Other
architectures might need some other kernel symbol address. Therefore
rewrite this function as a generic function to get any kernel symbol
address.
Moreover, kallsyms is not an arch-specific representation, therefore have
a common function for all arches.
Signed-off-by: Pratyush Anand <panand@redhat.com>
[created symbols.c]
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Tested-by: David Woodhouse <dwmw@amazon.co.uk>
Tested-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
We hit a problem on one SGI 64TB machine: the current kexec-tools
failed to work because the number of ranges allowed (MAX_MEMORY_RANGES)
is defined as 1024, which is less than the number of ranges on the machine.
The kcore header size is insufficient for the same reason.
To solve this, this patch simply doubles "MAX_MEMORY_RANGES" and
"KCORE_ELF_HEADERS_SIZE".
Signed-off-by: Xunlei Pang <xlpang@redhat.com>
Tested-by: Frank Ramsay <frank.ramsay@hpe.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Fedora koji uses gcc version 7.0.1-0.12.fc27, which reports a build
warning that is treated as an error:
kexec/arch/i386/kexec-elf-x86.c:299:3: error: format not a string
literal and no format arguments [-Werror=format-security]
die(error_msg);
^~~
cc1: some warnings being treated as errors
error_msg may itself contain a format specifier. In that case, since
no further arguments are passed for the format, the code would try
to access a non-existent argument. Therefore, pass "%s" as the format
string and error_msg as the string to be printed.
While doing that, also add the const qualifier to "char *error_msg".
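The pattern behind this fix can be sketched as follows. The helper names
(report, report_message) are hypothetical, and the helper formats into a
buffer rather than exiting like die() does, purely so the result can be
inspected.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical die()-like helper that formats into a buffer instead of exiting. */
static int report(char *buf, size_t len, const char *fmt, ...)
{
    va_list args;
    int n;

    va_start(args, fmt);
    n = vsnprintf(buf, len, fmt, args);
    va_end(args);
    return n;
}

/* Safe pattern: the message is passed as data, never as the format string,
 * so any '%' sequences inside error_msg are printed literally. */
static int report_message(char *buf, size_t len, const char *error_msg)
{
    return report(buf, len, "%s", error_msg);
}
```

Calling report(buf, len, error_msg) directly is what
-Werror=format-security rejects, because a '%' in error_msg would be
interpreted as a conversion with no matching argument.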
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
MUSL doesn't support %L except for floating-point arguments; therefore,
%ll must be used instead with integer arguments.
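A small sketch of the portable form, assuming a hypothetical helper
format_addr() that prints a 64-bit address. The "ll" length modifier is
the standard C one for long long integer arguments, while "L" is only
defined for long double.

```c
#include <stdio.h>

/* Format a 64-bit address using the standard "ll" length modifier. */
static int format_addr(char *buf, size_t len, unsigned long long addr)
{
    return snprintf(buf, len, "0x%016llx", addr);
}
```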
Signed-off-by: Philip Prindeville <philipp@redfish-solutions.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
/proc/kallsyms
The kernel symbol page_offset_base may be unavailable when the mm KASLR
code is not compiled into the kernel. It is inappropriate to print an
error message when the search for page_offset_base in /proc/kallsyms
fails. There currently seems to be no way to find out whether mm KASLR
is compiled in. An alternative approach is to print only a debug message
in get_kernel_sym when an expected kernel symbol cannot be found.
Do that in this patch, as a simple fix.
Signed-off-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Pratyush Anand <panand@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
I got the error below while testing kexec -p:
"Can't find kernel text map area from kcore"
In this case the pt_load start address was the same as stext_sym. The
check should really be saddr <= stext_sym so that the pt_load area that
includes stext_sym is matched.
This was not reported previously because the code falls back to using
the hardcoded X86_64__START_KERNEL_map to match the pt_load areas again
later, and that sometimes succeeds because of kernel address
randomization.
With this change, according to my tests, the stext_sym check is guaranteed
to fall into the right pt_load area if we get the correct stext_sym.
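The boundary condition can be illustrated with a minimal sketch. The
struct and function names below are hypothetical and simplified, not the
kexec-tools code; the point is only that the start bound is inclusive.

```c
#include <stddef.h>

/* Hypothetical, simplified view of a PT_LOAD area read from kcore. */
struct load_area {
    unsigned long long vaddr;  /* start virtual address */
    unsigned long long size;   /* length in bytes */
};

/* Return the index of the area containing sym, or -1 if none does.
 * The start bound is inclusive so that sym == vaddr still matches. */
static int find_area(const struct load_area *areas, size_t n,
                     unsigned long long sym)
{
    size_t i;

    for (i = 0; i < n; i++) {
        unsigned long long saddr = areas[i].vaddr;
        unsigned long long eaddr = saddr + areas[i].size;

        if (saddr <= sym && sym < eaddr)
            return (int)i;
    }
    return -1;
}
```

With a strict saddr < sym comparison, a symbol located exactly at the
start of an area would not match any area.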
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Seems that Xen actually checks for some zones to be 'reserved' and
complains if they are not.
This also matches what the bios uses at boot.
Signed-off-by: Sylvain Munaut <s.munaut@whatever-company.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Signed-off-by: Sylvain Munaut <s.munaut@whatever-company.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Multiple changes were made to KASLR (currently in linux-next). One of
them randomizes the virtual addresses of the physical mapping, vmalloc
and vmemmap memory sections. This breaks kdump's ability to read physical
memory.
This change detects whether KASLR memory randomization is in use by
checking if the page_offset_base variable exists. It searches for the
correct PAGE_OFFSET value by looking at the loaded memory sections and
finding the lowest one aligned on PUD (the randomization level).
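One way to read the search described above is sketched below: mask each
loaded section's virtual address down to a PUD boundary and keep the
lowest result. This is an interpretation, not the patch itself; the
function name is hypothetical and the 1 GiB PUD size (PUD_SHIFT of 30)
is an assumption for x86_64 with 4-level paging.

```c
#include <stddef.h>

#define PUD_SHIFT 30                              /* assumed: 1 GiB PUD on x86_64 */
#define PUD_MASK  (~((1ULL << PUD_SHIFT) - 1))

/* Return the lowest PUD-aligned address among the given section start
 * addresses, or 0 if the list is empty. */
static unsigned long long lowest_pud_aligned(const unsigned long long *vaddrs,
                                             size_t n)
{
    unsigned long long lowest = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        unsigned long long aligned = vaddrs[i] & PUD_MASK;

        if (lowest == 0 || aligned < lowest)
            lowest = aligned;
    }
    return lowest;
}
```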
Related commits on linux-next:
- 0483e1fa6e09d4948272680f691dccb1edb9677f: Base for randomization
- 021182e52fe01c1f7b126f97fd6ba048dc4234fd: Enable for PAGE_OFFSET
Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
On PowerPC64 ABIv2 we need to look at the symbol to determine
if it has a local entry point. Pass struct mem_sym into
machine_apply_elf_rel() so we can.
Signed-off-by: Anton Blanchard <anton@samba.org>
Tested-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
This reverts commit 8a1aa35a1077b42bc2a2afb05d24b637e1edf2a1.
|
|
Crash kernel region size is available via sysfs on Linux running on
bare metal. However, this does not work when Linux runs as Xen dom0.
In this case Xen crash kernel region size should be established using
__HYPERVISOR_kexec_op hypercall (Linux kernel kexec functionality does
not make a lot of sense in Xen dom0). Sadly hypercalls are not easily
accessible using shell scripts or something like that. Potentially we
can check "xl dmesg" output for crashkernel option but this is not nice.
So, let's add this functionality, for Linux running on bare metal and
as Xen dom0, to kexec-tools. This way kdump scripts may establish crash
kernel region size in one way regardless of platform. All burden of
platform detection lies on kexec-tools.
The figure (and unit) displayed by this new kexec-tools functionality is
the same as the one taken from /sys/kernel/kexec_crash_size.
This functionality is available on the x86 platform only. If the idea is
acceptable, I can prepare patches for other platforms (if that is possible
and makes sense) and repost them as a fully fledged patch series.
Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
It appears that (older?) revisions of xenctrl.h define
all of the E820_* values used in kexec-x86-common.c except
E820_PRAM and E820_PMEM. This results in a build failure when
building against libxenctrl.
Avoid this problem by providing local definitions of those values.
It seems reasonable to do so in the kexec-x86-common.c as
currently that is the only source file that uses the values in question.
Fixes: 56a12abc1df1 ("kexec: fix mmap return code handling")
Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Cc: Petr Tesarik <ptesarik@suse.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
kexec-tools build fails on my laptop with RHEL7.1 installed:
gcc -g -O2 -fno-strict-aliasing -Wall -Wstrict-prototypes -I./include -I./util_lib/include -Iinclude/ -I./kexec/arch/x86_64/include -c -MD -o kexec/arch/i386/kexec-x86-common.o kexec/arch/i386/kexec-x86-common.c
In file included from kexec/arch/i386/kexec-x86-common.c:36:0:
kexec/arch/i386/../../kexec.h:19:2: error: #error BYTE_ORDER not defined
#error BYTE_ORDER not defined
^
kexec/arch/i386/../../kexec.h:23:2: error: #error LITTLE_ENDIAN not defined
#error LITTLE_ENDIAN not defined
^
kexec/arch/i386/../../kexec.h:27:2: error: #error BIG_ENDIAN not defined
#error BIG_ENDIAN not defined
^
In file included from kexec/arch/i386/kexec-x86-common.c:37:0:
kexec/arch/i386/../../kexec-syscall.h: In function ‘kexec_load’:
kexec/arch/i386/../../kexec-syscall.h:74:2: warning: implicit declaration of function ‘syscall’ [-Wimplicit-function-declaration]
return (long) syscall(__NR_kexec_load, entry, nr_segments, segments, flags);
^
make: *** [kexec/arch/i386/kexec-x86-common.o] Error 1
The build error was introduced by below commit:
commit c9c21cc107dcc9b6053e39ead1069e03717513f9
Author: Baoquan He <bhe@redhat.com>
Date: Thu Aug 6 19:10:55 2015 +0800
kexec: use _DEFAULT_SOURCE instead to remove compiling warning
Now compiling will print warning like below. Change code as it suggested.
# warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE"
^
See manpage: http://man7.org/linux/man-pages/man7/feature_test_macros.7.html
_BSD_SOURCE has been deprecated since glibc 2.20. To allow code that requires
_BSD_SOURCE in glibc 2.19 and earlier and _DEFAULT_SOURCE in glibc 2.20 and
later to compile without warnings, define both _BSD_SOURCE and _DEFAULT_SOURCE.
Thus fix it by adding back _BSD_SOURCE along with _DEFAULT_SOURCE.
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
There may be more than one crash kernel region on x86. Currently,
kexec-tools picks the largest one. If the high reservation is smaller
than the low one, it will try to load the panic kernel low. However, the kexec
syscall checks that target address is within crashk_res boundaries,
so attempts to load crash kernel low result in -EADDRNOTAVAIL, and
kexec prints out this error message:
kexec_load failed: Cannot assign requested address
Looking at the logic in arch/x86/kernel/setup.c, there are only two
possible layouts:
1. crashk_res is below 4G, and there is only one region,
2. crashk_res is above 4G, and crashk_low_res is below 4G
In either case, kexec-tools must pick the highest region.
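The selection rule stated above, picking the highest region rather than
the largest, could look roughly like this. The struct and function names
are hypothetical and simplified; v3 of the patch names the real function
get_crash_kernel_load_range, whose signature is not shown here.

```c
#include <stddef.h>

/* Hypothetical, simplified crash kernel region (end is inclusive). */
struct crash_region {
    unsigned long long start;
    unsigned long long end;
};

/* Store the region with the highest start address in *out.
 * Returns 0 on success, -1 if there are no regions. */
static int pick_highest_region(const struct crash_region *regions, size_t n,
                               struct crash_region *out)
{
    size_t i;

    if (n == 0)
        return -1;
    *out = regions[0];
    for (i = 1; i < n; i++) {
        if (regions[i].start > out->start)
            *out = regions[i];
    }
    return 0;
}
```

In layout 2, this picks the region above 4G even when the low
reservation happens to be larger.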
Changelog:
* v3: rename function to get_crash_kernel_load_range
* v2: remove unnecessary local variables
Signed-off-by: Petr Tesarik <ptesarik@suse.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
The kernel adds the E820_PRAM or E820_PMEM type for NVDIMM memory devices.
Support them in kexec too.
Reported-by: Toshi Kani <toshi.kani@hp.com>
Tested-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Compiling now prints a warning like the one below. Change the code as it suggests.
# warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE"
^
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
gcc 4.9.1 tells me this variable is set but unused
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Commit 4362bfac changes params for kexec_iomem_for_each_line from
'unsigned long' to 'unsigned long long'.
This patch fixes forgotten changes for sh and x86 archs.
Bug causes incorrect parsing of memory ranges.
Signed-off-by: Roman Pen <r.peniaev@gmail.com>
Cc: kexec@lists.infradead.org
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
kexec/arch/i386/kexec-elf-x86.c:97:6: warning: variable
‘modified_cmdline_len’ set but not used [-Wunused-but-set-variable]
int modified_cmdline_len;
Signed-off-by: Ameya Palande <2ameya@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
kexec/arch/i386/kexec-bzImage.c:111:8: warning: variable
‘kernel_version’ set but not used [-Wunused-but-set-variable]
Signed-off-by: Ameya Palande <2ameya@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
If "--command-line" option is not specified, then kexec segfaults while
dereferencing NULL command line string pointer. While we are at it, also
fix indentation and use '{' and '}' consistently.
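The kind of guard that avoids this crash can be sketched as below. The
helper name is hypothetical; the idea is simply to treat a missing
command line as length zero instead of passing NULL to strlen().

```c
#include <string.h>

/* Number of bytes to copy, including the terminating NUL, or 0 when no
 * command line was given (NULL pointer). */
static size_t command_line_length(const char *command_line)
{
    return command_line ? strlen(command_line) + 1 : 0;
}
```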
Signed-off-by: Ameya Palande <2ameya@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
If the kernel does not export efi runtime maps, it means the 1:1 mapping
does not work or the user explicitly booted with efi=old_map. In this case
the efi setup code falls back to noefi boot, but for the kdump case we
still need to pass the extra acpi_rsdp cmdline.
Thus add a check in the kdump path.
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
There's one extra '-' in arch_usage, thus
s/pass--memmap-cmdline/pass-memmap-cmdline
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Kernels booted with efi=old_map, and some quirked machines like SGI UV,
use the old ioremap instead of the 1:1 mapping. But kexec efi support
depends on the 1:1 mapping, thus we need to switch to the old way.
There's a kernel patch for exporting the efi flags so we can check the memory
mapping method. But the user may want to explicitly disable efi boot for
unknown reasons.
So add a new arch option '--noefi' for this case.
Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
In kdump path, now we store all the 2nd kernel memory ranges in
memmap_p. We could use just cmdline_add_memmap() to add all types of
memory ranges to 2nd kernel cmdline.
So clean up here, merge cmdline_add_memmap_acpi() into
cmdline_add_memmap().
Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
I accidentally added a duplicate line. Remove it.
Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
In commit 91f5b9c ("kdump: pass e820 reserved region to 2nd kernel via
e820 table or setup_data"), I made a wrong condition check.
We should only add cmdline for a memory range if --pass-memmap-cmdline
and the range type isn't RANGE_RESERVED.
Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
e820 reserved region could be useful in 2nd kernel.
For example, PCI mmconf (extended mode) requires reserved region
otherwise it falls back to legacy mode. The following log is from Cliff
Wickman <cpw@sgi.com>:
PCI: MMCONFIG for domain 1003 [bus 3f-3f] at [mem 0xff0ff00000-0xff0fffffff] (base 0xff0c000000)
[Firmware Bug]: PCI: MMCONFIG at [mem 0x80000000-0x80cfffff] not reserved in ACPI motherboard resources
PCI: not using MMCONFIG
PCI devices on segment 1 (>0) can't fall back to legacy mode, thus
kernel probing fails and device can't be found.
We did not pass reserved regions before because there could be too many
of them, eating up our very limited kernel command line space in the
memmap=exactmap case.
However now we use e820 map and setup_data to pass memory map to 2nd
kernel and the number of reserved regions should not be a problem any
more.
Signed-off-by: WANG Chao <chaowang@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
The command line size is restricted by the kernel, and sometimes
memmap=exactmap has too many memory ranges to pass on the cmdline. Also,
memmap=exactmap and kASLR don't work together.
A better approach, to pass the memory ranges for crash kernel to boot
into, is filling the memory ranges into E820.
boot_params has only 128 slots for the E820 map. When the number of
memory map entries exceeds 128, use setup_data to pass the rest as an
extended E820 memory map.
kexec boot could also benefit from setup_data in case E820 memory map
exceeds 128.
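The split between the fixed E820 table and the setup_data extension can
be sketched as a simple count calculation. The names below are
hypothetical; only the limit of 128 slots comes from the text above.

```c
#include <stddef.h>

#define E820_TABLE_SLOTS 128   /* slots available in boot_params, per the text */

/* How many entries go where (hypothetical helper type). */
struct e820_split {
    size_t in_table;     /* entries placed in the fixed E820 table */
    size_t in_extended;  /* entries passed through setup_data */
};

/* Fill the fixed table first; anything beyond it goes to the extension. */
static struct e820_split split_e820(size_t nr_ranges)
{
    struct e820_split s;

    s.in_table = nr_ranges < E820_TABLE_SLOTS ? nr_ranges : E820_TABLE_SLOTS;
    s.in_extended = nr_ranges - s.in_table;
    return s;
}
```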
Now this new approach becomes default instead of memmap=exactmap.
saved_max_pfn users can specify --pass-memmap-cmdline to use the
exactmap approach.
Signed-off-by: WANG Chao <chaowang@redhat.com>
Tested-by: Linn Crosetto <linn@hp.com>
Reviewed-by: Linn Crosetto <linn@hp.com>
Acked-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
--pass-memmap-cmdline is used to pass the memmap=exactmap cmdline to the
2nd kernel. Later we will use this option to disable the E820 memmap
passing method and use the old exactmap method instead.
Signed-off-by: WANG Chao <chaowang@redhat.com>
Tested-by: Linn Crosetto <linn@hp.com>
Acked-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Later kexec and kdump memory range will be mapped to E820entry. But
currently kexec memory range .end field is exclusive while crash memory
range is inclusive.
Given the fact that the exported proc iomem and sysfs memmap are both
inclusive, change kexec memory range .end to be inclusive. Later the
unified memory range of both kexec and kdump can use the same E820
filling code.
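The difference between the two conventions comes down to an off-by-one
in the end field and in size calculations. A minimal sketch, using
hypothetical names rather than the kexec-tools structures:

```c
/* Hypothetical, simplified memory range whose end is inclusive. */
struct mem_range_incl {
    unsigned long long start;
    unsigned long long end;   /* last valid byte address */
};

/* Convert a [start, exclusive_end) range to the inclusive convention. */
static struct mem_range_incl from_exclusive(unsigned long long start,
                                            unsigned long long exclusive_end)
{
    struct mem_range_incl r;

    r.start = start;
    r.end = exclusive_end - 1;
    return r;
}

/* With an inclusive end, the size needs the extra + 1. */
static unsigned long long range_size(const struct mem_range_incl *r)
{
    return r->end - r->start + 1;
}
```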
Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Tested-by: Linn Crosetto <linn@hp.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Add two new members to the kexec_info structure:
struct memory_range *crash_range;
int nr_crash_ranges;
crash_range contains the memory ranges used to boot 2nd kernel.
nr_crash_ranges contains the count of the crash memory ranges.
Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Tested-by: Linn Crosetto <linn@hp.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Tested-by: Linn Crosetto <linn@hp.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
CRASH_MAX_MEMMAP_NR is used as the upper boundary of memmap_p.
Originally memmap_p was used to store RANGE_RAM only. But now we changed
it to store all the types of memory ranges for 2nd kernel, which
includes RANGE_RAM, RANGE_ACPI, RANGE_ACPI_NVS (and RANGE_RESERVED in
the future).
Currently CRASH_MAX_MEMMAP_NR is defined (KEXEC_MAX_SEGMENTS + 2), which
is not enough for memmap_p. It must be increased to a much higher value.
I think 1024 is good enough for storing all memory ranges for 2nd
kernel. So this patch increases CRASH_MAX_MEMMAP_NR to 1024.
Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Tested-by: Linn Crosetto <linn@hp.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
In load_crashdump_segments(), memmap_p[] is used to hold the RANGE_RAM
memory ranges for booting the 2nd kernel. Now also add the RANGE_ACPI and
RANGE_ACPI_NVS types to memmap_p, so that later we can pass all types of
memory range to the 2nd kernel. All of these memory ranges are stored in
memmap_p for later reference.
Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Tested-by: Linn Crosetto <linn@hp.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
add_memmap() will also add memory range with type RANGE_ACPI and
RANGE_ACPI_NVS (RANGE_RESERVED in the future) besides RANGE_RAM to
memmap_p.
Among these types of memory range, only RANGE_RAM needs to
be aligned with a certain alignment. RANGE_ACPI, RANGE_ACPI_NVS and
RANGE_RESERVED don't have to be aligned.
Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Tested-by: Linn Crosetto <linn@hp.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
This change will be used later:
add_memmap(.., int *nr_memmap, .., int type);
delete_memmap(.., int *nr_memmap, ..);
memmap_p[] is statically allocated for a certain amount. It will be used
later when mapping these memory maps to e820 map.
It's convenient to keep track of the count of memmap_p (nr_memmap) in
add_memmap and delete_memmap, because the counting has already been
taken care of in these two functions.
The original add_memmap() can only add memory range of RANGE_RAM type.
For adding other types of memory range, add another argument for
indicating the type.
Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Dave Young <dyoung@redhat.com>
Tested-by: Linn Crosetto <linn@hp.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
add_setup_data() is used to add an instance to the single linked list
of setup_data structure.
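A simplified sketch of adding a node to a singly linked list is shown
below. It is an illustration only: the node type is hypothetical, a
plain pointer stands in for the address-based link used by the kernel's
setup_data, and inserting at the head is an assumption, since the text
does not say where the new instance is placed.

```c
#include <stddef.h>

/* Hypothetical, simplified list node standing in for setup_data. */
struct setup_node {
    struct setup_node *next;
    unsigned int type;
    unsigned int len;
};

/* Insert node at the head of the list rooted at *head (assumed placement). */
static void add_setup_node(struct setup_node **head, struct setup_node *node)
{
    node->next = *head;
    *head = node;
}
```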
Signed-off-by: WANG Chao <chaowang@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Signed-off-by: WANG Chao <chaowang@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
When kASLR is enabled (CONFIG_RANDOMIZE_BASE=y), the kernel text mapping
base is randomized. The max base offset of such randomization is
configured at compile time through CONFIG_RANDOMIZE_BASE_MAX_OFFSET (by
default 1G).
Currently kexec-tools uses the hard-coded macros X86_64__START_KERNEL_map
(0xffffffff80000000) and X86_64_KERNEL_TEXT_SIZE (512M) to determine the
kernel text mapping from kcore's PT_LOAD. With kASLR, the mapping
changes to the following:
ffffffff80000000 - (ffffffff80000000+CONFIG_RANDOMIZE_BASE_MAX_OFFSET)
As Vivek suggested, we can get _stext kernel symbol address from
/proc/kallsyms, and search for kcore's PT_LOAD which contains _stext,
and we can say that this area represents the kernel mapping area.
Let's first use this way to find the kernel text mapping. If that fails
for whatever reason, fall back to the old way.
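Looking up a symbol such as _stext in /proc/kallsyms amounts to parsing
lines of the form "<hex address> <type> <name>". The sketch below parses
a single line and is only an illustration: the function name is
hypothetical, and returning 0 for "not found" is an assumption that lets
the caller fall back to the old method.

```c
#include <stdio.h>
#include <string.h>

/* Parse one /proc/kallsyms-style line ("<hex addr> <type> <name>").
 * Returns the address if the name matches `wanted`, otherwise 0. */
static unsigned long long sym_addr_from_line(const char *line,
                                             const char *wanted)
{
    unsigned long long addr;
    char type;
    char name[128];

    if (sscanf(line, "%llx %c %127s", &addr, &type, name) != 3)
        return 0;
    return strcmp(name, wanted) == 0 ? addr : 0;
}
```

A caller would apply this to each line read from /proc/kallsyms and stop
at the first non-zero result.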
Suggested-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: WANG Chao <chaowang@redhat.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
On non-EFI systems, efi_info section of boot_params is zero filled resulting
in an erroneous message from kexec regarding "efi memory descriptor" version.
Caused by commit: e1ffc9e9a0769e1f54185003102e9bec428b84e8 "Passing efi related
data via setup_data"
0000700 0000 0000 0000 0000 0000 0000 0000 0000
0000720 0000 0000 0000 0000 0000 0000 0000 0000
0000740
efi memory descriptor version 0 is not supported!
Signed-off-by: Tony Jones <tonyj@suse.de>
Acked-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
When using crashkernel=xM,high, crashkernel memory is allocated from the top down.
Thus the usable memory for the kdump kernel can be above 4G. The efi memmap value
is split into two 32-bit values, efi_memmap and efi_memmap_hi. Previously I only passed
efi_memmap, so for high memory addresses the kernel panics as shown below:
[ 0.000000] efi: EFI v2.31 by American Megatrends
[ 0.000000] efi: ACPI 2.0=0xdb752000 SMBIOS=0xdbab4b98 ACPI=0xdb752000 MPS=0xf4bd0
[ 0.000000] efi: mem00: type=4294967295, attr=0xffffffffffffffff, range=[0xffffffffffffffff-0xffffffffffffefff) (72057594037927935)
[ 0.000000] efi: mem01: type=4294967295, attr=0xffffffffffffffff, range=[0xffffffffffffffff-0xffffffffffffefff) (72057594037927935)
[ 0.000000] efi: mem02: type=4294967295, attr=0xffffffffffffffff, range=[0xffffffffffffffff-0xffffffffffffefff) (72057594037927935)
[ 0.000000] efi: mem03: type=4294967295, attr=0xffffffffffffffff, range=[0xffffffffffffffff-0xffffffffffffefff) (72057594037927935)
[ 0.000000] efi: mem04: type=4294967295, attr=0xffffffffffffffff, range=[0xffffffffffffffff-0xffffffffffffefff) (72057594037927935)
[ 0.000000] SMBIOS 2.7 present.
[snip]
[ 0.082451] BUG: unable to handle kernel paging request at ffffa3d0f0000000
[ 0.089467] IP: [<ffffffff810513d1>] native_set_pte+0x1/0x10
[ 0.095157] PGD 0
[ 0.097197] Oops: 0002 [#1] SMP
[ 0.100466] Modules linked in:
[ 0.103554] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.0-rc7 #157
[ 0.110001] Hardware name: Hewlett-Packard HP Z420 Workstation/1589, BIOS J61 v03.15 05/09/2013
[ 0.118697] task: ffffffff818e1460 ti: ffffffff818ce000 task.ti: ffffffff818ce000
[ 0.126181] RIP: 0010:[<ffffffff810513d1>] [<ffffffff810513d1>] native_set_pte+0x1/0x10
[ 0.134296] RSP: 0000:ffffffff818cfc80 EFLAGS: 00010287
[ 0.139609] RAX: 0000000000000000 RBX: ffffa3d0f0000000 RCX: 00003ffffffff000
[ 0.146744] RDX: ffff880000000000 RSI: 0000000000000000 RDI: ffffa3d0f0000000
[ 0.153879] RBP: ffffffff818cfcb8 R08: ffffea0010745d20 R09: 0000000000000000
[ 0.161013] R10: ffff88041f731fc0 R11: 000000000000001e R12: 0000000000200000
[ 0.168148] R13: 0000000000000000 R14: 0000000000400000 R15: ffff880000000008
[ 0.175288] FS: 0000000000000000(0000) GS:ffff88041f200000(0000) knlGS:0000000000000000
[ 0.183377] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.189125] CR2: ffffa3d0f0000000 CR3: 000000041e8da000 CR4: 00000000000406b0
[ 0.196264] Stack:
[ 0.198283] ffffffff818cfcb8 ffffffff810561d7 ffff880000000008 0000000000400000
[ 0.205746] ffff880000001000 00000000000001ff ffff88041e8de000 ffffffff818cfd00
[ 0.213210] ffffffff8105644e 0000000000200000 0000000040000000 00000000ffffffff
[ 0.220676] Call Trace:
[ 0.223130] [<ffffffff810561d7>] ? unmap_pte_range+0x77/0x110
[ 0.228966] [<ffffffff8105644e>] unmap_pmd_range+0xde/0x210
[ 0.234630] [<ffffffff81056c6b>] __cpa_process_fault+0x48b/0x5e0
[ 0.240730] [<ffffffff81057276>] __change_page_attr_set_clr+0x4b6/0xb10
[ 0.247437] [<ffffffff810557c7>] ? __ioremap_caller+0x277/0x360
[ 0.253454] [<ffffffff810589f1>] kernel_map_pages_in_pgd+0x71/0xa0
[ 0.259736] [<ffffffff81a53361>] __map_region+0x45/0x63
[ 0.265051] [<ffffffff81a535cc>] efi_map_region_fixed+0xd/0xf
[ 0.270886] [<ffffffff81a52f19>] efi_enter_virtual_mode+0x5a/0x3d9
[ 0.277162] [<ffffffff81a77516>] ? acpi_enable_subsystem+0x37/0x90
[ 0.283440] [<ffffffff81a36eb9>] start_kernel+0x386/0x41c
[ 0.288931] [<ffffffff81a3693c>] ? repair_env_string+0x5c/0x5c
[ 0.294852] [<ffffffff81a36120>] ? early_idt_handlers+0x120/0x120
[ 0.301035] [<ffffffff81a365ee>] x86_64_start_reservations+0x2a/0x2c
[ 0.307479] [<ffffffff81a3672e>] x86_64_start_kernel+0x13e/0x14d
[ 0.313572] Code: 66 2e 0f 1f 84 00 00 00 00 00 48 8b 46 18 55 48 89 e5 48 89 47 04 5d c3 66 90 55 48 89 e5 0f 01 f8 5d c3 0f 1f 8
[ 0.333545] RIP [<ffffffff810513d1>] native_set_pte+0x1/0x10
[ 0.339312] RSP <ffffffff818cfc80>
[ 0.342807] CR2: ffffa3d0f0000000
[ 0.346141] ---[ end trace 86088f739725b8c6 ]---
[ 0.350760] Kernel panic - not syncing: Fatal exception
Fix this by passing both efi_memmap and efi_memmap_hi to the 2nd kernel.
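
To make the truncation concrete, here is a small sketch of splitting a 64-bit
memmap address into the two 32-bit fields and joining them again. Only the
field names efi_memmap and efi_memmap_hi come from this message; struct
efi_memmap_addr and both helpers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the efi_info fields discussed here. */
struct efi_memmap_addr {
	uint32_t efi_memmap;	/* low 32 bits of the memmap address */
	uint32_t efi_memmap_hi;	/* high 32 bits of the memmap address */
};

/* Store a 64-bit address into the two 32-bit fields. */
void split_memmap_addr(uint64_t addr, struct efi_memmap_addr *out)
{
	assert(out);
	out->efi_memmap = (uint32_t)(addr & 0xffffffffu);
	out->efi_memmap_hi = (uint32_t)(addr >> 32);
}

/* Rebuild the 64-bit address from both halves. */
uint64_t join_memmap_addr(const struct efi_memmap_addr *in)
{
	assert(in);
	return ((uint64_t)in->efi_memmap_hi << 32) | in->efi_memmap;
}
```

If efi_memmap_hi is left at zero, an address such as 0x41f5e2000 is read back
as 0x1f5e2000, so the 2nd kernel would parse unrelated memory as the map.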
Reported-by: Linn Crosetto <linn@hp.com>
Signed-off-by: Dave Young <dyoung@redhat.com>
Tested-by: Linn Crosetto <linn@hp.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
dbgprint_mem_range() is used for printing the given memory ranges in
debug mode.
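
A helper of this kind might look like the sketch below. The names
struct mem_range, debug_enabled and print_mem_ranges() are hypothetical, and
the output format is an assumption for illustration, not the format the real
function uses.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical range descriptor: inclusive start/end plus a type tag. */
struct mem_range {
	unsigned long long start, end;
	unsigned type;
};

/* Hypothetical global switch, e.g. set by a debug command-line option. */
int debug_enabled = 0;

/* Print a heading and one line per range, but only in debug mode.
 * Returns the number of ranges printed. */
int print_mem_ranges(FILE *out, const char *prefix,
		     const struct mem_range *mr, int nr)
{
	assert(out && prefix && (mr || nr == 0));
	if (!debug_enabled)
		return 0;
	fprintf(out, "%s\n", prefix);
	for (int i = 0; i < nr; i++)
		fprintf(out, "%016llx-%016llx (%u)\n",
			mr[i].start, mr[i].end, mr[i].type);
	return nr;
}
```

Taking the output stream as a parameter keeps the helper easy to test; a
caller would normally pass stderr.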
Signed-off-by: WANG Chao <chaowang@redhat.com>
Tested-by: Linn Crosetto <linn@hp.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
At least on our systems, xenctrl.h defines struct e820entry without a guard.
Move the guarded definition in include/x86/x86-linux.h below it.
Signed-off-by: Tony Jones <tonyj@suse.de>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
Commit 9c200a85de2245a850546fded96a1977b84ad24d referenced
'bzImage_support_efi_boot' without a matching 32-bit definition.
Signed-off-by: Tony Jones <tonyj@suse.de>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
To support EFI runtime, several EFI physical addresses
(fw_vendor, runtime, config tables, smbios) and the whole runtime
mapping info need to be available in the kexec kernel. Thus introduce
a setup_data struct for passing this data.
Collect the variables from /sys/firmware/efi/systab and
/sys/firmware/efi/runtime-map
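
As a rough illustration of the collection step, systab exposes one KEY=0xADDR
pair per line (for example ACPI20, ACPI, SMBIOS), so a lookup can be sketched
as below. systab_lookup() is a hypothetical name and this is not the code
added by the commit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Look up one "KEY=0x..." entry in systab-formatted text.
 * Returns 1 and stores the address in *val if the key is present,
 * otherwise returns 0 and leaves *val untouched. */
int systab_lookup(FILE *fp, const char *key, unsigned long long *val)
{
	char line[128];
	unsigned long long addr;
	size_t klen;

	assert(fp && key && val);
	klen = strlen(key);
	rewind(fp);
	while (fgets(line, sizeof(line), fp)) {
		/* Require '=' right after the key so "ACPI" does not
		 * match an "ACPI20=" line. */
		if (strncmp(line, key, klen) == 0 && line[klen] == '=' &&
		    sscanf(line + klen + 1, "%llx", &addr) == 1) {
			*val = addr;
			return 1;
		}
	}
	return 0;
}
```

A caller would open /sys/firmware/efi/systab, query the keys it needs, and
treat a missing key as "table not present" rather than as an error.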
Signed-off-by: Dave Young <dyoung@redhat.com>
Tested-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
|
To support EFI runtime in the kexec kernel we need to
fill the efi_info struct in setup_header. The info is taken
from the boot_params data that the kernel exports in debugfs.
Signed-off-by: Dave Young <dyoung@redhat.com>
Tested-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
|