|author||Kai Huang <firstname.lastname@example.org>||2021-03-25 22:30:57 +1300|
|committer||Borislav Petkov <email@example.com>||2021-03-26 22:51:23 +0100|
x86/sgx: Wipe out EREMOVE from sgx_free_epc_page()
EREMOVE takes a page and removes any association between that page and an enclave. It must be run on a page before it can be added into another enclave. Currently, EREMOVE is run as part of pages being freed into the SGX page allocator. It is not expected to fail, as it would indicate a use-after-free of EPC pages. Rather than add the page back to the pool of available EPC pages, the kernel intentionally leaks the page to avoid additional errors in the future. However, KVM does not track how guest pages are used, which means that SGX virtualization use of EREMOVE might fail. Specifically, it is legitimate that EREMOVE returns SGX_CHILD_PRESENT for EPC assigned to KVM guest, because KVM/kernel doesn't track SECS pages. To allow SGX/KVM to introduce a more permissive EREMOVE helper and to let the SGX virtualization code use the allocator directly, break out the EREMOVE call from the SGX page allocator. Rename the original sgx_free_epc_page() to sgx_encl_free_epc_page(), indicating that it is used to free an EPC page assigned to a host enclave. Replace sgx_free_epc_page() with sgx_encl_free_epc_page() in all call sites so there's no functional change. At the same time, improve the error message when EREMOVE fails, and add documentation to explain to the user what that failure means and to suggest to the user what to do when this bug happens in the case it happens. [ bp: Massage commit message, fix typos and sanitize text, simplify. ] Signed-off-by: Kai Huang <firstname.lastname@example.org> Signed-off-by: Borislav Petkov <email@example.com> Reviewed-by: Jarkko Sakkinen <firstname.lastname@example.org> Link: https://email@example.com
Diffstat (limited to 'Documentation/x86')
1 files changed, 25 insertions, 0 deletions
diff --git a/Documentation/x86/sgx.rst b/Documentation/x86/sgx.rst
index eaee1368b4fd..f90076e67cde 100644
@@ -209,3 +209,28 @@ An application may be loaded into a container enclave which is specially
configured with a library OS and run-time which permits the application to run.
The enclave run-time and library OS work together to execute the application
when a thread enters the enclave.
+Impact of Potential Kernel SGX Bugs
+When EPC page leaks happen, a WARNING like this is shown in dmesg:
+"EREMOVE returned ... and an EPC page was leaked. SGX may become unusable..."
+This is effectively a kernel use-after-free of an EPC page, and due
+to the way SGX works, the bug is detected at freeing. Rather than
+adding the page back to the pool of available EPC pages, the kernel
+intentionally leaks the page to avoid additional errors in the future.
+When this happens, the kernel will likely soon leak more EPC pages, and
+SGX will likely become unusable because the memory available to SGX is
+limited. However, while this may be fatal to SGX, the rest of the kernel
+is unlikely to be impacted and should continue to work.
+As a result, when this happpens, user should stop running any new
+SGX workloads, (or just any new workloads), and migrate all valuable
+workloads. Although a machine reboot can recover all EPC memory, the bug
+should be reported to Linux developers.