summaryrefslogtreecommitdiff
path: root/include/linux/memory-failure.h
diff options
context:
space:
mode:
authorAnkit Agrawal <ankita@nvidia.com>2025-11-02 18:44:33 +0000
committerAndrew Morton <akpm@linux-foundation.org>2025-11-16 17:28:29 -0800
commit2ec41967189cd65a8f79c760dd1b50c4f56e8ac6 (patch)
tree0bcd90d6c72205f82e27af812fbb674aad012ae0 /include/linux/memory-failure.h
parent30d0a1291046a3641b5d9d547591228ad9c6aae0 (diff)
mm: handle poisoning of pfn without struct pages
Poison (or ECC) errors can be very common on a large size cluster. The kernel MM currently does not handle ECC errors / poison on a memory region that is not backed by struct pages. If a memory region mapped using remap_pfn_range() for example, but not added to the kernel, MM will not have associated struct pages. Add a new mechanism to handle memory failure on such memory. Make kernel MM expose a function to allow modules managing the device memory to register the device memory SPA and the address space associated it. MM maintains this information as an interval tree. On poison, MM can search for the range that the poisoned PFN belong and use the address_space to determine the mapping VMA. In this implementation, kernel MM follows the following sequence that is largely similar to the memory_failure() handler for struct page backed memory: 1. memory_failure() is triggered on reception of a poison error. An absence of struct page is detected and consequently memory_failure_pfn() is executed. 2. memory_failure_pfn() collects the processes mapped to the PFN. 3. memory_failure_pfn() sends SIGBUS to all the processes mapping the faulty PFN using kill_procs(). Note that there is one primary difference versus the handling of the poison on struct pages, which is to skip unmapping to the faulty PFN. This is done to handle the huge PFNMAP support added recently [1] that enables VM_PFNMAP vmas to map at PMD or PUD level. A poison to a PFN mapped in such as way would need breaking the PMD/PUD mapping into PTEs that will get mirrored into the S2. This can greatly increase the cost of table walks and have a major performance impact. Link: https://lore.kernel.org/all/20240826204353.2228736-1-peterx@redhat.com/ [1] Link: https://lkml.kernel.org/r/20251102184434.2406-3-ankita@nvidia.com Signed-off-by: Ankit Agrawal <ankita@nvidia.com> Cc: Aniket Agashe <aniketa@nvidia.com> Cc: Borislav Betkov <bp@alien8.de> Cc: David Hildenbrand <david@redhat.com> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Joanthan Cameron <Jonathan.Cameron@huawei.com> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Kirti Wankhede <kwankhede@nvidia.com> Cc: Len Brown <lenb@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Matthew R. Ochs <mochs@nvidia.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: Neo Jia <cjia@nvidia.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shuai Xue <xueshuai@linux.alibaba.com> Cc: Smita Koralahalli Channabasappa <smita.koralahallichannabasappa@amd.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Tarun Gupta <targupta@nvidia.com> Cc: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Cc: Vikram Sethi <vsethi@nvidia.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Zhi Wang <zhiw@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Diffstat (limited to 'include/linux/memory-failure.h')
-rw-r--r--include/linux/memory-failure.h17
1 files changed, 17 insertions, 0 deletions
diff --git a/include/linux/memory-failure.h b/include/linux/memory-failure.h
new file mode 100644
index 000000000000..bc326503d2d2
--- /dev/null
+++ b/include/linux/memory-failure.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_MEMORY_FAILURE_H
+#define _LINUX_MEMORY_FAILURE_H
+
+#include <linux/interval_tree.h>
+
+struct pfn_address_space;
+
+struct pfn_address_space {
+ struct interval_tree_node node;
+ struct address_space *mapping;
+};
+
+int register_pfn_address_space(struct pfn_address_space *pfn_space);
+void unregister_pfn_address_space(struct pfn_address_space *pfn_space);
+
+#endif /* _LINUX_MEMORY_FAILURE_H */