What: /sys/bus/edac/devices/<dev-name>/mem_repairX
Date: March 2025
KernelVersion: 6.15
Contact: linux-edac@vger.kernel.org
Description:
The sysfs EDAC bus devices /<dev-name>/mem_repairX subdirectory
controls memory media repair features such as PPR (Post Package
Repair) and memory sparing, where the <dev-name> directory
corresponds to a device registered with the EDAC device driver
for the memory repair features.
Post Package Repair is a maintenance operation that requests the memory
device to perform a repair operation on its media. It is a memory
self-healing feature that fixes a failing memory location by
replacing it with a spare row in a DRAM device. For example, a
CXL memory device with DRAM components that support PPR features may
implement PPR maintenance operations. DRAM components may support
two types of PPR functions: hard PPR, for a permanent row repair, and
soft PPR, for a temporary row repair. Soft PPR may be much faster
than hard PPR, but the repair is lost with a power cycle.
The sysfs attribute nodes for a repair feature are only
present if the parent driver has implemented the corresponding
attr callback function and provided the necessary operations
to the EDAC device driver during registration.
In some states of system configuration (e.g. before address
decoders have been configured), memory devices (e.g. CXL)
may not have an active mapping in the main host physical
address map. As such, the memory to repair must be
identified by a device specific physical addressing scheme
using a device physical address (DPA). The DPA and other control
attributes to use will be presented in related error records.
What: /sys/bus/edac/devices/<dev-name>/mem_repairX/repair_type
Date: March 2025
KernelVersion: 6.15
Contact: linux-edac@vger.kernel.org
Description:
(RO) Memory repair type, e.g. post package repair or
memory sparing. Valid values are:
- ppr - Post package repair.
- cacheline-sparing
- row-sparing
- bank-sparing
- rank-sparing
- All other values are reserved.
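As an illustration only, a userspace tool could discover the repair
instances of a device and their types roughly as sketched below in
Python. The helper name list_repair_instances and its dev_dir
parameter are hypothetical and not part of this ABI; the sketch
assumes that each mem_repairX directory exposes repair_type as one
of the strings listed above.

```python
import os
import re

def list_repair_instances(dev_dir):
    """Map each mem_repairX subdirectory of dev_dir to its repair_type string."""
    result = {}
    for name in sorted(os.listdir(dev_dir)):
        # Only consider mem_repair0, mem_repair1, ... subdirectories.
        if not re.fullmatch(r"mem_repair\d+", name):
            continue
        path = os.path.join(dev_dir, name, "repair_type")
        try:
            with open(path) as f:
                result[name] = f.read().strip()
        except OSError:
            # Attribute not present or not readable: skip this instance.
            continue
    return result
```

A caller would pass the device directory, e.g.
/sys/bus/edac/devices/<dev-name>, and receive a dictionary such as
one mapping mem_repair0 to ppr.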
What: /sys/bus/edac/devices/<dev-name>/mem_repairX/persist_mode
Date: March 2025
KernelVersion: 6.15
Contact: linux-edac@vger.kernel.org
Description:
(RW) Get/Set the current persist repair mode for a repair
function. Depending on the memory repair function, the
device supports a repair that is either temporary (lost
with a power cycle) or permanent. Valid values are:
- 0 - Soft memory repair (temporary repair).
- 1 - Hard memory repair (permanent repair).
- All other values are reserved.
What: /sys/bus/edac/devices/<dev-name>/mem_repairX/repair_safe_when_in_use
Date: March 2025
KernelVersion: 6.15
Contact: linux-edac@vger.kernel.org
Description:
(RO) True if the memory media is accessible and data is retained
during the memory repair operation.
Otherwise, the data may not be retained and memory requests may
not be correctly processed during a repair operation. In such a
case the repair operation cannot be executed at runtime and the
memory must be taken offline.
What: /sys/bus/edac/devices/<dev-name>/mem_repairX/hpa
Date: March 2025
KernelVersion: 6.15
Contact: linux-edac@vger.kernel.org
Description:
(RW) Host Physical Address (HPA) of the memory to repair.
The HPA to use will be provided in related error records.
What: /sys/bus/edac/devices/<dev-name>/mem_repairX/dpa
Date: March 2025
KernelVersion: 6.15
Contact: linux-edac@vger.kernel.org
Description:
(RW) Device Physical Address (DPA) of the memory to repair.
The specific DPA to use will be provided in related error
records.
In some states of system configuration (e.g. before address
decoders have been configured), memory devices (e.g. CXL)
may not have an active mapping in the main host physical
address map. As such, the memory to repair must be
identified by a device specific physical addressing scheme
using a DPA. The device physical address (DPA) to use will be
presented in related error records.
What: /sys/bus/edac/devices/<dev-name>/mem_repairX/nibble_mask
Date: March 2025
KernelVersion: 6.15
Contact: linux-edac@vger.kernel.org
Description:
(RW) Nibble mask of the memory to repair.
The nibble mask identifies one or more nibbles in error on the
memory bus that produced the error event. Nibble mask bit 0
shall be set if nibble 0 on the memory bus produced the
event, etc. For example, in CXL PPR and sparing, a nibble mask
bit set to 1 indicates the request to perform the repair
operation in the specific device. All nibble mask bits set
to 1 indicates the request to perform the operation in all
devices. E.g. for CXL memory repair, the specific value of
nibble mask to use will be provided in related error records.
For more details, see the nibble mask field in CXL spec ver 3.1,
section 8.2.9.7.1.2 Table 8-103 soft PPR, section
8.2.9.7.1.3 Table 8-104 hard PPR and section 8.2.9.7.1.4
Table 8-105 memory sparing.
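The bit numbering described above can be illustrated with a short
Python sketch. The function name nibble_mask is hypothetical, and
the sketch deliberately does not assume a particular mask width; in
practice the value to write normally comes from the related error
record rather than being computed by hand.

```python
def nibble_mask(nibbles):
    """Return a mask with bit n set for every nibble index n in nibbles."""
    mask = 0
    for n in nibbles:
        if n < 0:
            raise ValueError("nibble index must be non-negative")
        mask |= 1 << n
    return mask
```

For instance, nibbles 0 and 3 in error give nibble_mask([0, 3]),
which evaluates to 0x9.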
What: /sys/bus/edac/devices/<dev-name>/mem_repairX/min_hpa
What: /sys/bus/edac/devices/<dev-name>/mem_repairX/max_hpa
What: /sys/bus/edac/devices/<dev-name>/mem_repairX/min_dpa
What: /sys/bus/edac/devices/<dev-name>/mem_repairX/max_dpa
Date: March 2025
KernelVersion: 6.15
Contact: linux-edac@vger.kernel.org
Description:
(RW) The supported range of memory addresses that can be
repaired. The memory device may give the supported range of
attributes to use and it will depend on the memory device
and the portion of memory to repair.
Userspace may receive the specific value of attributes
to use for a repair operation from the memory device via
related error records and trace events, e.g. CXL DRAM
and CXL general media error records in CXL memory devices.
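As a rough sketch, a userspace tool could check a DPA taken from an
error record against the advertised range before requesting a
repair. The helper names below are hypothetical. The sketch assumes
that min_dpa and max_dpa read back as decimal or 0x-prefixed
hexadecimal numbers and treats both bounds as inclusive; neither
assumption is stated by this ABI.

```python
import os

def read_int_attr(repair_dir, name):
    """Read a numeric sysfs attribute; base 0 accepts decimal and 0x-prefixed hex."""
    with open(os.path.join(repair_dir, name)) as f:
        return int(f.read().strip(), 0)

def dpa_in_range(repair_dir, dpa):
    """Return True if dpa lies within [min_dpa, max_dpa] (bounds assumed inclusive)."""
    lo = read_int_attr(repair_dir, "min_dpa")
    hi = read_int_attr(repair_dir, "max_dpa")
    return lo <= dpa <= hi
```

The same pattern would apply to min_hpa and max_hpa when the memory
is identified by HPA.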
What: /sys/bus/edac/devices/<dev-name>/mem_repairX/bank_group
What: /sys/bus/edac/devices/<dev-name>/mem_repairX/bank
What: /sys/bus/edac/devices/<dev-name>/mem_repairX/rank
What: /sys/bus/edac/devices/<dev-name>/mem_repairX/row
What: /sys/bus/edac/devices/<dev-name>/mem_repairX/column
What: /sys/bus/edac/devices/<dev-name>/mem_repairX/channel
What: /sys/bus/edac/devices/<dev-name>/mem_repairX/sub_channel
Date: March 2025
KernelVersion: 6.15
Contact: linux-edac@vger.kernel.org
Description:
(RW) The control attributes for the memory to be repaired.
The specific value of attributes to use depends on the
portion of memory to repair and will be reported to the host
in related error records and be available to userspace
in trace events, such as CXL DRAM and CXL general media
error records of CXL memory devices.
Reading back these attributes returns the current
value of the memory requested to be repaired.
bank_group - The bank group of the memory to repair.
bank - The bank number of the memory to repair.
rank - The rank of the memory to repair. Rank is defined as a
set of memory devices on a channel that together execute a
transaction.
row - The row number of the memory to repair.
column - The column number of the memory to repair.
channel - The channel of the memory to repair. Channel is
defined as an interface that can be independently accessed
for a transaction.
sub_channel - The subchannel of the memory to repair.
The requirement to set these attributes varies based on the
repair function. The attributes in sysfs are not present
unless required for a repair function.
For example, for the CXL spec ver 3.1, Section 8.2.9.7.1.2 Table 8-103
soft PPR and Section 8.2.9.7.1.3 Table 8-104 hard PPR operations,
these attributes do not need to be set. For CXL spec ver 3.1,
Section 8.2.9.7.1.4 Table 8-105 memory sparing, these attributes
must be set based on the memory sparing granularity.
What: /sys/bus/edac/devices/<dev-name>/mem_repairX/repair
Date: March 2025
KernelVersion: 6.15
Contact: linux-edac@vger.kernel.org
Description:
(WO) Issue the memory repair operation for the specified
memory repair attributes. The operation may fail if resources
are insufficient based on the requirements of the memory
device and repair function.
- 1 - Issue the repair operation.
- All other values are reserved.
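Putting the attributes together, a soft PPR request could be issued
from userspace roughly as in the Python sketch below. The helper
names write_attr and issue_soft_ppr are hypothetical. The sketch
assumes that dpa and nibble_mask accept 0x-prefixed hexadecimal
strings and that the attributes may be written in this order before
the repair trigger; the DPA and nibble mask values would come from
the related error record.

```python
import os

def write_attr(repair_dir, name, value):
    """Write a string value to one attribute file under repair_dir."""
    with open(os.path.join(repair_dir, name), "w") as f:
        f.write(value)

def issue_soft_ppr(repair_dir, dpa, mask):
    """Select soft repair, set the target address and mask, then trigger the repair."""
    write_attr(repair_dir, "persist_mode", "0")   # 0 = soft (temporary) repair
    write_attr(repair_dir, "dpa", hex(dpa))
    write_attr(repair_dir, "nibble_mask", hex(mask))
    write_attr(repair_dir, "repair", "1")         # 1 = issue the repair operation
```

A write may raise OSError if the device rejects a value or the
repair fails, for example when resources are insufficient, so a
real tool would handle that error around the call.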