NDCTL-INJECT-ERROR(1)                                  NDCTL-INJECT-ERROR(1)

ndctl-inject-error - inject media errors at a namespace offset

ndctl inject-error <namespace> [<options>]

The capacity of an NVDIMM REGION (contiguous span of persistent memory)
is accessed via one or more NAMESPACE devices. REGION is the Linux term
for what ACPI and UEFI call a DIMM-interleave-set, or a
system-physical-address-range that is striped (by the memory
controller) across one or more memory modules.

The UEFI specification defines the NVDIMM Label Protocol as the
combination of label area access methods and a data format for
provisioning one or more NAMESPACE objects from a REGION. Note that
label support is optional and if Linux does not detect the label
capability it will automatically instantiate a "label-less" namespace
per region. Examples of label-less namespaces are the ones created by
the kernel’s memmap=ss!nn command line option (see the nvdimm wiki on
kernel.org), or NVDIMMs without a valid namespace index in their label
area.

Note
    Label-less namespaces lack many of the features of their label-rich
    cousins. For example, their size cannot be modified, and they cannot
    be fully destroyed (i.e. the space reclaimed). A destroy operation
    will zero any mode-specific metadata. Finally, for create-namespace
    operations on label-less namespaces, ndctl bypasses the region
    capacity availability checks, and always satisfies the request
    using the full region capacity. The only reconfiguration operation
    supported on a label-less namespace is changing its mode.

A namespace can be provisioned to operate in one of 4 modes, fsdax,
devdax, sector, and raw. Here are the expected usage models for these
modes:

• fsdax: Filesystem-DAX mode is the default mode of a namespace when
  specifying ndctl create-namespace with no options. It creates a
  block device (/dev/pmemX[.Y]) that supports the DAX capabilities of
  Linux filesystems (xfs and ext4 to date). DAX removes the page
  cache from the I/O path and allows mmap(2) to establish direct
  mappings to persistent memory media. The DAX capability enables
  workloads / working-sets that would exceed the capacity of the page
  cache to scale up to the capacity of persistent memory. Workloads
  that fit in page cache or perform bulk data transfers may not see
  benefit from DAX. When in doubt, pick this mode.

• devdax: Device-DAX mode enables similar mmap(2) DAX mapping
  capabilities as Filesystem-DAX. However, instead of a block-device
  that can support a DAX-enabled filesystem, this mode emits a single
  character device file (/dev/daxX.Y). Use this mode to assign
  persistent memory to a virtual-machine, register persistent memory
  for RDMA, or when gigantic mappings are needed.

• sector: Use this mode to host legacy filesystems that do not
  checksum metadata or applications that are not prepared for torn
  sectors after a crash. Expected usage for this mode is for small
  boot volumes. This mode is compatible with other operating systems.

• raw: Raw mode is effectively just a memory disk that does not
  support DAX. Typically this indicates a namespace that was created
  by tooling or another operating system that did not know how to
  create a Linux fsdax or devdax mode namespace. This mode is
  compatible with other operating systems, but again, does not
  support DAX operation.

ndctl-inject-error can be used to ask the platform to simulate media
errors in the NVDIMM address space to aid debugging and development of
features related to error handling.

By default, an injection covers only the first n bytes of each
specified block, where n is the value returned by
ndctl_cmd_ars_cap_get_size(). In other words, only one ars_unit is
injected per sector. This is sufficient for Linux to mark the whole
sector as bad, and the sector will show up as such in the various
badblocks lists in the kernel. This can be overridden by the
--saturate option, which forces the entire block to be injected as an
error.
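
The difference between the default behaviour and --saturate can be
illustrated with plain shell arithmetic. The sketch below is not part of
ndctl and never touches the hardware. The function name injected_ranges
is hypothetical, and the ARS unit of 256 bytes is an assumed value chosen
for illustration; the real value is whatever
ndctl_cmd_ars_cap_get_size() reports on a given platform.

```shell
#!/bin/sh
# Illustration only: print the namespace-relative byte ranges that an
# injection would cover. Nothing here calls ndctl.
BLOCK_SIZE=512              # inject-error always counts in 512-byte blocks
ARS_UNIT=${ARS_UNIT:-256}   # ASSUMED value; the platform reports the real one

# injected_ranges <block> <count> [saturate]   (hypothetical helper)
injected_ranges() {
    block=$1 count=$2 mode=${3:-default}
    i=0
    while [ "$i" -lt "$count" ]; do
        start=$(( (block + i) * BLOCK_SIZE ))
        if [ "$mode" = saturate ]; then
            len=$BLOCK_SIZE     # --saturate: the whole block
        else
            len=$ARS_UNIT       # default: only the first n bytes
        fi
        echo "$start-$(( start + len - 1 ))"
        i=$(( i + 1 ))
    done
}

injected_ranges 12 2            # prints 6144-6399 and 6656-6911
injected_ranges 12 2 saturate   # prints 6144-6655 and 6656-7167
```

Under these assumptions, the default injection at blocks 12 and 13
poisons only the first 256 bytes of each block, yet the kernel still
treats both 512-byte sectors as bad.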

Warning
    These commands are DANGEROUS and can cause data loss. They are only
    provided for testing and debugging purposes.

Inject errors in namespace0.0 at block 12 for 2 blocks (i.e. 12, 13)

    ndctl inject-error --block=12 --count=2 namespace0.0

Check status of injected errors on namespace0.0

    ndctl inject-error --status namespace0.0

Uninject errors at block 12 for 2 blocks on namespace0.0

    ndctl inject-error --uninject --block=12 --count=2 namespace0.0
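
The three commands above are often run as one cycle in a test script.
The following is a minimal sketch of such a wrapper, using only the
options shown in this page. The helper names run and error_cycle are
hypothetical, and DRY_RUN defaults to 1 so that the commands are only
printed rather than executed.

```shell
#!/bin/sh
# Sketch of an inject / check / uninject cycle. With DRY_RUN=1 (the
# default here) the commands are printed; DRY_RUN=0 really runs them,
# which is DANGEROUS and can cause data loss.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# error_cycle <namespace> <block> <count>   (hypothetical helper)
error_cycle() {
    ns=$1 block=$2 count=$3
    run ndctl inject-error --block="$block" --count="$count" "$ns" || return
    run ndctl inject-error --status "$ns" || return
    run ndctl inject-error --uninject --block="$block" --count="$count" "$ns"
}

error_cycle namespace0.0 12 2
```

Printing the commands first makes it easy to review the namespace name
and block range before anything is injected.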

-B, --block=
    Namespace block offset in 512 byte sized blocks where the error is
    to be injected.

    NOTE: The offset is interpreted in different ways based on the "mode"
    of the namespace. For "raw" mode, the offset is the base namespace
    offset. For "fsdax" mode (i.e. a "pfn" namespace), the offset is
    relative to the user-visible part of the namespace, and the offset
    introduced by the kernel's metadata will be accounted for. For a
    "sector" mode namespace (i.e. a "BTT" namespace), the offset is
    relative to the base namespace, as the BTT translation details are
    internal to the kernel, and can't be accounted for while injecting
    errors.
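
Because --block and --count are expressed in 512 byte blocks, a byte
offset has to be converted before it is passed to the command. The
sketch below shows that arithmetic only; the helper names
bytes_to_block and span_to_count are hypothetical, and the sketch does
not apply any of the mode-specific offset handling described in the
note above.

```shell
#!/bin/sh
# bytes_to_block <byte-offset>
# Print the 512-byte block that contains the given byte offset.
bytes_to_block() {
    echo $(( $1 / 512 ))
}

# span_to_count <byte-offset> <length>
# Print how many 512-byte blocks a byte span touches, including
# partially covered blocks at either end.
span_to_count() {
    first=$(( $1 / 512 ))
    last=$(( ($1 + $2 - 1) / 512 ))
    echo $(( last - first + 1 ))
}

bytes_to_block 6144      # prints 12
span_to_count 6400 512   # prints 2: the span straddles blocks 12 and 13
```

An unaligned span can touch one more block than its length alone
suggests, which is why the count is derived from the first and last
block rather than from the length divided by 512.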

-n, --count=
    Number of blocks to inject as errors. This is also in terms of
    fixed, 512 byte blocks.

-d, --uninject
    This option will ask the platform to remove any injected errors for
    the specified block offset, and count.

    WARNING: This will not clear the kernel's internal badblock tracking;
    those entries can only be cleared by writing to the affected
    locations. Hence use the --uninject option only if you know exactly
    what you are doing. For normal usage, injected errors should only be
    cleared by doing writes. Do not expect to have the original data
    intact after injecting an error and clearing it using --uninject: it
    will be lost, as the only "real" way to clear the error location is
    to write to it or zero it (truncate/hole-punch).
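
As an illustration of clearing by writing, the sketch below only
builds and prints a dd command that would overwrite the affected
blocks with zeroes. Everything in it is an assumption made for the
example: the helper name clear_cmd is hypothetical, /dev/pmem0 is a
placeholder device, oflag=direct assumes GNU dd, and the block number
is assumed to map directly onto the block device (which does not hold
for a "sector" mode namespace). Running the printed command destroys
the data in those blocks.

```shell
#!/bin/sh
# clear_cmd <block-device> <block> <count>   (hypothetical helper)
# Print, but do not run, a dd invocation that zeroes <count> 512-byte
# blocks starting at <block> on the given device.
clear_cmd() {
    echo "dd if=/dev/zero of=$1 bs=512 seek=$2 count=$3 oflag=direct"
}

clear_cmd /dev/pmem0 12 2
```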

-t, --status
    This option will retrieve the status of injected errors. Note that
    this will not retrieve all known/latent errors (i.e. non injected
    ones), and is NOT equivalent to performing an Address Range Scrub.

-N, --no-notify
    This option is only valid when injecting errors. By default, the
    error inject command will ask platform firmware to trigger a
    notification in the kernel, asking it to update its state of known
    errors. With this option, the error will still be injected, but the
    kernel will not get a notification, and the error will appear as a
    latent media error when the location is accessed. If the platform
    firmware does not support this feature, this will have no effect.

-S, --saturate
    This option forces error injection or un-injection to cover the
    entire address range covered by the specified block(s).

-v, --verbose
    Emit debug messages for the error injection process.

-u, --human
    Format numbers representing storage sizes, or offsets as human
    readable strings with units instead of the default machine-friendly
    raw-integer data. Convert other numeric fields into hexadecimal
    strings.

-r, --region=
    A regionX device name, or a region id number. Restrict the
    operation to the specified region(s). The keyword all can be
    specified to indicate the lack of any restriction, however this is
    the same as not supplying a --region option at all.

-b, --bus=
    A bus id number, or a provider string (e.g. "ACPI.NFIT"). Restrict
    the operation to the specified bus(es). The keyword all can be
    specified to indicate the lack of any restriction, however this is
    the same as not supplying a --bus option at all.

Copyright © 2016 - 2022, Intel Corporation. License GPLv2: GNU GPL
version 2 http://gnu.org/licenses/gpl.html. This is free software: you
are free to change and redistribute it. There is NO WARRANTY, to the
extent permitted by law.

ndctl-list(1)

01/13/2023                                             NDCTL-INJECT-ERROR(1)