ndctl-inject-error(1)

NDCTL-INJECT-ERROR(1)            ndctl Manual            NDCTL-INJECT-ERROR(1)

NAME

       ndctl-inject-error - inject media errors at a namespace offset

SYNOPSIS

       ndctl inject-error <namespace> [<options>]

THEORY OF OPERATION

       The capacity of an NVDIMM REGION (contiguous span of persistent memory)
       is accessed via one or more NAMESPACE devices. REGION is the Linux term
       for what ACPI and UEFI call a DIMM-interleave-set, or a
       system-physical-address-range that is striped (by the memory
       controller) across one or more memory modules.

       The UEFI specification defines the NVDIMM Label Protocol as the
       combination of label area access methods and a data format for
       provisioning one or more NAMESPACE objects from a REGION. Note that
       label support is optional, and if Linux does not detect the label
       capability it will automatically instantiate a "label-less" namespace
       per region. Examples of label-less namespaces are the ones created by
       the kernel’s memmap=ss!nn command line option (see the nvdimm wiki on
       kernel.org), or NVDIMMs without a valid namespace index in their label
       area.

           Note
           Label-less namespaces lack many of the features of their label-rich
           cousins. For example, their size cannot be modified, and they cannot
           be fully destroyed (i.e. the space reclaimed). A destroy operation
           will zero any mode-specific metadata. Finally, for create-namespace
           operations on label-less namespaces, ndctl bypasses the region
           capacity availability checks, and always satisfies the request
           using the full region capacity. The only reconfiguration operation
           supported on a label-less namespace is changing its mode.

       A namespace can be provisioned to operate in one of four modes: fsdax,
       devdax, sector, and raw. Here are the expected usage models for these
       modes:

       ·   fsdax: Filesystem-DAX mode is the default mode of a namespace when
           specifying ndctl create-namespace with no options. It creates a
           block device (/dev/pmemX[.Y]) that supports the DAX capabilities of
           Linux filesystems (xfs and ext4 to date). DAX removes the page
           cache from the I/O path and allows mmap(2) to establish direct
           mappings to persistent memory media. The DAX capability enables
           workloads / working-sets that would exceed the capacity of the page
           cache to scale up to the capacity of persistent memory. Workloads
           that fit in page cache or perform bulk data transfers may not see
           benefit from DAX. When in doubt, pick this mode.

       ·   devdax: Device-DAX mode enables similar mmap(2) DAX mapping
           capabilities as Filesystem-DAX. However, instead of a block-device
           that can support a DAX-enabled filesystem, this mode emits a single
           character device file (/dev/daxX.Y). Use this mode to assign
           persistent memory to a virtual-machine, register persistent memory
           for RDMA, or when gigantic mappings are needed.

       ·   sector: Use this mode to host legacy filesystems that do not
           checksum metadata or applications that are not prepared for torn
           sectors after a crash. Expected usage for this mode is for small
           boot volumes. This mode is compatible with other operating systems.

       ·   raw: Raw mode is effectively just a memory disk that does not
           support DAX. Typically this indicates a namespace that was created
           by tooling or another operating system that did not know how to
           create a Linux fsdax or devdax mode namespace. This mode is
           compatible with other operating systems, but again, does not
           support DAX operation.

       ndctl-inject-error can be used to ask the platform to simulate media
       errors in the NVDIMM address space to aid debugging and development of
       features related to error handling.

       By default, injecting an error only injects an error into the first n
       bytes of the block, where n is the output of
       ndctl_cmd_ars_cap_get_size(). In other words, only one ars_unit is
       injected per sector. This is sufficient for Linux to mark the whole
       sector as bad, and the sector will show up as such in the various
       badblocks lists in the kernel. If multiple blocks are being injected,
       only the first n bytes of each block specified will be injected as
       errors. This can be overridden by the --saturate option, which will
       force the entire block to be injected as an error.

           Warning
           These commands are DANGEROUS and can cause data loss. They are only
           provided for testing and debugging purposes.
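       The difference between the default behavior and --saturate described
       above can be illustrated with a little arithmetic. The following is a
       minimal sketch, not part of ndctl: the function name injected_bytes and
       the 256 byte ARS unit are assumptions chosen for illustration. The real
       unit size is whatever ndctl_cmd_ars_cap_get_size() reports on the
       target platform, and the sketch assumes it is no larger than 512 bytes.

```shell
#!/bin/sh
# Assumed values for illustration only; the real ARS unit size comes from
# ndctl_cmd_ars_cap_get_size() on the target platform.
ARS_UNIT=256      # hypothetical ARS unit size in bytes (assumed <= 512)
BLOCK_SIZE=512    # inject-error counts in fixed 512 byte blocks

# Print the number of bytes marked as errors for a given block count.
# $1 = block count, $2 = "saturate" to model --saturate (optional)
injected_bytes() {
    count=$1
    mode=${2:-default}
    if [ "$mode" = "saturate" ]; then
        per_block=$BLOCK_SIZE   # whole block is injected
    else
        per_block=$ARS_UNIT     # only the first ARS unit of each block
    fi
    echo $((count * per_block))
}

injected_bytes 2            # prints 512 with the assumed 256 byte unit
injected_bytes 2 saturate   # prints 1024
```

       In both cases Linux treats the two 512 byte sectors as bad; only the
       amount of media actually marked by the platform differs.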

EXAMPLES

       Inject errors in namespace0.0 at block 12 for 2 blocks (i.e. 12, 13)

           ndctl inject-error --block=12 --count=2 namespace0.0

       Check status of injected errors on namespace0.0

           ndctl inject-error --status namespace0.0

       Uninject errors at block 12 for 2 blocks on namespace0.0

           ndctl inject-error --uninject --block=12 --count=2 namespace0.0
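       The --block and --count values in these examples are expressed in 512
       byte blocks. When the location of interest is known as a byte offset
       and length instead, a conversion along the following lines may help.
       This is only a sketch: the helper names byte_offset_to_block and
       byte_range_to_count are hypothetical and not part of ndctl, and the
       namespace-mode-specific interpretation of the offset still applies.

```shell
#!/bin/sh
# Convert a byte offset into a 512 byte block number suitable for --block.
# Integer division rounds down, so the result is the block containing
# the given offset.
byte_offset_to_block() {
    echo $(( $1 / 512 ))
}

# Number of 512 byte blocks needed to cover LENGTH bytes starting at OFFSET,
# suitable for --count. $1 = byte offset, $2 = length in bytes (>= 1)
byte_range_to_count() {
    offset=$1
    length=$2
    first=$(( offset / 512 ))
    last=$(( (offset + length - 1) / 512 ))
    echo $(( last - first + 1 ))
}

byte_offset_to_block 6144        # prints 12
byte_range_to_count 6144 1024    # prints 2
```

       With these values, a byte range starting at offset 6144 and spanning
       1024 bytes corresponds to --block=12 --count=2, as in the first
       example.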

OPTIONS

       -B, --block=
           Namespace block offset, in 512 byte sized blocks, where the error
           is to be injected.

               NOTE: The offset is interpreted in different ways based on the "mode"
               of the namespace. For "raw" mode, the offset is the base namespace
               offset. For "fsdax" mode (i.e. a "pfn" namespace), the offset is
               relative to the user-visible part of the namespace, and the offset
               introduced by the kernel's metadata will be accounted for. For a
               "sector" mode namespace (i.e. a "BTT" namespace), the offset is
               relative to the base namespace, as the BTT translation details are
               internal to the kernel, and can't be accounted for while injecting
               errors.

       -n, --count=
           Number of blocks to inject as errors. This is also in terms of
           fixed, 512 byte blocks.

       -d, --uninject
           This option will ask the platform to remove any injected errors for
           the specified block offset and count.

               WARNING: This will not clear the kernel's internal badblock tracking;
               those entries can only be cleared by doing a write to the affected
               locations. Hence use the --uninject option only if you know exactly
               what you are doing. For normal usage, injected errors should only be
               cleared by doing writes. Do not expect to have the original data
               intact after injecting an error and clearing it using --uninject. It
               will be lost, as the only "real" way to clear the error location is
               to write to it or zero it (truncate/hole-punch).

       -t, --status
           This option will retrieve the status of injected errors. Note that
           this will not retrieve all known/latent errors (i.e. non-injected
           ones), and is NOT equivalent to performing an Address Range Scrub.

       -N, --no-notify
           This option is only valid when injecting errors. By default, the
           error inject command will ask platform firmware to trigger a
           notification in the kernel, asking it to update its state of known
           errors. With this option, the error will still be injected, but the
           kernel will not get a notification, and the error will appear as a
           latent media error when the location is accessed. If the platform
           firmware does not support this feature, this option will have no
           effect.

       -S, --saturate
           This option forces error injection or un-injection to cover the
           entire address range covered by the specified block(s).

       -v, --verbose
           Emit debug messages for the error injection process.

       -u, --human
           Format numbers representing storage sizes, or offsets as human
           readable strings with units instead of the default machine-friendly
           raw-integer data. Convert other numeric fields into hexadecimal
           strings.

       -r, --region=
           A regionX device name, or a region id number. Restrict the
           operation to the specified region(s). The keyword all can be
           specified to indicate the lack of any restriction, however this is
           the same as not supplying a --region option at all.

       -b, --bus=
           A bus id number, or a provider string (e.g. "ACPI.NFIT"). Restrict
           the operation to the specified bus(es). The keyword all can be
           specified to indicate the lack of any restriction, however this is
           the same as not supplying a --bus option at all.
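       The WARNING under --uninject states that the kernel's badblock tracking
       is only cleared by writing to the affected locations. As a rough
       sketch of what such a write could look like, the helper below prints
       (but deliberately does not run) a dd command that zeroes a range of
       512 byte blocks. The helper name print_clear_cmd and the device
       /dev/pmem0 are assumptions for illustration; the actual block device
       for a namespace must be looked up on the system. The sketch also
       assumes a raw or fsdax namespace, where the --block offset lines up
       with the sectors of the block device. This is not the case for
       sector mode.

```shell
#!/bin/sh
# Print (do not execute) a dd command that overwrites COUNT 512 byte blocks
# with zeroes, starting at BLOCK on DEVICE. Running the printed command
# destroys whatever data was stored in those blocks.
# $1 = block device (assumed, e.g. /dev/pmem0)
# $2 = first 512 byte block
# $3 = number of blocks
print_clear_cmd() {
    echo "dd if=/dev/zero of=$1 bs=512 seek=$2 count=$3 oflag=direct"
}

print_clear_cmd /dev/pmem0 12 2
# prints: dd if=/dev/zero of=/dev/pmem0 bs=512 seek=12 count=2 oflag=direct
```

       Printing the command rather than executing it leaves a chance to
       double-check the device and range before any data is overwritten.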

COPYRIGHT

       Copyright (c) 2016 - 2019, Intel Corporation. License GPLv2: GNU GPL
       version 2 http://gnu.org/licenses/gpl.html. This is free software: you
       are free to change and redistribute it. There is NO WARRANTY, to the
       extent permitted by law.
