ndctl-inject-error(1)

NDCTL-INJECT-ERROR(1)            ndctl Manual            NDCTL-INJECT-ERROR(1)

NAME

       ndctl-inject-error - inject media errors at a namespace offset

SYNOPSIS

       ndctl inject-error <namespace> [<options>]

THEORY OF OPERATION

       The capacity of an NVDIMM REGION (contiguous span of persistent memory)
       is accessed via one or more NAMESPACE devices. REGION is the Linux term
       for what ACPI and UEFI call a DIMM-interleave-set, or a
       system-physical-address-range that is striped (by the memory
       controller) across one or more memory modules.

       The UEFI specification defines the NVDIMM Label Protocol as the
       combination of label area access methods and a data format for
       provisioning one or more NAMESPACE objects from a REGION. Note that
       label support is optional, and if Linux does not detect the label
       capability it will automatically instantiate a "label-less" namespace
       per region. Examples of label-less namespaces are the ones created by
       the kernel’s memmap=ss!nn command line option (see the nvdimm wiki on
       kernel.org), or NVDIMMs without a valid namespace index in their label
       area.

       A namespace can be provisioned to operate in one of four modes: fsdax,
       devdax, sector, and raw. Here are the expected usage models for these
       modes:

       ·   fsdax: Filesystem-DAX mode is the default mode of a namespace when
           specifying ndctl create-namespace with no options. It creates a
           block device (/dev/pmemX[.Y]) that supports the DAX capabilities of
           Linux filesystems (xfs and ext4 to date). DAX removes the page
           cache from the I/O path and allows mmap(2) to establish direct
           mappings to persistent memory media. The DAX capability enables
           workloads / working-sets that would exceed the capacity of the page
           cache to scale up to the capacity of persistent memory. Workloads
           that fit in page cache or perform bulk data transfers may not see
           benefit from DAX. When in doubt, pick this mode.

       ·   devdax: Device-DAX mode enables similar mmap(2) DAX mapping
           capabilities as Filesystem-DAX. However, instead of a block-device
           that can support a DAX-enabled filesystem, this mode emits a single
           character device file (/dev/daxX.Y). Use this mode to assign
           persistent memory to a virtual-machine, register persistent memory
           for RDMA, or when gigantic mappings are needed.

       ·   sector: Use this mode to host legacy filesystems that do not
           checksum metadata or applications that are not prepared for torn
           sectors after a crash. Expected usage for this mode is for small
           boot volumes. This mode is compatible with other operating systems.

       ·   raw: Raw mode is effectively just a memory disk that does not
           support DAX. Typically this indicates a namespace that was created
           by tooling or another operating system that did not know how to
           create a Linux fsdax or devdax mode namespace. This mode is
           compatible with other operating systems, but again, does not
           support DAX operation.

       ndctl-inject-error can be used to ask the platform to simulate media
       errors in the NVDIMM address space to aid debugging and development of
       features related to error handling.

       By default, injecting an error only injects an error into the first n
       bytes of each block, where n is the output of
       ndctl_cmd_ars_cap_get_size(). In other words, only one ars_unit is
       injected per sector. This is sufficient for Linux to mark the whole
       sector as bad, and the sector will show up as such in the various
       badblocks lists in the kernel. If multiple blocks are being injected,
       only the first n bytes of each block specified will be injected as
       errors. This can be overridden by the --saturate option, which forces
       the entire block to be injected as an error.

           Warning
           These commands are DANGEROUS and can cause data loss. They are only
           provided for testing and debugging purposes.
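       The difference between the default behaviour and --saturate described
       above can be made concrete with a little arithmetic. The sketch below
       is illustrative only: injected_bytes is a hypothetical helper, and the
       value used for n (ARS_UNIT=256) is an assumption, since the real value
       is platform specific. It also assumes n is no larger than one 512 byte
       block.

```shell
#!/bin/sh
# Sketch only: estimates how many bytes are injected as errors.
# BLOCK_SIZE is the fixed 512 byte unit used by --block and --count.
# ARS_UNIT is an ASSUMED value for n; the real value is platform specific
# and is what ndctl_cmd_ars_cap_get_size() reports.
BLOCK_SIZE=512
ARS_UNIT=256

# injected_bytes COUNT [saturate]   (hypothetical helper, not part of ndctl)
injected_bytes() {
    count=$1
    if [ "${2:-}" = "saturate" ]; then
        # --saturate: every byte of every block is injected
        echo $((count * BLOCK_SIZE))
    else
        # default: only the first n bytes of each block are injected
        echo $((count * ARS_UNIT))
    fi
}

injected_bytes 2            # prints 512 under the assumed ARS_UNIT
injected_bytes 2 saturate   # prints 1024
```

       In both cases Linux marks each affected sector as bad in its entirety;
       the two modes differ only in how much of the block the platform is
       asked to treat as an injected error.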

EXAMPLES

       Inject errors in namespace0.0 at block 12 for 2 blocks (i.e. 12, 13)

           ndctl inject-error --block=12 --count=2 namespace0.0

       Check status of injected errors on namespace0.0

           ndctl inject-error --status namespace0.0

       Uninject errors at block 12 for 2 blocks on namespace0.0

           ndctl inject-error --uninject --block=12 --count=2 namespace0.0
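       Because these commands can destroy data, it may help to review the
       exact command lines before running anything. The following sketch
       prints the inject, status, and uninject commands for a given range
       without executing them. print_cycle is a hypothetical helper, not part
       of ndctl, and it uses only the options shown in the examples above.

```shell
#!/bin/sh
# Sketch only: prints the commands for an inject / status / uninject cycle
# instead of running them. print_cycle is a hypothetical helper.

# print_cycle NAMESPACE BLOCK COUNT
print_cycle() {
    ns=$1
    block=$2
    count=$3
    echo "ndctl inject-error --block=$block --count=$count $ns"
    echo "ndctl inject-error --status $ns"
    echo "ndctl inject-error --uninject --block=$block --count=$count $ns"
}

print_cycle namespace0.0 12 2
```

       The printed lines can be inspected and then run by hand, one at a time,
       on a system that is dedicated to testing.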

OPTIONS

       -B, --block=
           Namespace block offset in 512 byte sized blocks where the error is
           to be injected.

               NOTE: The offset is interpreted in different ways based on the "mode"
               of the namespace. For "raw" mode, the offset is the base namespace
               offset. For "fsdax" mode (i.e. a "pfn" namespace), the offset is
               relative to the user-visible part of the namespace, and the offset
               introduced by the kernel's metadata will be accounted for. For a
               "sector" mode namespace (i.e. a "BTT" namespace), the offset is
               relative to the base namespace, as the BTT translation details are
               internal to the kernel, and can't be accounted for while injecting
               errors.

       -n, --count=
           Number of blocks to inject as errors. This is also in terms of
           fixed, 512 byte blocks.

       -d, --uninject
           This option will ask the platform to remove any injected errors for
           the specified block offset and count.

               WARNING: This will not clear the kernel's internal badblock tracking;
               those entries can only be cleared by writing to the affected
               locations. Hence use the --uninject option only if you know exactly
               what you are doing. For normal usage, injected errors should only be
               cleared by doing writes. Do not expect to have the original data
               intact after injecting an error and clearing it using --uninject: it
               will be lost, as the only "real" way to clear the error location is
               to write to it or zero it (truncate/hole-punch).

       -t, --status
           This option will retrieve the status of injected errors. Note that
           this will not retrieve all known/latent errors (i.e. non injected
           ones), and is NOT equivalent to performing an Address Range Scrub.

       -N, --no-notify
           This option is only valid when injecting errors. By default, the
           error inject command will ask platform firmware to trigger a
           notification in the kernel, asking it to update its state of known
           errors. With this option, the error will still be injected, but the
           kernel will not get a notification, and the error will appear as a
           latent media error when the location is accessed. If the platform
           firmware does not support this feature, this will have no effect.

       -S, --saturate
           This option forces error injection or un-injection to cover the
           entire address range covered by the specified block(s).

       -v, --verbose
           Emit debug messages for the error injection process.

       -u, --human
           Format numbers representing storage sizes, or offsets as human
           readable strings with units instead of the default machine-friendly
           raw-integer data. Convert other numeric fields into hexadecimal
           strings.

       -r, --region=
           A 'regionX' device name, or a region id number. The keyword 'all'
           can be specified to carry out the operation on every region in the
           system, optionally filtered by bus id (see --bus= option).

       -b, --bus=
           Enforce that the operation only be carried out on devices that are
           attached to the given bus, where bus can be a provider name or a
           bus id number.
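       Since --block and --count are both expressed in fixed 512 byte blocks,
       a byte range has to be converted before it can be passed to the
       command. The sketch below shows one way to do that conversion.
       range_to_blocks is a hypothetical helper, not part of ndctl, and it
       assumes the byte offset is already expressed in the frame that applies
       to the namespace mode (see the NOTE under --block) and that the length
       is greater than zero.

```shell
#!/bin/sh
# Sketch only: converts a byte range into --block / --count values.
# range_to_blocks is a hypothetical helper, not part of ndctl.
BLOCK_SIZE=512

# range_to_blocks BYTE_OFFSET BYTE_LENGTH   -> prints "<block> <count>"
# BYTE_LENGTH is assumed to be greater than zero.
range_to_blocks() {
    offset=$1
    length=$2
    # first and last 512 byte block touched by the byte range
    first=$(( offset / BLOCK_SIZE ))
    last=$(( (offset + length - 1) / BLOCK_SIZE ))
    echo "$first $(( last - first + 1 ))"
}

range_to_blocks 6144 1024   # aligned range: prints "12 2"
range_to_blocks 6400 512    # unaligned range straddles two blocks: "12 2"
```

       The two printed numbers correspond to the values given to --block= and
       --count= respectively. Rounding outward to whole blocks matches the way
       the kernel tracks bad sectors.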

COPYRIGHT

       Copyright (c) 2016 - 2018, Intel Corporation. License GPLv2: GNU GPL
       version 2 http://gnu.org/licenses/gpl.html. This is free software: you
       are free to change and redistribute it. There is NO WARRANTY, to the
       extent permitted by law.
