NDCTL-CREATE-NAMESPACE(1)        ndctl Manual        NDCTL-CREATE-NAMESPACE(1)

NAME

       ndctl-create-namespace - provision or reconfigure a namespace

SYNOPSIS

       ndctl create-namespace [<options>]

THEORY OF OPERATION

       The capacity of an NVDIMM REGION (contiguous span of persistent memory)
       is accessed via one or more NAMESPACE devices. REGION is the Linux term
       for what ACPI and UEFI call a DIMM-interleave-set, or a
       system-physical-address-range that is striped (by the memory
       controller) across one or more memory modules.

       The UEFI specification defines the NVDIMM Label Protocol as the
       combination of label area access methods and a data format for
       provisioning one or more NAMESPACE objects from a REGION. Note that
       label support is optional and if Linux does not detect the label
       capability it will automatically instantiate a "label-less" namespace
       per region. Examples of label-less namespaces are the ones created by
       the kernel’s memmap=ss!nn command line option (see the nvdimm wiki on
       kernel.org), or NVDIMMs without a valid namespace index in their label
       area.

       A namespace can be provisioned to operate in one of four modes: fsdax,
       devdax, sector, and raw. Here are the expected usage models for these
       modes:

       ·   fsdax: Filesystem-DAX mode is the default mode of a namespace when
           specifying ndctl create-namespace with no options. It creates a
           block device (/dev/pmemX[.Y]) that supports the DAX capabilities of
           Linux filesystems (xfs and ext4 to date). DAX removes the page
           cache from the I/O path and allows mmap(2) to establish direct
           mappings to persistent memory media. The DAX capability enables
           workloads / working-sets that would exceed the capacity of the page
           cache to scale up to the capacity of persistent memory. Workloads
           that fit in page cache or perform bulk data transfers may not see
           benefit from DAX. When in doubt, pick this mode.

       ·   devdax: Device-DAX mode enables similar mmap(2) DAX mapping
           capabilities as Filesystem-DAX. However, instead of a block-device
           that can support a DAX-enabled filesystem, this mode emits a single
           character device file (/dev/daxX.Y). Use this mode to assign
           persistent memory to a virtual-machine, register persistent memory
           for RDMA, or when gigantic mappings are needed.

       ·   sector: Use this mode to host legacy filesystems that do not
           checksum metadata or applications that are not prepared for torn
           sectors after a crash. Expected usage for this mode is for small
           boot volumes. This mode is compatible with other operating systems.

       ·   raw: Raw mode is effectively just a memory disk that does not
           support DAX. Typically this indicates a namespace that was created
           by tooling or another operating system that did not know how to
           create a Linux fsdax or devdax mode namespace. This mode is
           compatible with other operating systems, but again, does not
           support DAX operation.

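       As a rough illustration of the guidance above, the sketch below maps a
       use case onto a mode name and prints the matching command instead of
       running it. pick_mode and its use-case keywords (filesystem, vm, rdma,
       boot) are hypothetical and not part of ndctl; only the --mode= values
       come from this page.

```shell
# pick_mode is a hypothetical helper, not part of ndctl: it maps a
# use case onto one of the mode names documented above.
pick_mode() {
    case "$1" in
        filesystem) echo fsdax ;;   # DAX-capable xfs/ext4 on /dev/pmemX
        vm|rdma)    echo devdax ;;  # character device /dev/daxX.Y
        boot)       echo sector ;;  # legacy filesystems, small boot volumes
        *)          echo fsdax ;;   # "When in doubt, pick this mode"
    esac
}

# Print (rather than run) the resulting command so it can be reviewed first.
echo "ndctl create-namespace --mode=$(pick_mode vm)"
```

       Printing the command keeps the sketch free of side effects; raw mode
       is left out because it is normally encountered rather than chosen.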

EXAMPLES

       Create a maximally sized pmem namespace in fsdax mode (the default)

           ndctl create-namespace

       Convert namespace0.0 to sector mode

           ndctl create-namespace -f -e namespace0.0 --mode=sector

OPTIONS

       -t, --type=
           Create a pmem or blk namespace (subject to available capacity). A
           pmem namespace supports the dax (direct access) capability to
           mmap(2) persistent memory directly into a process address space. A
           blk namespace accesses persistent memory through a
           block-window-aperture. Compared to pmem it supports a traditional
           storage error model (EIO on error rather than a cpu exception on a
           bad memory access), but it does not support dax.

       -m, --mode=

           ·   "raw": expose the namespace capacity directly with limitations.
               Neither a raw pmem namespace nor a raw blk namespace supports
               sector atomicity by default (see "sector" mode below). A raw
               pmem namespace may have limited to no dax support depending on
               the kernel. In other words, operations like direct-I/O
               targeting a dax buffer may fail for a pmem namespace in raw
               mode, or be indirected through a page-cache buffer. See
               "fsdax" and "devdax" mode for dax operation.

           ·   "sector": persistent memory, given that it is byte addressable,
               does not support sector atomicity. The problematic aspect of
               sector tearing is that most applications do not know they have
               an atomic sector update dependency. At least a disk rarely ever
               tears sectors and if it does it almost certainly returns a
               checksum error on access. Persistent memory devices will always
               tear and always silently. Until an application is audited to be
               robust in the presence of sector-tearing, "sector" mode (also
               known as "safe" or "btt" mode) is recommended. This imposes
               some performance overhead and disables the dax capability.

           ·   "fsdax": A pmem namespace in this mode supports dax operation
               with a block-device based filesystem (in previous ndctl
               releases this mode was named "memory" mode). This mode comes at
               the cost of allocating per-page metadata. The capacity can be
               allocated from "System RAM", or from a reserved portion of
               "Persistent Memory" (see the --map= option). NOTE: A filesystem
               that supports DAX is required for dax operation. If the raw
               block device (/dev/pmemX) is used directly without a
               filesystem, it will use the page cache. See "devdax" mode for
               raw device access that supports dax.

           ·   "devdax": The device-dax character device interface is a
               statically allocated / raw access analogue of filesystem-dax
               (in previous ndctl releases this mode was named "dax" mode). It
               allows memory ranges to be mapped without need of an
               intervening filesystem. The device-dax interface is strict,
               precise and predictable. Specifically the interface:

               ·   Guarantees fault granularity with respect to a given page
                   size (4K, 2M, or 1G on x86) set at configuration time.

               ·   Enforces deterministic behavior by being strict about what
                   fault scenarios are supported. I.e. if a device is
                   configured with a 2M alignment, an attempt to fault a 4K
                   aligned offset will result in SIGBUS.

       -s, --size=
           For NVDIMM devices that support namespace labels, set the namespace
           size in bytes. Otherwise it defaults to the maximum size specified
           by platform firmware. This option supports the suffixes "k" or "K"
           for KiB, "m" or "M" for MiB, "g" or "G" for GiB and "t" or "T" for
           TiB.

           For pmem namespaces the size must be a multiple of the
           interleave-width and the namespace alignment (see below).

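           The suffix rules and the multiple-of-alignment requirement can be
           checked before calling ndctl. This is a minimal sketch: to_bytes
           and is_multiple are hypothetical helpers, not part of ndctl, and
           the check covers only the alignment, not the interleave-width,
           which depends on the region.

```shell
# to_bytes is a hypothetical helper (not part of ndctl) that mirrors the
# suffix rules above: k/K = KiB, m/M = MiB, g/G = GiB, t/T = TiB.
to_bytes() {
    num=${1%[kKmMgGtT]}          # strip one trailing suffix, if present
    case "$1" in
        *[kK]) echo $((num * 1024)) ;;
        *[mM]) echo $((num * 1024 * 1024)) ;;
        *[gG]) echo $((num * 1024 * 1024 * 1024)) ;;
        *[tT]) echo $((num * 1024 * 1024 * 1024 * 1024)) ;;
        *)     echo "$num" ;;    # no suffix: already in bytes
    esac
}

# is_multiple SIZE UNIT: succeed if SIZE is a whole multiple of UNIT.
is_multiple() {
    [ $(( $(to_bytes "$1") % $(to_bytes "$2") )) -eq 0 ]
}

to_bytes 4G                                    # prints 4294967296
is_multiple 4G 2M && echo "4G is a multiple of 2M"
is_multiple 3M 2M || echo "3M is not a multiple of 2M"
```

           A size that fails this check against the chosen alignment is not
           a valid pmem namespace size according to the rule above.
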
       -a, --align
           Applications that want to establish dax memory mappings with page
           table entries greater than system base page size (4K on x86) need a
           persistent memory namespace that is sufficiently aligned. For
           "fsdax" and "devdax" mode this defaults to 2M. Note that "devdax"
           mode enforces all mappings to be aligned to this value, i.e. it
           fails unaligned mapping attempts. The "fsdax" alignment setting
           determines the starting alignment of filesystem extents and may
           limit the possible granularities; if a large mapping is not
           possible it will silently fall back to a smaller page size.

       -e, --reconfig=
           Reconfigure an existing namespace (change the mode, sector size,
           etc.). All namespace parameters, save uuid, default to the
           current attributes of the specified namespace. The namespace is
           then re-created with the specified modifications. The uuid is
           refreshed to a new value by default whenever the data layout of a
           namespace is changed; see --uuid= to set a specific uuid.

       -u, --uuid=
           This option is not recommended as a new uuid should be generated
           every time a namespace is (re-)created. For recovery scenarios
           however the uuid may be specified.

       -n, --name=
           For NVDIMM devices that support namespace labels, specify a human
           friendly name for a namespace. This name is available as a device
           attribute for use in udev rules.

       -l, --sector-size
           Specify the logical sector size (LBA size) of the Linux block
           device associated with a namespace.

       -M, --map=
           A pmem namespace in "fsdax" or "devdax" mode requires allocation of
           per-page metadata. The allocation can be drawn from either:

           ·   "mem": typical system memory

           ·   "dev": persistent memory reserved from the namespace

           Given relative capacities of "Persistent Memory" to "System RAM"
           the allocation defaults to reserving space out of the namespace
           directly ("--map=dev"). The overhead is 64 bytes per 4K page (16GB
           per 1TB) on x86.

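           The overhead figure can be reproduced with shell arithmetic. A
           minimal sketch, assuming binary units (1TB taken as 2^40 bytes)
           and a hypothetical helper name, map_overhead, that is not part of
           ndctl:

```shell
# map_overhead BYTES: per-page metadata size for a namespace of BYTES,
# using the x86 figure quoted above (64 bytes for every 4K page).
map_overhead() {
    echo $(( $1 / 4096 * 64 ))
}

TIB=$((1024 * 1024 * 1024 * 1024))
GIB=$((1024 * 1024 * 1024))
echo "$(( $(map_overhead "$TIB") / GIB )) GiB of metadata per TiB"
```

           The same function gives a quick estimate of how much System RAM
           --map=mem would consume for a namespace of a given size.
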
       -f, --force
           Unless this option is specified the reconfigure namespace operation
           will fail if the namespace is presently active. Specifying --force
           causes the namespace to be disabled before the operation is
           attempted. However, if the namespace is mounted then the disable
           namespace and reconfigure namespace operations will be aborted. The
           namespace must be unmounted before being reconfigured.

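           Because --force does not override a mounted namespace, a script
           may want to check the mount table first. The sketch below is one
           possible pre-check, not ndctl behavior: reconfig_cmd is a
           hypothetical helper, namespace0.0 and /dev/pmem0 are example
           names, and it assumes a Linux-style mount table in /proc/mounts.
           It prints the command rather than running it.

```shell
# reconfig_cmd NAMESPACE BLOCKDEV MODE [MOUNT_TABLE]
# Hypothetical helper (not part of ndctl). Refuses to build a command while
# BLOCKDEV still appears as a mounted source in the mount table.
reconfig_cmd() {
    mounts=${4:-/proc/mounts}
    if grep -q "^$2 " "$mounts" 2>/dev/null; then
        echo "error: $2 is mounted, unmount it first" >&2
        return 1
    fi
    echo "ndctl create-namespace -f -e $1 --mode=$3"
}

# Example names only; adjust them to the system at hand.
reconfig_cmd namespace0.0 /dev/pmem0 sector
```

           The optional fourth argument exists so the check can be exercised
           against a saved copy of a mount table.
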
       -L, --autolabel, --no-autolabel
           Legacy NVDIMM devices do not support namespace labels. In that case
           the kernel creates region-sized namespaces that can not be deleted.
           Their mode can be changed, but they can not be resized smaller than
           their parent region. This is termed a "label-less namespace". In
           contrast, NVDIMMs and hypervisors that support the ACPI 6.2 label
           area definition (ACPI 6.2 Section 6.5.10 NVDIMM Label Methods)
           support "labelled namespace" operation.

           ·   There are two cases where the kernel will default to label-less
               operation:

               ·   NVDIMM does not support labels

               ·   The NVDIMM supports labels, but the Label Index Block (see
                   UEFI 2.7) is not present and there is no capacity aliasing
                   between blk and pmem regions.

           ·   In the latter case the configuration can be upgraded to
               labelled operation by writing an index block on all DIMMs in a
               region and re-enabling that region. The autolabel capability of
               ndctl create-namespace --reconfig tries to do this by default
               if it can determine that all DIMM capacity is referenced by the
               namespace being reconfigured. It will otherwise fail to
               autolabel and remain in label-less mode if it finds a DIMM
               contributes capacity to more than one region. This check
               prevents inadvertent data loss if that other region is in
               active use. The --autolabel option is implied by default; the
               --no-autolabel option can be used to disable this behavior.
               When automatic labeling fails and labelled operation is still
               desired, the safety policy can be bypassed by the following
               commands. Note that all data on all regions is forfeited by
               running these commands:

                   ndctl disable-region all
                   ndctl init-labels all
                   ndctl enable-region all

       -v, --verbose
           Emit debug messages for the namespace creation process.

       -r, --region=
           A 'regionX' device name, or a region id number. The keyword 'all'
           can be specified to carry out the operation on every region in the
           system, optionally filtered by bus id (see --bus= option).

       -b, --bus=
           Enforce that the operation only be carried out on devices that are
           attached to the given bus, where bus can be a provider name or a
           bus id number.

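           Several of the options above are commonly combined. The sketch
           below assembles such a command line and prints it for review.
           build_cmd is a hypothetical wrapper, not part of ndctl, and
           region0, 4G and scratch are placeholder values; --size= and
           --name= only apply to NVDIMMs that support namespace labels.

```shell
# build_cmd REGION MODE SIZE NAME
# Hypothetical wrapper: prints a create-namespace command line built from
# the --region=, --mode=, --size= and --name= options documented above.
build_cmd() {
    region=$1 mode=$2 size=$3 name=$4
    echo "ndctl create-namespace --region=$region --mode=$mode --size=$size --name=$name"
}

build_cmd region0 fsdax 4G scratch
```
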
COPYRIGHT

       Copyright (c) 2016 - 2019, Intel Corporation. License GPLv2: GNU GPL
       version 2 <http://gnu.org/licenses/gpl.html>. This is free software:
       you are free to change and redistribute it. There is NO WARRANTY, to
       the extent permitted by law.

SEE ALSO

       ndctl-zero-labels(1), ndctl-init-labels(1), ndctl-disable-namespace(1),
       ndctl-enable-namespace(1), UEFI NVDIMM Label Protocol
       <http://www.uefi.org/sites/default/files/resources/UEFI_Spec_2_7.pdf>,
       Linux Persistent Memory Wiki <https://nvdimm.wiki.kernel.org>

ndctl                             2019-05-10         NDCTL-CREATE-NAMESPACE(1)