POOLSET(5) PMDK Programmer's Manual POOLSET(5)

NAME
poolset - persistent memory pool configuration file format

SYNOPSIS
mypool.set

DESCRIPTION
Depending on the configuration of the system, the available
non-volatile memory space may be divided into multiple memory devices.
In such a case, the maximum size of the transactional object store
could be limited by the capacity of a single memory device. Therefore,
libpmemobj(7), libpmemblk(7) and libpmemlog(7) allow building object
stores spanning multiple memory devices by creating persistent memory
pools consisting of multiple files, where each part of such a pool set
may be stored on a different pmem-aware filesystem.

To improve reliability and eliminate a single point of failure,
libpmemobj(7) also allows all the data written to a persistent memory
pool to be copied to local or remote pool replicas, thereby providing
a backup for the persistent memory pool by producing a mirrored pool
set. In practice, the pool replicas may be considered binary copies of
the “master” pool set. Data replication is not supported in
libpmemblk(7) and libpmemlog(7).

The set file for each type of pool is a plain text file. Lines in the
file are formatted as follows:

· The first line of the file must be the literal string “PMEMPOOLSET”.

· The pool parts are specified, one per line, in the format:

      size pathname

· Replica sections, if any, start with the literal string “REPLICA”.
  See REPLICAS, below, for further details.

· Pool set options, if any, start with the literal string “OPTION”.
  See POOL SET OPTIONS, below, for details.

· Lines starting with “#” are considered comments and are ignored.

The size must be compliant with the format specified in IEC 80000-13,
IEEE 1541 or the Metric Interchange Format. These standards accept SI
units with an obligatory “B” suffix (kB, MB, GB, ...; multiples of
1000) and IEC units with an optional “iB” suffix (KiB, MiB, GiB, ...,
or K, M, G, ...; multiples of 1024).
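
The suffix rules above can be illustrated with a small Python sketch. It
is not the parser used by PMDK; the function name parse_size, the set of
prefixes beyond K, M and G, and the treatment of a bare number as a byte
count are assumptions made for illustration only.

```python
import re

# Prefix letters in increasing order of magnitude. Everything past K/M/G
# is an assumption of this sketch; the text above only lists "...".
_PREFIXES = "KMGTPE"
_SIZE_RE = re.compile(r"^(\d+)([kKMGTPE]?)(iB|B)?$")


def parse_size(text):
    """Convert a size string such as "100G" or "2MB" to bytes."""
    match = _SIZE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, prefix, suffix = match.groups()
    if not prefix:
        if suffix == "iB":
            raise ValueError(f"unrecognized size: {text!r}")
        # Assumption: bare digits, or digits followed by "B", are bytes.
        return int(number)
    power = _PREFIXES.index(prefix.upper()) + 1
    # A bare "B" suffix selects decimal units; "iB" or no suffix selects
    # binary units.
    base = 1000 if suffix == "B" else 1024
    return int(number) * base ** power
```

With this sketch, parse_size("100G") and parse_size("100GiB") both yield
100 * 1024**3, while parse_size("100GB") yields 100 * 1000**3. The
special value “AUTO” is deliberately not handled here and raises
ValueError.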

pathname must be an absolute pathname.

The pathname of a part can point to a Device DAX. Device DAX is the
device-centric analogue of Filesystem DAX. It allows memory ranges to
be allocated and mapped without the need for an intervening file
system.

Pools created on Device DAX have additional options and restrictions:

· The size may be set to “AUTO”, in which case the size of the device
  will be automatically resolved at pool creation time.

· To concatenate more than one Device DAX device into a single pool
  set, the configured internal alignment of the devices must be 4KiB,
  unless the SINGLEHDR or NOHDRS option is used in the pool set file.
  See POOL SET OPTIONS, below, for details.

Please see ndctl-create-namespace(1) for more information on Device
DAX, including how to configure the desired alignment.

The minimum file size of each part of the pool set is defined as
follows:

· For block pools, as PMEMBLK_MIN_PART in <libpmemblk.h>

· For object pools, as PMEMOBJ_MIN_PART in <libpmemobj.h>

· For log pools, as PMEMLOG_MIN_PART in <libpmemlog.h>

The net pool size of the pool set is equal to:

    net_pool_size = sum_over_all_parts(page_aligned_part_size - 4KiB) + 4KiB

where

    page_aligned_part_size = part_size & ~(page_size - 1)

Note that the page size is OS specific. For more information please
see sysconf(3).
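
As a worked illustration of the formula above, the following Python
sketch computes the net pool size for a list of part sizes. The function
names and the default page size of 4096 bytes are assumptions of this
sketch; on a real system the page size could be queried, for example
with os.sysconf("SC_PAGE_SIZE").

```python
HEADER_SIZE = 4096  # the 4KiB term in the formula above


def page_align_down(part_size, page_size):
    """Round part_size down to a multiple of page_size."""
    return part_size & ~(page_size - 1)


def net_pool_size(part_sizes, page_size=4096):
    """Apply the net pool size formula to a list of part sizes in bytes."""
    if page_size <= 0 or page_size & (page_size - 1):
        raise ValueError("page_size must be a power of two")
    aligned = [page_align_down(size, page_size) for size in part_sizes]
    return sum(size - HEADER_SIZE for size in aligned) + HEADER_SIZE
```

For three parts of 100, 200 and 400 GiB, which are already page
aligned, the result is 700 GiB minus 2 * 4096 bytes: each part gives up
4KiB, and 4KiB is added back once.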

The minimum net pool size of a pool set is defined as follows:

· For block pools, as PMEMBLK_MIN_POOL in <libpmemblk.h>

· For object pools, as PMEMOBJ_MIN_POOL in <libpmemobj.h>

· For log pools, as PMEMLOG_MIN_POOL in <libpmemlog.h>

Here is an example “mypool.set” file:

    PMEMPOOLSET
    OPTION NOHDRS
    100G /mountpoint0/myfile.part0
    200G /mountpoint1/myfile.part1
    400G /mountpoint2/myfile.part2

The files in the set may be created by running one of the following
commands. To create a block pool:

    $ pmempool create blk <bsize> mypool.set

To create a log pool:

    $ pmempool create log mypool.set

REPLICAS
Sections defining replica sets are optional. There may be multiple
replica sections.

Local replica sections begin with a line containing only the literal
string “REPLICA”, followed by one or more pool part lines as described
above.

Remote replica sections consist of the “REPLICA” keyword, followed on
the same line by the address of a remote host and a relative path to a
remote pool set file:

    REPLICA [<user>@]<hostname> [<relative-path>/]<remote-pool-set-file>

· hostname must be in the format recognized by the ssh(1) remote login
  client

· the pathname of the remote pool set file is relative to the root
  config directory on the target node - see librpmem(7)

There are no other lines in the remote replica section - the REPLICA
line defines a remote replica entirely.

Here is an example “myobjpool.set” file with replicas:

    PMEMPOOLSET
    100G /mountpoint0/myfile.part0
    200G /mountpoint1/myfile.part1
    400G /mountpoint2/myfile.part2

    # local replica
    REPLICA
    500G /mountpoint3/mymirror.part0
    200G /mountpoint4/mymirror.part1

    # remote replica
    REPLICA user@example.com remote-objpool.set

The files in the object pool set may be created by running the
following command:

    $ pmempool create --layout="mylayout" obj myobjpool.set

A remote replica cannot have replicas of its own, i.e. a remote pool
set file cannot define any replicas.
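
To make the line grammar concrete, here is a minimal Python sketch that
reads a set file into its options and replicas. It is an illustration
only and not the parser used by PMDK. The names Replica and
parse_poolset, the decision to skip blank lines, and the assumption that
pathnames contain no whitespace are all choices made for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

EXAMPLE = """\
PMEMPOOLSET
100G /mountpoint0/myfile.part0
200G /mountpoint1/myfile.part1
400G /mountpoint2/myfile.part2

# local replica
REPLICA
500G /mountpoint3/mymirror.part0
200G /mountpoint4/mymirror.part1

# remote replica
REPLICA user@example.com remote-objpool.set
"""


@dataclass
class Replica:
    parts: List[Tuple[str, str]] = field(default_factory=list)  # (size, pathname)
    remote: Optional[Tuple[str, str]] = None  # (target, set file) if remote


def parse_poolset(text):
    """Return (options, replicas); replicas[0] is the master replica."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "PMEMPOOLSET":
        raise ValueError('the first line must be "PMEMPOOLSET"')
    options = []
    replicas = [Replica()]
    for raw in lines[1:]:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # comments; skipping blank lines is an assumption
        fields = line.split()
        keyword = fields[0]
        if keyword == "OPTION":
            if len(fields) != 2:
                raise ValueError(f"malformed option line: {line!r}")
            options.append(fields[1])
        elif keyword == "REPLICA":
            if len(fields) == 1:
                replicas.append(Replica())  # local replica section
            elif len(fields) == 3:
                replicas.append(Replica(remote=(fields[1], fields[2])))
            else:
                raise ValueError(f"malformed replica line: {line!r}")
        else:
            if len(fields) != 2 or not fields[1].startswith("/"):
                raise ValueError(f"malformed part line: {line!r}")
            if replicas[-1].remote is not None:
                raise ValueError("a remote replica has no part lines")
            replicas[-1].parts.append((fields[0], fields[1]))
    return options, replicas
```

Calling parse_poolset(EXAMPLE) on the “myobjpool.set” contents shown
earlier yields no options and three replicas: the master with three
parts, a local replica with two parts, and a remote replica that carries
only its target and set file name. Sizes are kept as strings here;
converting and validating them is a separate step.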

POOL SET OPTIONS
Pool set options can appear anywhere after the line with the
“PMEMPOOLSET” string. A pool set file can contain several pool set
options. The following options are supported:

· SINGLEHDR

· NOHDRS

If the SINGLEHDR option is used, only the first part in each replica
contains the pool part internal metadata. In that case the effective
size of a replica is the sum of the sizes of all its part files,
decreased once by 4096 bytes.

The NOHDRS option can appear only in the remote pool set file, when
librpmem does not serve as a means of replication for a libpmemobj
pool. In that case none of the pool parts contains internal metadata.
The effective size of such a replica is the sum of the sizes of all its
part files.

Options SINGLEHDR and NOHDRS are mutually exclusive. If both are
specified in a pool set file, creating or opening the pool will fail
with an error.
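
The effect of the two options on the effective replica size can be
sketched in Python as follows. The function name is hypothetical, and
the case with neither option is an assumption: it reuses the net pool
size formula given earlier and presumes part sizes that are already
page aligned.

```python
HEADER_SIZE = 4096  # bytes of internal metadata per header


def effective_replica_size(part_sizes, options=()):
    """Effective size in bytes of one replica under the given options."""
    opts = set(options)
    if {"SINGLEHDR", "NOHDRS"} <= opts:
        raise ValueError("SINGLEHDR and NOHDRS are mutually exclusive")
    total = sum(part_sizes)
    if "NOHDRS" in opts:
        return total  # no part contains internal metadata
    if "SINGLEHDR" in opts:
        return total - HEADER_SIZE  # only the first part has a header
    # Assumption: without options, every part gives up one header and
    # one header's worth is added back, as in the net pool size formula.
    return sum(size - HEADER_SIZE for size in part_sizes) + HEADER_SIZE
```

For two parts of 100 and 200 GiB, this gives 300 GiB with NOHDRS and
300 GiB minus 4096 bytes with SINGLEHDR. Passing both options raises
ValueError, mirroring the failure described above.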

When using the SINGLEHDR or NOHDRS option, one can concatenate more
than one Device DAX device with any internal alignment in one replica.

The SINGLEHDR option concerns only replicas that are local to the pool
set file. That is, if one wants to create a pool set with the SINGLEHDR
option and with remote replicas, one has to add this option to the
local pool set file as well as to every remote pool set file.

Using the SINGLEHDR and NOHDRS options has important implications for
data integrity checking and recoverability in case of pool set damage.
See the pmempool_sync() API for more information about pool set
recovery.

DIRECTORIES
Providing a directory as a part’s pathname allows the pool to
dynamically create files and consequently removes the user-imposed
limit on the size of the pool.

The size argument of a part in a directory poolset becomes the size of
the address space reservation required for the pool. In other words,
the size argument is the maximum theoretical size of the mapping. This
value can be freely increased between instances of the application, but
decreasing it below the real required space will result in an error
when attempting to open the pool.

The directory must NOT contain user-created files with the extension
.pmem, otherwise the behavior is undefined. If a file created by the
library within the directory is altered in any way (resized, renamed),
the behavior is undefined.

A directory poolset must exclusively use directories to specify paths -
combining files and directories will result in an error. A single
replica can consist of one or more directories. If there are multiple
directories, the address space reservation is equal to the sum of the
sizes.
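
The two rules in the preceding paragraph (no mixing of files and
directories, and a reservation equal to the sum of the sizes) can be
sketched in Python. The function name is hypothetical, the parts are
assumed to be already parsed into (size in bytes, pathname) pairs, and
any path that is not an existing directory is treated as a file.

```python
import os


def directory_reservation(parts):
    """Return the address space reservation of a directory replica.

    parts is a list of (size_in_bytes, pathname) pairs.
    """
    kinds = {os.path.isdir(path) for _, path in parts}
    if kinds == {True, False}:
        raise ValueError("files and directories cannot be combined")
    if kinds != {True}:
        raise ValueError("not a directory poolset")
    return sum(size for size, _ in parts)
```

For a replica made of two directories with sizes of 10 GiB and 20 GiB,
the sketch returns 30 GiB; replacing either directory with a regular
file raises ValueError.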

The order in which the files are created is unspecified, but the
library will try to maintain equal usage of the directories.

By default, pools grow in 128 megabyte increments.

Only poolsets with the SINGLEHDR option can safely use directories.

NOTES
Creation of all the parts of the pool set and the associated replica
sets can be done with the pmemobj_create(3), pmemblk_create(3) or
pmemlog_create(3) function, or by using the pmempool(1) utility.

Restoring data from a local or remote replica can be done by using the
pmempool-sync(1) command or the pmempool_sync() API from the
libpmempool(7) library.

Modifications of a pool set file configuration can be done by using the
pmempool-transform(1) command or the pmempool_transform() API from the
libpmempool(7) library.

When creating a pool set consisting of multiple files, or when creating
a replicated pool set, the path argument passed to pmemobj_create(3),
pmemblk_create(3) or pmemlog_create(3) must point to the special set
file that defines the pool layout and the location of all the parts of
the pool set.

When opening a pool set consisting of multiple files, or when opening a
replicated pool set, the path argument passed to pmemobj_open(3),
pmemblk_open(3) or pmemlog_open(3) must point to the same set file that
was used for pool set creation.

SEE ALSO
ndctl-create-namespace(1), pmemblk_create(3), pmemlog_create(3),
pmemobj_create(3), sysconf(3), libpmemblk(7), libpmemlog(7),
libpmemobj(7) and <https://pmem.io>

PMDK - poolset API version 1.0 2020-07-03 POOLSET(5)