LIBPMEM2(7) PMDK Programmer's Manual LIBPMEM2(7)

NAME
libpmem2 - persistent memory support library

SYNOPSIS
#include <libpmem2.h>
cc ... -lpmem2

DESCRIPTION
libpmem2 provides low-level persistent memory (pmem) support for
applications using direct access storage (DAX), which is storage that
supports load/store access without paging blocks from a block storage
device. Some types of non-volatile memory DIMMs (NVDIMMs) provide this
type of byte-addressable access to storage. A persistent memory aware
file system is typically used to expose the direct access to
applications. Memory mapping a file from this type of file system
results in load/store, non-paged access to pmem.

This library is for applications that use persistent memory directly,
without the help of any library-supplied transactions or memory
allocation. Higher-level libraries that currently build on libpmem (the
predecessor of libpmem2) are available and are recommended for most
applications:

• libpmemobj(7), a general-use persistent memory API, providing memory
  allocation and transactional operations on variable-sized objects.

• libpmemblk(7), providing pmem-resident arrays of fixed-sized blocks
  with atomic updates.

• libpmemlog(7), providing a pmem-resident log file.

The libpmem2 library provides a comprehensive set of functions for
robust use of persistent memory. It relies on three core concepts:
struct pmem2_source source, struct pmem2_config config and struct
pmem2_map map:

• source - an object describing the data source for mapping. The data
  source can be a file descriptor, a file handle, or an anonymous
  mapping. The functions dedicated to creating a source are
  pmem2_source_from_fd(3), pmem2_source_from_handle(3) and
  pmem2_source_from_anon(3).

• config - an object containing parameters that are used to create a
  mapping from a source. The configuration structure must always be
  provided to create a mapping, but the only required parameter to set
  in the config is granularity. The granularity should be set using the
  dedicated libpmem2 function
  pmem2_config_set_required_store_granularity(3), which defines the
  maximum permitted granularity requested by the user. For more
  information about the granularity concept, read the GRANULARITY
  section below.

  In addition to the granularity setting, libpmem2 provides multiple
  optional functions to configure the target mapping, e.g.,
  pmem2_config_set_length(3) to set the length that will be used for
  the mapping, pmem2_config_set_offset(3) to map the contents starting
  from the specified location of the source, or
  pmem2_config_set_sharing(3), which defines the behavior and
  visibility of writes to the mapping's pages.

• map - an object created by pmem2_map_new(3) using source and config
  as input parameters. The map structure can then be used to operate
  directly on the created mapping through its associated set of
  functions: pmem2_map_get_address(3), pmem2_map_get_size(3) and
  pmem2_map_get_store_granularity(3), which return the address, size
  and effective granularity of the mapping.

In addition to the basic functionality of managing the virtual address
mapping, libpmem2 also provides optimized functions for modifying the
mapped data. This includes data flushing as well as memory copying.

To get the proper function for data flushing, use
pmem2_get_flush_fn(3), pmem2_get_persist_fn(3) or
pmem2_get_drain_fn(3). To get the proper function for copying to
persistent memory, use the map getters pmem2_get_memcpy_fn(3),
pmem2_get_memset_fn(3) and pmem2_get_memmove_fn(3).

The libpmem2 API also provides support for bad block and unsafe
shutdown state handling.

To read or clear bad blocks, the following functions are provided:
pmem2_badblock_context_new(3), pmem2_badblock_context_delete(3),
pmem2_badblock_next(3) and pmem2_badblock_clear(3).

To handle unsafe shutdown in the application, the following functions
are provided: pmem2_source_device_id(3) and
pmem2_source_device_usc(3). More detailed information about unsafe
shutdown detection and the unsafe shutdown count can be found in the
libpmem2_unsafe_shutdown(7) man page.

GRANULARITY
The libpmem2 library introduces the concept of granularity, through
which you may easily distinguish between different levels of storage
performance capabilities available to the application as related to
the power-fail protected domain. The way data reaches this protected
domain differs based on the platform and storage device capabilities.

Traditional block storage devices (SSD, HDD) must use system API calls
such as msync() or fsync() on Linux, or FlushFileBuffers() or
FlushViewOfFile() on Windows, to write data reliably. Invoking these
functions flushes the data to the medium with page granularity. In the
libpmem2 library, this type of flushing behavior is called
PMEM2_GRANULARITY_PAGE.

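The page-granularity path described above can be illustrated without libpmem2 at all. The sketch below uses only POSIX calls on Linux; the helper name write_with_msync is hypothetical, and the file is resized to fit the text, so it should not be pointed at a file whose contents matter.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * write_with_msync -- hypothetical helper: store a string at the start of a
 * file through a shared mapping and flush it with msync(2).
 * Returns 0 on success, -1 on failure.
 */
static int
write_with_msync(const char *path, const char *text)
{
	size_t len = strlen(text) + 1;
	int fd = open(path, O_RDWR | O_CREAT, 0644);
	if (fd < 0)
		return -1;

	/* resize the file so that it backs the whole mapping */
	if (ftruncate(fd, (off_t)len) != 0) {
		close(fd);
		return -1;
	}

	char *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,
			fd, 0);
	close(fd);	/* the mapping stays valid after close */
	if (addr == MAP_FAILED)
		return -1;

	memcpy(addr, text, len);

	/* the kernel writes back whole pages, however small the store was */
	int ret = msync(addr, len, MS_SYNC);
	munmap(addr, len);
	return ret;
}
```
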
In systems with persistent memory support, a power-fail protected
domain may cover different sets of resources: either the memory
controller alone, or the memory controller and the CPU caches. For this
reason, libpmem2 distinguishes two types of granularity for persistent
memory: PMEM2_GRANULARITY_CACHE_LINE and PMEM2_GRANULARITY_BYTE.

If the power-fail protected domain covers only the memory controller,
the appropriate CPU cache lines must be flushed for the data to be
considered persistent. This granularity type is called
PMEM2_GRANULARITY_CACHE_LINE. Depending on the architecture, there are
different types of machine instructions for flushing cache lines (e.g.,
CLWB, CLFLUSHOPT, CLFLUSH for the Intel x86_64 architecture). Usually,
to ensure the ordering of stores, such instructions must be followed by
a barrier (e.g., SFENCE).

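What flushing with cache-line granularity involves can be sketched by hand. This is not how libpmem2 is implemented; it is an illustration that assumes a 64-byte cache line, uses the CLFLUSH and SFENCE compiler intrinsics on x86_64, and compiles to a plain line counter on other architectures. The name flush_range is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__x86_64__)
#include <emmintrin.h>	/* _mm_clflush, _mm_sfence */
#endif

#define CACHE_LINE_SIZE 64	/* assumption: typical for x86_64 */

/*
 * flush_range -- hypothetical helper: flush every cache line that overlaps
 * [addr, addr + len) and then issue a store barrier.
 * Returns the number of cache lines visited.
 */
static size_t
flush_range(const void *addr, size_t len)
{
	if (len == 0)
		return 0;

	/* round the start down to a cache-line boundary */
	uintptr_t start = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE_SIZE - 1);
	uintptr_t end = (uintptr_t)addr + len;
	size_t lines = 0;

	for (uintptr_t p = start; p < end; p += CACHE_LINE_SIZE) {
#if defined(__x86_64__)
		_mm_clflush((const void *)p);
#endif
		lines++;
	}

#if defined(__x86_64__)
	/* barrier after the flushes, as described above */
	_mm_sfence();
#endif
	return lines;
}
```

A store that straddles a cache-line boundary touches two lines, which is why the start address is aligned down before the loop.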
The third type of granularity, PMEM2_GRANULARITY_BYTE, applies to
platforms where the power-fail protected domain covers both the memory
controller and the CPU caches. In such cases, cache flush instructions
are no longer needed, and the platform itself guarantees the
persistence of data. Barriers might still be required for ordering.

The library declares these granularity levels in the pmem2_granularity
enum, which the application must set in pmem2_config to the appropriate
level for a mapping to succeed. The software should set this config
parameter to a value that most accurately represents the target
hardware characteristics and the storage patterns of the application.
For example, a database storage engine that operates on large logical
pages that reside either on SSDs or PMEM should set this value to
PMEM2_GRANULARITY_PAGE. The library will create mappings where the new
map granularity is lower than or equal to the requested one. For
example, a mapping with PMEM2_GRANULARITY_CACHE_LINE can be created for
the required granularity PMEM2_GRANULARITY_PAGE, but not vice versa.

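The "lower than or equal to" rule above amounts to a single comparison once the levels are ordered from finest to coarsest. The sketch below uses a local, hypothetical enum rather than the one from libpmem2.h so that it stands alone; the names are illustrative only.

```c
#include <stdbool.h>

/* hypothetical stand-in for the library enum, ordered finest to coarsest */
enum granularity {
	GRAN_BYTE,
	GRAN_CACHE_LINE,
	GRAN_PAGE
};

/*
 * granularity_acceptable -- true when a mapping whose effective granularity
 * is `available` satisfies a request for at most `required`
 */
static bool
granularity_acceptable(enum granularity available, enum granularity required)
{
	return available <= required;
}
```
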
CAVEATS
libpmem2 relies on the library destructor being called from the main
thread. For this reason, all functions that might trigger destruction
(e.g., dlclose(3)) should be called in the main thread. Otherwise, some
of the resources associated with that thread might not be cleaned up
properly.

ENVIRONMENT
libpmem2 can change its default behavior based on the following
environment variables. These are primarily intended for testing and are
generally not required.

• PMEM2_FORCE_GRANULARITY=val

Setting this environment variable to val forces libpmem2 to use the
persist method specific to the forced granularity and to skip the
granularity autodetection mechanism. The concept of granularity is
described in the GRANULARITY section above. This variable is intended
for use during library testing.

The val argument accepts the following text values:

  • BYTE - force byte granularity.

  • CACHE_LINE - force cache line granularity.

  • PAGE - force page granularity.

The granularity values listed above are case-insensitive.

NOTE: The value of PMEM2_FORCE_GRANULARITY is not queried (and cached)
at library initialization time, but read during each pmem2_map_new(3)
call.

This means that PMEM2_FORCE_GRANULARITY may still be set or modified by
the program until the first attempt to map a file.

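The case-insensitive matching and the read-at-call-time behavior described above might look like the following sketch. It is not the library's own parser; the enum and function names are hypothetical, and because the text above does not say how unrecognized values are handled, the sketch simply reports them separately.

```c
#include <stdlib.h>
#include <strings.h>	/* strcasecmp (POSIX) */

/* hypothetical result type for this illustration */
enum forced_granularity {
	FORCED_NONE,		/* variable not set */
	FORCED_BYTE,
	FORCED_CACHE_LINE,
	FORCED_PAGE,
	FORCED_INVALID		/* set, but not one of the listed values */
};

/* parse_forced_granularity -- match val against the accepted text values */
static enum forced_granularity
parse_forced_granularity(const char *val)
{
	if (val == NULL)
		return FORCED_NONE;
	if (strcasecmp(val, "BYTE") == 0)
		return FORCED_BYTE;
	if (strcasecmp(val, "CACHE_LINE") == 0)
		return FORCED_CACHE_LINE;
	if (strcasecmp(val, "PAGE") == 0)
		return FORCED_PAGE;
	return FORCED_INVALID;
}

/*
 * forced_granularity_now -- read the variable at call time instead of
 * caching it, so a change made by the program is seen on the next call
 */
static enum forced_granularity
forced_granularity_now(void)
{
	return parse_forced_granularity(getenv("PMEM2_FORCE_GRANULARITY"));
}
```
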
• PMEM_NO_CLWB=1

Setting this environment variable to 1 forces libpmem2 to never issue
the CLWB instruction on Intel hardware, falling back to other cache
flush instructions on that hardware instead (CLFLUSHOPT or CLFLUSH).
Without this setting, libpmem2 will always use the CLWB instruction for
flushing processor caches on platforms that support this instruction.
This variable is intended for use during library testing, but may be
required for some rare cases when using CLWB has a negative impact on
performance.

• PMEM_NO_CLFLUSHOPT=1

Setting this environment variable to 1 forces libpmem2 to never issue
the CLFLUSHOPT instruction on Intel hardware, falling back to the
CLFLUSH instruction instead. Without this environment variable,
libpmem2 will always use the CLFLUSHOPT instruction for flushing
processor caches on platforms that support the instruction, but where
CLWB is not available. This variable is intended for use during library
testing.

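The fallback order implied by the two variables above can be summarized as a small decision function. This is a sketch of the described behavior rather than the library's code: all names are hypothetical, and CPU feature detection is left to the caller as plain boolean inputs.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* hypothetical labels for the three x86_64 flush instructions */
enum flush_insn {
	FLUSH_CLWB,
	FLUSH_CLFLUSHOPT,
	FLUSH_CLFLUSH
};

/* env_flag_set -- true when the named variable is set to exactly "1" */
static bool
env_flag_set(const char *name)
{
	const char *val = getenv(name);

	return val != NULL && strcmp(val, "1") == 0;
}

/*
 * select_flush_insn -- prefer CLWB, then CLFLUSHOPT, then CLFLUSH, skipping
 * any instruction the CPU lacks or the environment disables
 */
static enum flush_insn
select_flush_insn(bool cpu_has_clwb, bool cpu_has_clflushopt)
{
	if (cpu_has_clwb && !env_flag_set("PMEM_NO_CLWB"))
		return FLUSH_CLWB;
	if (cpu_has_clflushopt && !env_flag_set("PMEM_NO_CLFLUSHOPT"))
		return FLUSH_CLFLUSHOPT;
	return FLUSH_CLFLUSH;
}
```
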
• PMEM_NO_MOVNT=1

Setting this environment variable to 1 forces libpmem2 to never use the
non-temporal move instructions on Intel hardware. Without this
environment variable, libpmem2 will use the non-temporal instructions
for copying larger ranges to persistent memory on platforms that
support these instructions. This variable is intended for use during
library testing.

• PMEM_MOVNT_THRESHOLD=val

This environment variable allows overriding the minimum length of the
pmem2_memmove_fn operations for which libpmem2 uses non-temporal move
instructions. Setting this environment variable to 0 forces libpmem2 to
always use the non-temporal move instructions if available. It has no
effect if PMEM_NO_MOVNT is set to 1. This variable is intended for use
during library testing.

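How the threshold and PMEM_NO_MOVNT interact can be sketched as below. The function names are hypothetical. The library's built-in default threshold is not stated above, so the caller supplies a fallback value, and treating malformed input as "use the fallback" is an assumption of this sketch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * movnt_threshold_from_env -- read PMEM_MOVNT_THRESHOLD as a decimal number;
 * return `fallback` when the variable is unset or not a plain number
 */
static size_t
movnt_threshold_from_env(size_t fallback)
{
	const char *val = getenv("PMEM_MOVNT_THRESHOLD");
	char *end;
	unsigned long long parsed;

	if (val == NULL || val[0] < '0' || val[0] > '9')
		return fallback;

	errno = 0;
	parsed = strtoull(val, &end, 10);
	if (errno != 0 || *end != '\0')
		return fallback;
	return (size_t)parsed;
}

/*
 * should_use_movnt -- non-temporal stores are used only when available, not
 * disabled, and the copy is at least `threshold` bytes long
 */
static bool
should_use_movnt(size_t len, size_t threshold, bool cpu_has_movnt,
		bool no_movnt)
{
	if (!cpu_has_movnt || no_movnt)
		return false;
	return len >= threshold;
}
```
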
DEBUGGING
Two versions of libpmem2 are typically available on a development
system. The normal version, accessed when a program is linked using the
-lpmem2 option, is optimized for performance. That version skips checks
that impact performance and never logs any trace information or
performs any run-time assertions.

A second version of libpmem2, accessed when a program uses the
libraries under /usr/lib/pmdk_debug, contains run-time assertions and
trace points. The typical way to access the debug version is to set the
environment variable LD_LIBRARY_PATH to /usr/lib/pmdk_debug or
/usr/lib64/pmdk_debug, as appropriate. Debugging output is controlled
using the following environment variables. These variables have no
effect on the non-debug version of the library.

• PMEM2_LOG_LEVEL

The value of PMEM2_LOG_LEVEL enables trace points in the debug version
of the library, as follows:

  • 0 - This is the default level when PMEM2_LOG_LEVEL is not set. No
    log messages are emitted at this level.

  • 1 - Additional details on any errors detected are logged, in
    addition to returning the errno-based errors as usual. The same
    information may be retrieved using pmem2_errormsg().

  • 2 - A trace of basic operations is logged.

  • 3 - Enables very verbose function call tracing in the library.

  • 4 - Enables voluminous and fairly obscure tracing information that
    is likely only useful to the libpmem2 developers.

Unless PMEM2_LOG_FILE is set, debugging output is written to stderr.

• PMEM2_LOG_FILE

Specifies the name of a file where all logging information should be
written. If the last character in the name is “-”, the PID of the
current process will be appended to the file name when the log file is
created. If PMEM2_LOG_FILE is not set, output is written to stderr.

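The PID-suffix rule for PMEM2_LOG_FILE can be illustrated with a small helper. This is a sketch of the naming rule only, not the library's logging code, and the name build_log_path is hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * build_log_path -- hypothetical helper: copy `name` into `out`, appending
 * the PID when the name ends with '-'.
 * Returns 0 on success, -1 if the result does not fit in `out`.
 */
static int
build_log_path(const char *name, pid_t pid, char *out, size_t out_len)
{
	size_t n = strlen(name);
	int ret;

	if (n > 0 && name[n - 1] == '-')
		ret = snprintf(out, out_len, "%s%ld", name, (long)pid);
	else
		ret = snprintf(out, out_len, "%s", name);

	/* snprintf reports the length it needed; reject truncated results */
	return (ret < 0 || (size_t)ret >= out_len) ? -1 : 0;
}
```

With this rule, a value such as /tmp/pmem2.log- gives each process its own file, which keeps the traces of concurrently running programs apart.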
EXAMPLE
The following example uses libpmem2 to flush changes made to raw,
memory-mapped persistent memory.

WARNING: There is nothing transactional about the persist call obtained
from pmem2_get_persist_fn(3) in this example. Interrupting the program
may result in a partial write to pmem. Use a transactional library such
as libpmemobj(7) to avoid torn updates.

This example is described in detail at https://pmem.io/pmdk/libpmem2/.

ACKNOWLEDGEMENTS
libpmem2 builds on the persistent memory programming model recommended
by the SNIA NVM Programming Technical Work Group:
<https://snia.org/nvmp>

SEE ALSO
FlushFileBuffers(), fsync(2), msync(2), pmem2_config_set_length(3),
pmem2_config_set_offset(3),
pmem2_config_set_required_store_granularity(3),
pmem2_config_set_sharing(3), pmem2_get_drain_fn(3),
pmem2_get_flush_fn(3), pmem2_get_memcpy_fn(3), pmem2_get_memmove_fn(3),
pmem2_get_memset_fn(3), pmem2_get_persist_fn(3),
pmem2_map_get_store_granularity(3), pmem2_map_new(3),
pmem2_source_from_anon(3), pmem2_source_from_fd(3),
pmem2_source_from_handle(3), libpmem2_unsafe_shutdown(7),
libpmemblk(7), libpmemlog(7), libpmemobj(7) and <https://pmem.io>

PMDK - pmem2 API version 1.0 2022-05-24 LIBPMEM2(7)