() PMDK Programmer's Manual ()

pmemobj_ctl_get(), pmemobj_ctl_set(), pmemobj_ctl_exec() - Query and
modify libpmemobj internal behavior (EXPERIMENTAL)

#include <libpmemobj.h>

int pmemobj_ctl_get(PMEMobjpool *pop, const char *name, void *arg); (EXPERIMENTAL)
int pmemobj_ctl_set(PMEMobjpool *pop, const char *name, void *arg); (EXPERIMENTAL)
int pmemobj_ctl_exec(PMEMobjpool *pop, const char *name, void *arg); (EXPERIMENTAL)

The pmemobj_ctl_get(), pmemobj_ctl_set() and pmemobj_ctl_exec()
functions provide a uniform interface for querying and modifying the
internal behavior of libpmemobj(7) through the control (CTL) namespace.

The name argument specifies an entry point as defined in the CTL
namespace specification. The entry point description specifies whether
the extra arg is required. Those two parameters together create a CTL
query. The functions and the entry points are thread-safe unless
indicated otherwise below. If there are special conditions for calling
an entry point, they are explicitly stated in its description. The
functions propagate the return value of the entry point. If either name
or arg is invalid, -1 is returned.

If the provided ctl query is valid, the CTL functions will always
return 0 on success and -1 on failure, unless otherwise specified in the
entry point description.

See more in pmem_ctl(5) man page.

prefault.at_create | rw | global | int | int | - | boolean

If set, every page of the pool will be touched and written to when the
pool is created, in order to trigger page allocation and minimize the
performance impact of pagefaults. Affects only the pmemobj_create()
function.

prefault.at_open | rw | global | int | int | - | boolean

If set, every page of the pool will be touched and written to when the
pool is opened, in order to trigger page allocation and minimize the
performance impact of pagefaults. Affects only the pmemobj_open()
function.

sds.at_create | rw | global | int | int | - | boolean

If set, force-enables or force-disables SDS feature during pool
creation. Affects only the pmemobj_create() function. See
pmempool_feature_query(3) for information about SDS (SHUTDOWN_STATE)
feature.

copy_on_write.at_open | rw | global | int | int | - | boolean

If set, pool is mapped in such a way that modifications don’t reach the
underlying medium. From the user’s perspective this means that when the
pool is closed all changes are reverted. This feature is not supported
for pools located on Device DAX.

tx.debug.skip_expensive_checks | rw | - | int | int | - | boolean

Turns off some expensive checks performed by the transaction module in
“debug” builds. Ignored in “release” builds.

tx.debug.verify_user_buffers | rw | - | int | int | - | boolean

Enables verification of user buffers provided by the
pmemobj_tx_log_append_buffer(3) API. For now the only verified aspect is
whether the same buffer is used simultaneously in 2 or more transactions
or more than once in the same transaction. This value should not be
modified at runtime if any transaction for the current pool is in
progress.

tx.cache.size | rw | - | long long | long long | - | integer

Size in bytes of the transaction snapshot cache. A larger cache lowers
the frequency of persistent allocations, but at a higher fixed cost.

This should be set to roughly the sum of sizes of the snapshotted
regions in an average transaction in the pool.

This entry point is not thread safe and should not be modified if there
are any transactions currently running.

This value must be in the range between 0 and PMEMOBJ_MAX_ALLOC_SIZE,
otherwise this entry point will fail.

tx.cache.threshold | rw | - | long long | long long | - | integer

This entry point is deprecated. All snapshots, regardless of the size,
use the transactional cache.

tx.post_commit.queue_depth | rw | - | int | int | - | integer

This entry point is deprecated.

tx.post_commit.worker | r- | - | void * | - | - | -

This entry point is deprecated.

tx.post_commit.stop | r- | - | void * | - | - | -

This entry point is deprecated.

heap.narenas.automatic | r- | - | unsigned | - | - | -

Reads the number of arenas used in automatic scheduling of memory
operations for threads. By default, this value is equal to the number of
available processors. An arena is a memory management structure which
enables concurrency by taking exclusive ownership of parts of the heap
and allowing associated threads to allocate without contention.

heap.narenas.total | r- | - | unsigned | - | - | -

Reads the number of all created arenas. It includes automatic arenas
created by default and arenas created using the heap.arena.create CTL.

heap.narenas.max | rw- | - | unsigned | unsigned | - | -

Reads or writes the maximum number of arenas that can be created. This
entry point is not thread-safe with regards to heap operations
(allocations, frees, reallocs).

heap.arena.[arena_id].size | r- | - | uint64_t | - | - | -

Reads the total amount of memory in bytes which is currently
exclusively owned by the arena. Large differences in this value between
arenas might indicate an uneven scheduling of memory resources. The
arena id cannot be 0.

heap.thread.arena_id | rw- | - | unsigned | unsigned | - | -

Reads the index of the arena assigned to the current thread or assigns
the arena with a specific id to the current thread. The arena id cannot
be 0.

heap.arena.create | --x | - | - | - | unsigned | -

Creates and initializes one new arena in the heap. This entry point
reads the id of the newly created arena.

Arenas newly created by this CTL are inactive, which means that they
will not be used in the automatic scheduling of memory requests. To
activate a new arena, use the heap.arena.[arena_id].automatic CTL.

An arena created using this CTL can be used for allocation by explicitly
specifying its arena_id in the POBJ_ARENA_ID(id) flag of the
pmemobj_tx_xalloc()/pmemobj_xalloc()/pmemobj_xreserve() functions.

By default, the number of arenas is limited to 1024.

heap.arena.[arena_id].automatic | rw- | - | boolean | boolean | - | -

Reads or modifies the state of the arena. If set, the arena is used in
automatic scheduling of memory operations for threads. This should be
set to false if the application wants to manually manage allocator
scalability through explicitly assigning arenas to threads by using
heap.thread.arena_id. The arena id cannot be 0 and at least one
automatic arena must exist.

heap.arenas_assignment_type | rw | global | enum pobj_arenas_assignment_type | enum pobj_arenas_assignment_type | - | string

Reads or modifies the behavior of arenas assignment for threads. By
default, each thread is assigned its own arena from the pool of
automatic arenas (described earlier). This consumes one TLS key from the
OS for every open pool. Applications that wish to avoid this behavior
can instead rely on one global arena assignment per pool. This might
limit scalability if not using arenas explicitly.

The argument for this CTL is an enum with the following values:

• POBJ_ARENAS_ASSIGNMENT_THREAD_KEY, string value: thread. Default,
  threads use individually assigned arenas.

• POBJ_ARENAS_ASSIGNMENT_GLOBAL, string value: global. Threads use one
  global arena.

Changing this value has no impact on already open pools. It should
typically be set at the beginning of the application, before any pools
are opened or created.

heap.arenas_default_max | rw- | global | unsigned | unsigned | - | integer

Reads or writes the number of arenas that are created by default on
startup of the heap’s runtime state. This value by default is equal to
the number of online CPUs available on the platform, but can be
decreased or increased depending on the application’s scalability
requirements.

heap.alloc_class.[class_id].desc | rw | - | struct pobj_alloc_class_desc | struct pobj_alloc_class_desc | - | integer, integer, integer, string

Describes an allocation class. Allows one to create or view the
internal data structures of the allocator.

Creating custom allocation classes can be beneficial for raw allocation
throughput, scalability and, most importantly, fragmentation. By
carefully constructing allocation classes that match the application
workload, one can entirely eliminate external and internal
fragmentation. For example, it is possible to easily construct a
slab-like allocation mechanism for any data structure.

The [class_id] is an index field. Only values from 0 to 254 are valid.
If setting an allocation class, but the class_id is already taken, the
function will return -1. The values from 0 to 127 are reserved for the
default allocation classes of the library and can be used only for
reading.
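The class id rules above can be sketched as a small helper that builds the entry point name for creating a custom class. This is only an illustration: custom_class_entry_point() is a hypothetical name, not part of libpmemobj, and the bounds 128 and 254 are taken from the paragraph above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a libpmemobj function).
 * Writes "heap.alloc_class.<class_id>.desc" into buf.
 * Returns 0 on success, or -1 if class_id is outside the range that is
 * open to custom classes (128..254) or if the name does not fit in buf.
 */
int
custom_class_entry_point(unsigned class_id, char *buf, size_t len)
{
    assert(buf != NULL);

    /* ids 0..127 are reserved for the library; ids above 254 are invalid */
    if (class_id < 128 || class_id > 254)
        return -1;

    int n = snprintf(buf, len, "heap.alloc_class.%u.desc", class_id);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The resulting name would then be passed to pmemobj_ctl_set() together with a pointer to a filled struct pobj_alloc_class_desc.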

The recommended method for retrieving information about all allocation
classes is to call this entry point for all class ids between 0 and 254
and discard those results for which the function returns an error.

This entry point takes a complex argument.

struct pobj_alloc_class_desc {
    size_t unit_size;
    size_t alignment;
    unsigned units_per_block;
    enum pobj_header_type header_type;
    unsigned class_id;
};

The first field, unit_size, is an 8-byte unsigned integer that defines
the allocation class size. While theoretically limited only by
PMEMOBJ_MAX_ALLOC_SIZE, for most workloads this value should be between
8 bytes and 2 megabytes.

The alignment field specifies the user data alignment of objects
allocated using the class. If set, must be a power of two and an even
divisor of unit size. Alignment is limited to maximum of 2 megabytes.
All objects have default alignment of 64 bytes, but the user data
alignment is affected by the size of the chosen header.
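The alignment constraints can be checked before issuing the query. The following is a minimal sketch, not library code: alignment_is_acceptable() is a hypothetical name, "2 megabytes" is assumed to mean 2 MiB, and an alignment of 0 is assumed to mean "not set".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: the documented 2 megabyte limit is 2 MiB. */
#define MAX_ALIGNMENT ((size_t)2 << 20)

/*
 * Hypothetical helper (not a libpmemobj function).
 * Returns true if alignment satisfies the documented constraints for a
 * class with the given unit_size.
 */
bool
alignment_is_acceptable(size_t unit_size, size_t alignment)
{
    assert(unit_size > 0);

    if (alignment == 0)
        return true;                        /* not set */
    if ((alignment & (alignment - 1)) != 0)
        return false;                       /* not a power of two */
    if (alignment > MAX_ALIGNMENT)
        return false;                       /* above the 2 MiB limit */
    return unit_size % alignment == 0;      /* must divide unit_size evenly */
}
```

For instance, a unit size of 500 bytes cannot be combined with a 64-byte alignment, because 64 does not divide 500 evenly.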

The units_per_block field defines how many units a single block of
memory contains. This value will be adjusted to match the internal size
of the block (256 kilobytes or a multiple thereof). For example, given a
class with a unit_size of 512 bytes and a units_per_block of 1000, a
single block of memory for that class will have 512 kilobytes. This is
relevant because the bigger the block size, the less frequently blocks
need to be fetched, resulting in lower contention on global heap state.
If the CTL call is being done at runtime, the units_per_block variable
of the provided alloc class structure is modified to match the actual
value.
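The arithmetic behind the 512-kilobyte example can be sketched as follows. This assumes the adjustment simply rounds the raw size (unit_size times units_per_block) up to the next multiple of 256 KiB; the library's actual computation may also account for per-block metadata, so the result is an approximation. block_bytes() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Block size granule from the text above: 256 kilobytes (assumed KiB). */
#define BLOCK_GRANULE ((size_t)256 * 1024)

/*
 * Hypothetical helper (not a libpmemobj function).
 * Rounds unit_size * units_per_block up to a multiple of BLOCK_GRANULE.
 * Metadata overhead inside the block is deliberately ignored.
 */
size_t
block_bytes(size_t unit_size, unsigned units_per_block)
{
    assert(unit_size > 0 && units_per_block > 0);

    size_t raw = unit_size * units_per_block;
    size_t granules = (raw + BLOCK_GRANULE - 1) / BLOCK_GRANULE;
    return granules * BLOCK_GRANULE;
}
```

With a unit_size of 512 and a units_per_block of 1000, the raw size is 512,000 bytes, which rounds up to two granules, that is 524,288 bytes or 512 kilobytes.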

The header_type field defines the header of objects from the allocation
class. There are three types:

• POBJ_HEADER_LEGACY, string value: legacy. Used for allocation classes
  prior to version 1.3 of the library. Not recommended for use. Incurs a
  64 byte metadata overhead for every object. Fully supports all
  features.

• POBJ_HEADER_COMPACT, string value: compact. Used as default for all
  predefined allocation classes. Incurs a 16 byte metadata overhead for
  every object. Fully supports all features.

• POBJ_HEADER_NONE, string value: none. Header type that incurs no
  metadata overhead beyond a single bitmap entry. Can be used for very
  small allocation classes or when objects must be adjacent to each
  other. This header type does not support type numbers (type number is
  always 0) or allocations that span more than one unit.

The class_id field is an optional, runtime-only variable that allows the
user to retrieve the identifier of the class. This will be equivalent to
the provided [class_id]. This field cannot be set from a config file.

The allocation classes are a runtime state of the library and must be
created after every open. It is highly recommended to use the
configuration file to store the classes.

This structure is declared in the libpmemobj/ctl.h header file. Please
refer to this file for an in-depth explanation of the allocation classes
and relevant algorithms.

Allocation classes constructed in this way can be leveraged by
explicitly specifying the class using POBJ_CLASS_ID(id) flag in
pmemobj_tx_xalloc()/pmemobj_xalloc() functions.

Example of a valid alloc class query string:

heap.alloc_class.128.desc=500,0,1000,compact

This query, if executed, will create an allocation class with an id of
128 that has a unit size of 500 bytes, has at least 1000 units per block
and uses a compact header.
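To make the field order of the query value explicit, the sketch below splits the part after the equals sign into unit size, alignment, units per block and header name. It is an illustration only and not the parser used by libpmemobj; struct class_query and parse_class_query() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical container for the four comma-separated fields. */
struct class_query {
    size_t unit_size;
    size_t alignment;
    unsigned units_per_block;
    char header[16];    /* "legacy", "compact" or "none" */
};

/*
 * Hypothetical helper (not a libpmemobj function).
 * Parses a value such as "500,0,1000,compact".
 * Returns 0 if all four fields were read, -1 otherwise.
 */
int
parse_class_query(const char *value, struct class_query *out)
{
    assert(value != NULL && out != NULL);

    int n = sscanf(value, "%zu,%zu,%u,%15s",
        &out->unit_size, &out->alignment,
        &out->units_per_block, out->header);
    return n == 4 ? 0 : -1;
}
```

Applied to the value from the example query, the fields come out as a unit size of 500, an alignment of 0, 1000 units per block and the header name compact.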

For reading, the function returns 0 if successful. If the allocation
class does not exist, it sets errno to ENOENT and returns -1.

This entry point can fail if any of the parameters of the allocation
class is invalid or if exactly the same class already exists.

heap.alloc_class.new.desc | -w | - | - | struct pobj_alloc_class_desc | - | integer, integer, integer, string

Same as heap.alloc_class.[class_id].desc, but instead of requiring the
user to provide the class_id, it automatically creates the allocation
class with the first available identifier.

This should be used when it’s impossible to guarantee unique allocation
class naming in the application (e.g. when writing a library that uses
libpmemobj).

The identifier assigned to the new class will be stored in the class_id
field of the struct pobj_alloc_class_desc.

stats.enabled | rw | - | enum pobj_stats_enabled | enum pobj_stats_enabled | - | string

Enables or disables runtime collection of statistics. There are two
types of statistics: persistent and transient ones. Persistent
statistics survive pool restarts, whereas transient ones don’t.
Statistics are not recalculated after enabling; any operations that
occur between disabling and re-enabling will not be reflected in
subsequent values.

Only transient statistics are enabled by default. Enabling persistent
statistics may have non-trivial performance impact.

stats.heap.curr_allocated | r- | - | uint64_t | - | - | -

Reads the number of bytes currently allocated in the heap. If
statistics were disabled at any time in the lifetime of the heap, this
value may be inaccurate.

This is a persistent statistic.

stats.heap.run_allocated | r- | - | uint64_t | - | - | -

Reads the number of bytes currently allocated using run-based
allocation classes, i.e., huge allocations are not accounted for in this
statistic. This is useful for comparison against stats.heap.run_active
to estimate the ratio between active and allocated memory.

This is a transient statistic and is rebuilt every time the pool is
opened.

stats.heap.run_active | r- | - | uint64_t | - | - | -

Reads the number of bytes currently occupied by all run memory blocks,
including both allocated and free space, i.e., this is all the space
that’s not occupied by huge allocations.

This value is a sum of all allocated and free run memory. In systems
where memory is efficiently used, run_active should closely track
run_allocated, and the amount of active, but free, memory should be
minimal.

A large relative difference between active memory and allocated memory
is indicative of heap fragmentation. This information can be used to
make a decision to call pmemobj_defrag(3) if the fragmentation looks to
be high.

However, for small heaps run_active might be disproportionately higher
than run_allocated because the allocator typically activates a
significantly larger amount of memory than is required to satisfy a
single request in the anticipation of future needs. For example, the
first allocation of 100 bytes in a heap will trigger activation of 256
kilobytes of space.

This is a transient statistic and is rebuilt lazily every time the pool
is opened.
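One way to turn the two run statistics into a single fragmentation indicator is the fraction of active run memory that is currently free. The sketch below is an assumption about how an application might use the values, not something the library computes; run_free_fraction() is a hypothetical name, and any threshold for calling pmemobj_defrag(3) is left to the application.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a libpmemobj function).
 * Takes the values read from stats.heap.run_active and
 * stats.heap.run_allocated and returns the share of active run memory
 * that is free, in the range 0.0 to 1.0.
 */
double
run_free_fraction(uint64_t run_active, uint64_t run_allocated)
{
    /* allocated run memory is a subset of active run memory */
    assert(run_allocated <= run_active);

    if (run_active == 0)
        return 0.0;     /* nothing activated yet */
    return (double)(run_active - run_allocated) / (double)run_active;
}
```

Both inputs would be obtained with pmemobj_ctl_get() into uint64_t variables. As noted above, a small heap can report a high value without being fragmented: a single 100-byte allocation in a freshly activated 256-kilobyte block yields a fraction above 0.99.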

heap.size.granularity | rw- | - | uint64_t | uint64_t | - | long long

Reads or modifies the granularity with which the heap grows when it
runs out of memory. Valid only if the poolset has been defined with
directories.

A granularity of 0 specifies that the pool will not grow automatically.

This entry point can fail if the granularity value is non-zero and
smaller than PMEMOBJ_MIN_PART.

heap.size.extend | --x | - | - | - | uint64_t | -

Extends the heap by the given size. The size must be larger than
PMEMOBJ_MIN_PART.

This entry point can fail if the pool does not support extend
functionality or if there’s not enough space left on the device.

debug.heap.alloc_pattern | rw | - | int | int | - | -

Single byte pattern that is used to fill new uninitialized memory
allocation. If the value is negative, no pattern is written. This is
intended for debugging, and is disabled by default.

In addition to direct function call, each write entry point can also be
set using two alternative methods.

The first method is to load a configuration directly from the
PMEMOBJ_CONF environment variable.

The second method of loading an external configuration is to set the
PMEMOBJ_CONF_FILE environment variable to point to a file that contains
a sequence of ctl queries.

See more in pmem_ctl(5) man page.
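As an illustration of the second method, a configuration file might look like the following. The file contents are hypothetical. The entry point names and string values come from the descriptions above, while the ';' separator, the '#' comment syntax and the use of 1 as a boolean value are assumed from the query syntax described in pmem_ctl(5) and should be verified there.

```
# hypothetical file referenced by PMEMOBJ_CONF_FILE
prefault.at_open=1;
heap.arenas_assignment_type=global;
heap.alloc_class.128.desc=500,0,1000,compact;
```

The same queries, joined on a single line, could presumably be placed in the PMEMOBJ_CONF environment variable instead.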

libpmemobj(7), pmem_ctl(5) and <https://pmem.io>

PMDK - 2023-06-05 ()