()                         PMDK Programmer's Manual                         ()

NAME

       pmemobj_ctl_get(), pmemobj_ctl_set(), pmemobj_ctl_exec() - Query and
       modify libpmemobj internal behavior (EXPERIMENTAL)
8

SYNOPSIS

              #include <libpmemobj.h>

              int pmemobj_ctl_get(PMEMobjpool *pop, const char *name, void *arg); (EXPERIMENTAL)
              int pmemobj_ctl_set(PMEMobjpool *pop, const char *name, void *arg); (EXPERIMENTAL)
              int pmemobj_ctl_exec(PMEMobjpool *pop, const char *name, void *arg); (EXPERIMENTAL)


DESCRIPTION

       The pmemobj_ctl_get(), pmemobj_ctl_set() and pmemobj_ctl_exec()
       functions provide a uniform interface for querying and modifying the
       internal behavior of libpmemobj(7) through the control (CTL)
       namespace.

       The name argument specifies an entry point as defined in the CTL
       namespace specification.  The entry point description specifies
       whether the extra arg is required.  Those two parameters together
       create a CTL query.  The functions and the entry points are
       thread-safe unless indicated otherwise below.  If there are special
       conditions for calling an entry point, they are explicitly stated in
       its description.  The functions propagate the return value of the
       entry point.  If either name or arg is invalid, -1 is returned.

       If the provided CTL query is valid, the CTL functions will always
       return 0 on success and -1 on failure, unless otherwise specified in
       the entry point description.

       See more in the pmem_ctl(5) man page.

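       As a sketch of how the functions are typically combined (the pool
       file name and layout string below are placeholders, not part of the
       API; global entry points accept a NULL pool handle):

```c
#include <stdio.h>
#include <libpmemobj.h>

int main(void)
{
    /* Global entry points may be set before any pool exists,
     * by passing NULL as the pool handle. */
    int prefault = 1;
    if (pmemobj_ctl_set(NULL, "prefault.at_create", &prefault) != 0) {
        perror("pmemobj_ctl_set");
        return 1;
    }

    /* "my_pool.obj" and "example" are illustrative placeholders. */
    PMEMobjpool *pop = pmemobj_create("my_pool.obj", "example",
        PMEMOBJ_MIN_POOL, 0666);
    if (pop == NULL) {
        perror("pmemobj_create");
        return 1;
    }

    /* Pool-specific entry points take the pool handle. */
    uint64_t allocated;
    if (pmemobj_ctl_get(pop, "stats.heap.curr_allocated", &allocated) == 0)
        printf("currently allocated: %llu bytes\n",
            (unsigned long long)allocated);

    pmemobj_close(pop);
    return 0;
}
```

       Whether arg is read or written depends on the entry point, as noted
       in each description below.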

CTL NAMESPACE

       prefault.at_create | rw | global | int | int | - | boolean

       If set, every page of the pool will be touched and written to when
       the pool is created, in order to trigger page allocation and minimize
       the performance impact of page faults.  Affects only the
       pmemobj_create() function.

       prefault.at_open | rw | global | int | int | - | boolean

       If set, every page of the pool will be touched and written to when
       the pool is opened, in order to trigger page allocation and minimize
       the performance impact of page faults.  Affects only the
       pmemobj_open() function.

       sds.at_create | rw | global | int | int | - | boolean

       If set, force-enables or force-disables the SDS feature during pool
       creation.  Affects only the pmemobj_create() function.  See
       pmempool_feature_query(3) for information about the SDS
       (SHUTDOWN_STATE) feature.

       copy_on_write.at_open | rw | global | int | int | - | boolean

       If set, the pool is mapped in such a way that modifications do not
       reach the underlying medium.  From the user's perspective this means
       that when the pool is closed all changes are reverted.  This feature
       is not supported for pools located on Device DAX.

       tx.debug.skip_expensive_checks | rw | - | int | int | - | boolean

       Turns off some expensive checks performed by the transaction module
       in "debug" builds.  Ignored in "release" builds.

       tx.debug.verify_user_buffers | rw | - | int | int | - | boolean

       Enables verification of user buffers provided by the
       pmemobj_tx_log_append_buffer(3) API.  For now, the only verified
       aspect is whether the same buffer is used simultaneously in two or
       more transactions, or more than once within the same transaction.
       This value should not be modified at runtime if any transaction for
       the current pool is in progress.

       tx.cache.size | rw | - | long long | long long | - | integer

       Size in bytes of the transaction snapshot cache.  With a larger
       cache, the frequency of persistent allocations is lower, but the
       fixed cost is higher.

       This should be set to roughly the sum of sizes of the snapshotted
       regions in an average transaction in the pool.

       This entry point is not thread-safe and should not be modified if
       there are any transactions currently running.

       This value must be in a range between 0 and PMEMOBJ_MAX_ALLOC_SIZE,
       otherwise this entry point will fail.

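       A minimal sketch of tuning the cache, assuming pop is an open pool
       and no transactions are in progress (the 4 MiB figure is an
       arbitrary illustrative value, not a recommendation):

```c
#include <stdio.h>
#include <libpmemobj.h>

void set_tx_cache(PMEMobjpool *pop)
{
    /* Roughly the sum of snapshotted region sizes in an average
     * transaction for this workload. */
    long long cache_size = 4LL * 1024 * 1024;
    if (pmemobj_ctl_set(pop, "tx.cache.size", &cache_size) != 0)
        perror("tx.cache.size");

    /* Read the value back to confirm. */
    long long current;
    if (pmemobj_ctl_get(pop, "tx.cache.size", &current) == 0)
        printf("snapshot cache: %lld bytes\n", current);
}
```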
       tx.cache.threshold | rw | - | long long | long long | - | integer

       This entry point is deprecated.  All snapshots, regardless of their
       size, use the transactional cache.

       tx.post_commit.queue_depth | rw | - | int | int | - | integer

       This entry point is deprecated.

       tx.post_commit.worker | r- | - | void * | - | - | -

       This entry point is deprecated.

       tx.post_commit.stop | r- | - | void * | - | - | -

       This entry point is deprecated.

       heap.narenas.automatic | r- | - | unsigned | - | - | -

       Reads the number of arenas used in automatic scheduling of memory
       operations for threads.  By default, this value is equal to the
       number of available processors.  An arena is a memory management
       structure which enables concurrency by taking exclusive ownership of
       parts of the heap and allowing associated threads to allocate without
       contention.

       heap.narenas.total | r- | - | unsigned | - | - | -

       Reads the number of all created arenas.  It includes automatic arenas
       created by default and arenas created using the heap.arena.create
       CTL.

       heap.narenas.max | rw- | - | unsigned | unsigned | - | -

       Reads or writes the maximum number of arenas that can be created.
       This entry point is not thread-safe with regard to heap operations
       (allocations, frees, reallocs).

       heap.arena.[arena_id].size | r- | - | uint64_t | - | - | -

       Reads the total amount of memory in bytes which is currently
       exclusively owned by the arena.  Large differences in this value
       between arenas might indicate an uneven scheduling of memory
       resources.  The arena id cannot be 0.

       heap.thread.arena_id | rw- | - | unsigned | unsigned | - | -

       Reads the index of the arena assigned to the current thread, or
       assigns the arena with the specified id to the current thread.  The
       arena id cannot be 0.

       heap.arena.create | --x | - | - | - | unsigned | -

       Creates and initializes one new arena in the heap.  This entry point
       reads the id of the newly created arena.

       Arenas created by this CTL are initially inactive, which means that
       the arena will not be used in the automatic scheduling of memory
       requests.  To activate the new arena, use the
       heap.arena.[arena_id].automatic CTL.

       An arena created using this CTL can be used for allocation by
       explicitly specifying the arena_id with the POBJ_ARENA_ID(id) flag in
       the pmemobj_tx_xalloc()/pmemobj_xalloc()/pmemobj_xreserve()
       functions.

       By default, the number of arenas is limited to 1024.

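       The create/activate/allocate flow described above might look like
       this (a sketch with abbreviated error handling; pop is assumed to be
       an open pool):

```c
#include <stdio.h>
#include <libpmemobj.h>

void use_private_arena(PMEMobjpool *pop)
{
    /* Create a new, initially inactive arena; its id is read into arg. */
    unsigned arena_id;
    if (pmemobj_ctl_exec(pop, "heap.arena.create", &arena_id) != 0) {
        perror("heap.arena.create");
        return;
    }

    /* Option 1: bind the current thread to the new arena... */
    if (pmemobj_ctl_set(pop, "heap.thread.arena_id", &arena_id) != 0)
        perror("heap.thread.arena_id");

    /* Option 2: ...or target it explicitly per allocation. */
    PMEMoid oid;
    if (pmemobj_xalloc(pop, &oid, 128, 0, POBJ_ARENA_ID(arena_id),
            NULL, NULL) != 0)
        perror("pmemobj_xalloc");
}
```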
       heap.arena.[arena_id].automatic | rw- | - | boolean | boolean | - | -

       Reads or modifies the state of the arena.  If set, the arena is used
       in automatic scheduling of memory operations for threads.  This
       should be set to false if the application wants to manually manage
       allocator scalability by explicitly assigning arenas to threads using
       heap.thread.arena_id.  The arena id cannot be 0, and at least one
       automatic arena must exist.

       heap.arenas_assignment_type | rw | global | enum
       pobj_arenas_assignment_type | enum pobj_arenas_assignment_type | - |
       string

       Reads or modifies the behavior of arena assignment for threads.  By
       default, each thread is assigned its own arena from the pool of
       automatic arenas (described earlier).  This consumes one TLS key from
       the OS for every open pool.  Applications that wish to avoid this
       behavior can instead rely on one global arena assignment per pool.
       This might limit scalability if arenas are not used explicitly.

       The argument for this CTL is an enum with the following types:

       • POBJ_ARENAS_ASSIGNMENT_THREAD_KEY, string value: thread.  Default;
         threads use individually assigned arenas.

       • POBJ_ARENAS_ASSIGNMENT_GLOBAL, string value: global.  Threads use
         one global arena.

       Changing this value has no impact on already open pools.  It should
       typically be set at the beginning of the application, before any
       pools are opened or created.

       heap.arenas_default_max | rw- | global | unsigned | unsigned | - |
       integer

       Reads or writes the number of arenas that are created by default on
       startup of the heap's runtime state.  By default this value is equal
       to the number of online CPUs available on the platform, but it can be
       decreased or increased depending on the application's scalability
       requirements.

       heap.alloc_class.[class_id].desc | rw | - | struct
       pobj_alloc_class_desc | struct pobj_alloc_class_desc | - | integer,
       integer, integer, string

       Describes an allocation class.  Allows one to create or view the
       internal data structures of the allocator.

       Creating custom allocation classes can be beneficial for raw
       allocation throughput, scalability and, most importantly,
       fragmentation.  By carefully constructing allocation classes that
       match the application workload, one can entirely eliminate external
       and internal fragmentation.  For example, it is possible to easily
       construct a slab-like allocation mechanism for any data structure.

       The [class_id] is an index field.  Only values between 0-254 are
       valid.  If setting an allocation class, but the class_id is already
       taken, the function will return -1.  The values between 0-127 are
       reserved for the default allocation classes of the library and can be
       used only for reading.

       The recommended method for retrieving information about all
       allocation classes is to call this entry point for all class ids
       between 0 and 254 and discard those results for which the function
       returns an error.

       This entry point takes a complex argument.

              struct pobj_alloc_class_desc {
                  size_t unit_size;
                  size_t alignment;
                  unsigned units_per_block;
                  enum pobj_header_type header_type;
                  unsigned class_id;
              };

       The first field, unit_size, is an 8-byte unsigned integer that
       defines the allocation class size.  While theoretically limited only
       by PMEMOBJ_MAX_ALLOC_SIZE, for most workloads this value should be
       between 8 bytes and 2 megabytes.

       The alignment field specifies the user data alignment of objects
       allocated using the class.  If set, it must be a power of two and an
       even divisor of the unit size.  Alignment is limited to a maximum of
       2 megabytes.  All objects have a default alignment of 64 bytes, but
       the user data alignment is affected by the size of the chosen header.

       The units_per_block field defines how many units a single block of
       memory contains.  This value will be adjusted to match the internal
       size of the block (256 kilobytes or a multiple thereof).  For
       example, given a class with a unit_size of 512 bytes and a
       units_per_block of 1000, a single block of memory for that class will
       have 512 kilobytes.  This is relevant because the bigger the block
       size, the less frequently blocks need to be fetched, resulting in
       lower contention on global heap state.  If the CTL call is made at
       runtime, the units_per_block variable of the provided alloc class
       structure is modified to match the actual value.

       The header_type field defines the header of objects from the
       allocation class.  There are three types:

       • POBJ_HEADER_LEGACY, string value: legacy.  Used for allocation
         classes prior to version 1.3 of the library.  Not recommended for
         use.  Incurs a 64 byte metadata overhead for every object.  Fully
         supports all features.

       • POBJ_HEADER_COMPACT, string value: compact.  Used as the default
         for all predefined allocation classes.  Incurs a 16 byte metadata
         overhead for every object.  Fully supports all features.

       • POBJ_HEADER_NONE, string value: none.  Header type that incurs no
         metadata overhead beyond a single bitmap entry.  Can be used for
         very small allocation classes or when objects must be adjacent to
         each other.  This header type does not support type numbers (the
         type number is always 0) or allocations that span more than one
         unit.

       The class_id field is an optional, runtime-only variable that allows
       the user to retrieve the identifier of the class.  This will be
       equivalent to the provided [class_id].  This field cannot be set from
       a config file.

       The allocation classes are a runtime state of the library and must be
       created after every open.  It is highly recommended to use the
       configuration file to store the classes.

       This structure is declared in the libpmemobj/ctl.h header file.
       Please refer to this file for an in-depth explanation of the
       allocation classes and relevant algorithms.

       Allocation classes constructed in this way can be leveraged by
       explicitly specifying the class with the POBJ_CLASS_ID(id) flag in
       the pmemobj_tx_xalloc()/pmemobj_xalloc() functions.

       Example of a valid alloc class query string:

              heap.alloc_class.128.desc=500,0,1000,compact

       This query, if executed, will create an allocation class with an id
       of 128 that has a unit size of 500 bytes, has at least 1000 units per
       block, and uses a compact header.

       For reading, the function returns 0 if successful; if the allocation
       class does not exist, it sets errno to ENOENT and returns -1.

       This entry point can fail if any of the parameters of the allocation
       class is invalid or if exactly the same class already exists.

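       The same class could also be created programmatically; a sketch,
       assuming pop is an open pool (the id 128 and sizes mirror the query
       string above):

```c
#include <stdio.h>
#include <libpmemobj.h>

void create_slab_class(PMEMobjpool *pop)
{
    struct pobj_alloc_class_desc desc = {
        .unit_size = 500,
        .alignment = 0,
        .units_per_block = 1000,
        .header_type = POBJ_HEADER_COMPACT,
        .class_id = 128,
    };

    if (pmemobj_ctl_set(pop, "heap.alloc_class.128.desc", &desc) != 0) {
        perror("heap.alloc_class.128.desc");
        return;
    }

    /* Allocate one 500-byte object from the new class. */
    PMEMoid oid;
    if (pmemobj_xalloc(pop, &oid, 500, 0, POBJ_CLASS_ID(128),
            NULL, NULL) != 0)
        perror("pmemobj_xalloc");
}
```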
       heap.alloc_class.new.desc | -w | - | - | struct pobj_alloc_class_desc
       | - | integer, integer, integer, string

       Same as heap.alloc_class.[class_id].desc, but instead of requiring
       the user to provide the class_id, it automatically creates the
       allocation class with the first available identifier.

       This should be used when it is impossible to guarantee unique
       allocation class naming in the application (e.g., when writing a
       library that uses libpmemobj).

       The assigned class identifier will be stored in the class_id field of
       the struct pobj_alloc_class_desc.

       stats.enabled | rw | - | enum pobj_stats_enabled | enum
       pobj_stats_enabled | - | string

       Enables or disables runtime collection of statistics.  There are two
       types of statistics: persistent and transient.  Persistent statistics
       survive pool restarts, whereas transient ones do not.  Statistics are
       not recalculated after enabling; any operations that occur between
       disabling and re-enabling will not be reflected in subsequent values.

       Only transient statistics are enabled by default.  Enabling
       persistent statistics may have a non-trivial performance impact.

       stats.heap.curr_allocated | r- | - | uint64_t | - | - | -

       Reads the number of bytes currently allocated in the heap.  If
       statistics were disabled at any time in the lifetime of the heap,
       this value may be inaccurate.

       This is a persistent statistic.

       stats.heap.run_allocated | r- | - | uint64_t | - | - | -

       Reads the number of bytes currently allocated using run-based
       allocation classes, i.e., huge allocations are not accounted for in
       this statistic.  This is useful for comparison against
       stats.heap.run_active to estimate the ratio between active and
       allocated memory.

       This is a transient statistic and is rebuilt every time the pool is
       opened.

       stats.heap.run_active | r- | - | uint64_t | - | - | -

       Reads the number of bytes currently occupied by all run memory
       blocks, including both allocated and free space, i.e., this is all
       the space that is not occupied by huge allocations.

       This value is a sum of all allocated and free run memory.  In systems
       where memory is used efficiently, run_active should closely track
       run_allocated, and the amount of active, but free, memory should be
       minimal.

       A large relative difference between active memory and allocated
       memory is indicative of heap fragmentation.  This information can be
       used to decide whether to call pmemobj_defrag(3) if the fragmentation
       looks to be high.

       However, for small heaps run_active might be disproportionately
       higher than run_allocated because the allocator typically activates a
       significantly larger amount of memory than is required to satisfy a
       single request, in anticipation of future needs.  For example, the
       first allocation of 100 bytes in a heap will trigger activation of
       256 kilobytes of space.

       This is a transient statistic and is rebuilt lazily every time the
       pool is opened.

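       The fragmentation heuristic described above can be computed directly
       from the two statistics (a sketch; pop is assumed to be an open pool
       with statistics enabled):

```c
#include <inttypes.h>
#include <stdio.h>
#include <libpmemobj.h>

void report_fragmentation(PMEMobjpool *pop)
{
    uint64_t active = 0, allocated = 0;

    if (pmemobj_ctl_get(pop, "stats.heap.run_active", &active) != 0 ||
        pmemobj_ctl_get(pop, "stats.heap.run_allocated", &allocated) != 0) {
        perror("pmemobj_ctl_get");
        return;
    }

    if (allocated != 0) {
        double ratio = (double)active / (double)allocated;
        printf("run active/allocated ratio: %.2f\n", ratio);
        /* A ratio much greater than 1.0 on a large heap suggests
         * fragmentation; pmemobj_defrag(3) may be worth calling. */
    }
}
```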
       heap.size.granularity | rw- | - | uint64_t | uint64_t | - | long long

       Reads or modifies the granularity with which the heap grows when it
       runs out of memory.  Valid only if the poolset has been defined with
       directories.

       A granularity of 0 specifies that the pool will not grow
       automatically.

       This entry point can fail if the granularity value is non-zero and
       smaller than PMEMOBJ_MIN_PART.

       heap.size.extend | --x | - | - | - | uint64_t | -

       Extends the heap by the given size.  Must be larger than
       PMEMOBJ_MIN_PART.

       This entry point can fail if the pool does not support the extend
       functionality or if there is not enough space left on the device.

       debug.heap.alloc_pattern | rw | - | int | int | - | -

       Single byte pattern that is used to fill new uninitialized memory
       allocations.  If the value is negative, no pattern is written.  This
       is intended for debugging, and is disabled by default.


CTL EXTERNAL CONFIGURATION

       In addition to direct function calls, each write entry point can also
       be set using two alternative methods.

       The first method is to load a configuration directly from the
       PMEMOBJ_CONF environment variable.

       The second method of loading an external configuration is to set the
       PMEMOBJ_CONF_FILE environment variable to point to a file that
       contains a sequence of ctl queries.

       See more in the pmem_ctl(5) man page.

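       For example (the query syntax is described in pmem_ctl(5); the class
       id and values below are illustrative):

```shell
# Queries are semicolon-separated name=value pairs.
export PMEMOBJ_CONF="prefault.at_open=1;prefault.at_create=1"

# Alternatively, keep the queries in a file:
echo "heap.alloc_class.128.desc=500,0,1000,compact;" > pmemobj.cfg
export PMEMOBJ_CONF_FILE=pmemobj.cfg
```

       Because allocation classes are runtime state that must be recreated
       after every open, the configuration file is a convenient place to
       keep them, as noted above.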

SEE ALSO

       libpmemobj(7), pmem_ctl(5) and <https://pmem.io>



PMDK -                            2023-06-05                                ()