PMEMOBJ_CTL_GET(3)         PMDK Programmer's Manual         PMEMOBJ_CTL_GET(3)

NAME

       pmemobj_ctl_get(), pmemobj_ctl_set(), pmemobj_ctl_exec() -- Query and
       modify libpmemobj internal behavior

SYNOPSIS

              #include <libpmemobj.h>

              int pmemobj_ctl_get(PMEMobjpool *pop, const char *name, void *arg); (EXPERIMENTAL)
              int pmemobj_ctl_set(PMEMobjpool *pop, const char *name, void *arg); (EXPERIMENTAL)
              int pmemobj_ctl_exec(PMEMobjpool *pop, const char *name, void *arg); (EXPERIMENTAL)

DESCRIPTION

       The pmemobj_ctl_get(), pmemobj_ctl_set() and pmemobj_ctl_exec()
       functions provide a uniform interface for querying and modifying the
       internal behavior of libpmemobj through the control (CTL) namespace.

       The CTL namespace is organized in a tree structure.  Starting from the
       root, each node can be either internal, containing other elements, or a
       leaf.  Internal nodes can only contain other nodes and cannot be entry
       points.  There are two types of internal nodes: named and indexed.
       Named nodes have string identifiers.  Indexed nodes represent an
       abstract array index and have an associated string identifier.  The
       index itself is provided by the user.  The collection of indexes
       present on the path of an entry point is provided to the handler
       functions as name and index pairs.

       The name argument specifies an entry point as defined in the CTL
       namespace specification.  The entry point description specifies whether
       the extra arg is required.  Together, these two parameters form a CTL
       query.  The pop argument is optional if the entry point resides in the
       global namespace (i.e., is shared by all pools).  The functions and the
       entry points are thread-safe unless indicated otherwise below.  If
       there are special conditions for calling an entry point, they are
       explicitly stated in its description.  The functions propagate the
       return value of the entry point.  If either name or arg is invalid, -1
       is returned.

       Entry points are the leaves of the CTL namespace structure.  Each entry
       point can read from the internal state, write to the internal state,
       execute a function, or perform a combination of these operations.

       The entry points are listed in the following format:

       name | r(ead)w(rite)x(ecute) | global/- | read argument type | write
       argument type | exec argument type | config argument type

       description...

CTL NAMESPACE

       prefault.at_create | rw | global | int | int | - | boolean

       If set, every page of the pool will be touched and written to when the
       pool is created, in order to trigger page allocation and minimize the
       performance impact of page faults.  Affects only the pmemobj_create()
       function.

       Always returns 0.

       prefault.at_open | rw | global | int | int | - | boolean

       If set, every page of the pool will be touched and written to when the
       pool is opened, in order to trigger page allocation and minimize the
       performance impact of page faults.  Affects only the pmemobj_open()
       function.

       Always returns 0.

       tx.debug.skip_expensive_checks | rw | - | int | int | - | boolean

       Turns off some expensive checks performed by the transaction module in
       "debug" builds.  Ignored in "release" builds.

       tx.cache.size | rw | - | long long | long long | - | integer

       Size in bytes of the transaction snapshot cache.  A larger cache lowers
       the frequency of persistent allocations, but has a higher fixed cost.

       This should be set to roughly the sum of the sizes of the snapshotted
       regions in an average transaction in the pool.

       This value must be in the range between 0 and PMEMOBJ_MAX_ALLOC_SIZE.
       If the current threshold (tx.cache.threshold) is larger than the new
       cache size, the threshold will be made equal to the new size.

       This entry point is not thread-safe and should not be modified if there
       are any transactions currently running.

       Returns 0 if successful, -1 otherwise.

       tx.cache.threshold | rw | - | long long | long long | - | integer

       Threshold in bytes, below which snapshots will use the cache.  All
       larger snapshots will trigger a persistent allocation.

       This value must be in the range between 0 and tx.cache.size.

       This entry point is not thread-safe and should not be modified if there
       are any transactions currently running.

       Returns 0 if successful, -1 otherwise.
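
       The interaction between tx.cache.size and tx.cache.threshold can be
       modeled in a few lines.  The sketch below is only an illustration of
       the rules stated above and is not libpmemobj code: the struct, the
       function names and MODEL_MAX_ALLOC_SIZE are hypothetical, and the
       constant is a placeholder for PMEMOBJ_MAX_ALLOC_SIZE.

```c
/* Hypothetical placeholder for the upper bound; the real limit is
 * PMEMOBJ_MAX_ALLOC_SIZE from <libpmemobj.h>. */
#define MODEL_MAX_ALLOC_SIZE (1LL << 34)

/* Illustrative model of the two cache parameters (not a libpmemobj type). */
struct tx_cache_model {
	long long size;      /* models tx.cache.size */
	long long threshold; /* models tx.cache.threshold */
};

/* Models a write to tx.cache.size: rejects out-of-range values and
 * lowers the threshold when it would exceed the new size. */
int
model_set_cache_size(struct tx_cache_model *c, long long new_size)
{
	if (new_size < 0 || new_size > MODEL_MAX_ALLOC_SIZE)
		return -1;
	c->size = new_size;
	if (c->threshold > new_size)
		c->threshold = new_size;
	return 0;
}

/* Models a write to tx.cache.threshold: the value must lie in [0, size]. */
int
model_set_cache_threshold(struct tx_cache_model *c, long long new_threshold)
{
	if (new_threshold < 0 || new_threshold > c->size)
		return -1;
	c->threshold = new_threshold;
	return 0;
}
```

       In this model, shrinking the cache to 1024 bytes while the threshold is
       4096 bytes leaves both values at 1024, and a later attempt to raise the
       threshold above the cache size is rejected with -1.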

       tx.post_commit.queue_depth | rw | - | int | int | - | integer

       Controls the depth of the post-commit tasks queue.  A post-commit task
       is the collection of work items that need to be performed on the
       persistent state after a successfully completed transaction.  This
       includes freeing objects that are no longer needed and cleaning up
       various caches.  By default, this queue does not exist and the
       post-commit task is executed synchronously in the same thread that ran
       the transaction.  By changing this parameter, one can offload this task
       to a separate worker.  If the queue is full, the algorithm, instead of
       waiting, performs the post-commit in the current thread.

       The task is performed on a finite resource (lanes, of which there are
       1024), and if the worker threads that process this queue are unable to
       keep up with the demand, regular threads might start to block waiting
       for that resource.  This will happen if the queue depth value is too
       large.

       As a general rule, this value should be set to approximately 1024 minus
       the average number of threads in the application (not counting the
       post-commit workers); however, this may vary from workload to workload.

       The queue depth value must also be a power of two.

       This entry point is not thread-safe and must be called when no
       transactions are currently being executed.

       Returns 0 if successful, -1 otherwise.
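
       The two sizing rules above (roughly 1024 minus the number of
       application threads, and a power of two) can be combined into a small
       helper.  This is a sketch of one possible heuristic, not something the
       library provides: the function names and MODEL_NLANES are hypothetical,
       and rounding down rather than up is an assumption made here because a
       queue that is too deep is the failure mode described above.

```c
#define MODEL_NLANES 1024u /* number of lanes, as stated in the text above */

/* Returns nonzero when v is a power of two (zero is not one). */
int
is_power_of_two(unsigned v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/* Heuristic: the largest power of two that does not exceed
 * MODEL_NLANES - app_threads, or 0 if no lanes are left over. */
unsigned
suggest_queue_depth(unsigned app_threads)
{
	if (app_threads >= MODEL_NLANES)
		return 0;

	unsigned budget = MODEL_NLANES - app_threads;
	unsigned depth = 1;

	/* Double while the next power of two still fits in the budget. */
	while (depth <= budget / 2)
		depth *= 2;
	return depth;
}
```

       For an application with 8 threads the budget is 1016, so this helper
       suggests 512.  The result could then be written to
       tx.post_commit.queue_depth while no transactions are running.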

       tx.post_commit.worker | r- | - | void * | - | - | -

       The worker function launched in a thread to perform asynchronous
       processing of post-commit tasks.  This function returns only after a
       stop entry point is called.  There may be many worker threads at a
       time.  If there is no work to be done, this function sleeps instead of
       polling.

       Always returns 0.

       tx.post_commit.stop | r- | - | void * | - | - | -

       This function forces all the post-commit worker functions to exit and
       return control back to the calling thread.  This should be called
       before the application terminates and the post-commit worker threads
       need to be shut down.

       After the invocation of this entry point, the post-commit task queue
       can no longer be used.  If worker threads must be restarted after a
       stop, the tx.post_commit.queue_depth needs to be set again.

       This entry point must be called when no transactions are currently
       being executed.

       Always returns 0.

       heap.alloc_class.[class_id].desc | rw | - | struct pobj_alloc_class_desc |
       struct pobj_alloc_class_desc | - | integer, integer, string

       Describes an allocation class.  Allows one to create or view the
       internal data structures of the allocator.

       Creating custom allocation classes can be beneficial for raw allocation
       throughput, scalability and, most importantly, fragmentation.  By
       carefully constructing allocation classes that match the application
       workload, one can entirely eliminate external and internal
       fragmentation.  For example, it is possible to easily construct a
       slab-like allocation mechanism for any data structure.

       The [class_id] is an index field.  Only values between 0 and 254 are
       valid.  When setting an allocation class, if the class_id is already
       taken, the function returns -1.  The values between 0 and 127 are
       reserved for the default allocation classes of the library and can be
       used only for reading.

       The recommended method for retrieving information about all allocation
       classes is to call this entry point for all class ids between 0 and 254
       and discard those results for which the function returns an error.

       This entry point takes a complex argument.

              struct pobj_alloc_class_desc {
                  size_t unit_size;
                  size_t alignment;
                  unsigned units_per_block;
                  enum pobj_header_type header_type;
                  unsigned class_id;
              };

       The first field, unit_size, is an 8-byte unsigned integer that defines
       the allocation class size.  While theoretically limited only by
       PMEMOBJ_MAX_ALLOC_SIZE, for most workloads this value should be between
       8 bytes and 2 megabytes.

       The alignment field is currently unsupported and must be set to 0.  All
       objects have default alignment of 64 bytes, but the user data alignment
       is affected by the size of the chosen header.

       The units_per_block field defines how many units a single block of
       memory contains.  This value will be rounded up to match the internal
       size of the block (256 kilobytes or a multiple thereof).  For example,
       given a class with a unit_size of 512 bytes and a units_per_block of
       1000, a single block of memory for that class will have 512 kilobytes.
       This is relevant because the bigger the block size, the less frequently
       blocks need to be fetched, resulting in lower contention on global heap
       state.  Keep in mind that object allocation is tracked in a bitmap with
       a limited number of entries, making it inefficient to create allocation
       classes smaller than 128 bytes.
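
       The rounding in the example above can be checked with a short
       calculation: 512 bytes times 1000 units is 512000 bytes, which rounds
       up to two 256-kilobyte chunks, i.e. 512 kilobytes.  The sketch below
       reproduces that arithmetic.  It is a simplified model with
       hypothetical names; it assumes a nonzero unit_size and ignores any
       per-block metadata the allocator may reserve.

```c
#include <stddef.h>

/* 256 kilobytes, the internal block granularity stated in the text above. */
#define MODEL_BLOCK_ALIGN ((size_t)256 * 1024)

/* Rounds unit_size * units_per_block up to a multiple of 256 kilobytes.
 * Simplified: ignores per-block metadata and arithmetic overflow. */
size_t
model_block_bytes(size_t unit_size, unsigned units_per_block)
{
	size_t raw = unit_size * units_per_block;
	size_t chunks = (raw + MODEL_BLOCK_ALIGN - 1) / MODEL_BLOCK_ALIGN;

	return chunks * MODEL_BLOCK_ALIGN;
}

/* Number of whole units that fit in the rounded-up block
 * (unit_size must be nonzero). */
size_t
model_units_in_block(size_t unit_size, unsigned units_per_block)
{
	return model_block_bytes(unit_size, units_per_block) / unit_size;
}
```

       With unit_size = 512 and units_per_block = 1000 the model yields a
       524288-byte block that has room for 1024 units, which is why the
       requested unit count is best read as a lower bound.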

       The header_type field defines the header of objects from the allocation
       class.  There are three types:

       · POBJ_HEADER_LEGACY, string value: legacy.  Used for allocation
         classes prior to version 1.3 of the library.  Not recommended for
         use.  Incurs a 64 byte metadata overhead for every object.  Fully
         supports all features.

       · POBJ_HEADER_COMPACT, string value: compact.  Used as default for all
         predefined allocation classes.  Incurs a 16 byte metadata overhead
         for every object.  Fully supports all features.

       · POBJ_HEADER_NONE, string value: none.  Header type that incurs no
         metadata overhead beyond a single bitmap entry.  Can be used for very
         small allocation classes or when objects must be adjacent to each
         other.  This header type does not support type numbers (type number
         is always 0) or allocations that span more than one unit.

       The class_id field is an optional, runtime-only variable that allows
       the user to retrieve the identifier of the class.  This will be
       equivalent to the provided [class_id].  This field cannot be set from a
       config file.

       The allocation classes are a runtime state of the library and must be
       created after every open.  It is highly recommended to use the
       configuration file to store the classes.

       This structure is declared in the libpmemobj/ctl.h header file.  Please
       refer to this file for an in-depth explanation of the allocation
       classes and relevant algorithms.

       Allocation classes constructed in this way can be leveraged by
       explicitly specifying the class using the POBJ_CLASS_ID(id) flag in the
       pmemobj_tx_xalloc()/pmemobj_xalloc() functions.

       Example of a valid alloc class query string:

              heap.alloc_class.128.desc=500,0,1000,compact

       This query, if executed, will create an allocation class with an id of
       128 that has a unit size of 500 bytes, has at least 1000 units per
       block and uses a compact header.
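
       A query string of this shape can be assembled programmatically, for
       instance when generating a configuration file.  The helper below is a
       hypothetical sketch, not part of libpmemobj.  The field order
       (unit_size, alignment, units_per_block, header) is inferred from the
       example string above, and the accepted header strings are the three
       string values listed for header_type.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Returns nonzero if s is one of the header string values listed above. */
int
is_known_header(const char *s)
{
	return s != NULL && (strcmp(s, "legacy") == 0 ||
	    strcmp(s, "compact") == 0 || strcmp(s, "none") == 0);
}

/* Writes "heap.alloc_class.<id>.desc=<unit>,<align>,<units>,<header>"
 * into buf.  Returns 0 on success, -1 if the id or header is invalid
 * or if buf is too small. */
int
format_alloc_class_query(char *buf, size_t len, unsigned class_id,
	size_t unit_size, size_t alignment, unsigned units_per_block,
	const char *header)
{
	if (class_id > 254 || !is_known_header(header))
		return -1;

	int n = snprintf(buf, len, "heap.alloc_class.%u.desc=%zu,%zu,%u,%s",
	    class_id, unit_size, alignment, units_per_block, header);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

       Calling the helper with class id 128, unit size 500, alignment 0, 1000
       units per block and the compact header reproduces the example string
       shown above.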

       For reading, the function returns 0 if successful.  If the allocation
       class does not exist, it sets errno to ENOENT and returns -1.

       For writing, the function returns 0 if the allocation class has been
       successfully created, -1 otherwise.

       heap.alloc_class.new.desc | -w | - | - | struct pobj_alloc_class_desc |
       - | integer, integer, string

       Same as heap.alloc_class.[class_id].desc, but instead of requiring the
       user to provide the class_id, it automatically creates the allocation
       class with the first available identifier.

       This should be used when it is impossible to guarantee unique
       allocation class naming in the application (e.g., when writing a
       library that uses libpmemobj).

       The identifier assigned to the new class is stored in the class_id
       field of the struct pobj_alloc_class_desc.

       This function returns 0 if the allocation class has been successfully
       created, -1 otherwise.

       stats.enabled | rw | - | int | int | - | boolean

       Enables or disables runtime collection of statistics.  Statistics are
       not recalculated after enabling; any operations that occur between
       disabling and re-enabling will not be reflected in subsequent values.

       Statistics are disabled by default.  Enabling them may have a
       non-trivial performance impact.

       Always returns 0.

       stats.heap.curr_allocated | r- | - | int | - | - | -

       Returns the number of bytes currently allocated in the heap.  If
       statistics were disabled at any time in the lifetime of the heap, this
       value may be inaccurate.

       heap.size.granularity | rw- | - | uint64_t | uint64_t | - | long long

       Reads or modifies the granularity with which the heap grows when it
       runs out of memory (OOM).  Valid only if the poolset has been defined
       with directories.

       A granularity of 0 specifies that the pool will not grow automatically.

       This function returns 0 if the granularity value is 0, or is larger
       than PMEMOBJ_MIN_PART, -1 otherwise.

       heap.size.extend | --x | - | - | - | uint64_t | -

       Extends the heap by the given size.  The size must be larger than
       PMEMOBJ_MIN_PART.

       This function returns 0 if successful, -1 otherwise.

CTL EXTERNAL CONFIGURATION

       In addition to a direct function call, each write entry point can also
       be set using two alternative methods.

       The first method is to load a configuration directly from the
       PMEMOBJ_CONF environment variable.  A properly formatted ctl config
       string is a single-line sequence of queries separated by ';':

              query0;query1;...;queryN

       A single query is constructed from the name of the ctl write entry
       point and the argument, separated by '=':

              entry_point=entry_point_argument

       The entry point argument type is defined by the entry point itself, but
       there are three predefined primitives:

              *) integer: represented by a sequence of [0-9] characters that form
                  a single number.
              *) boolean: represented by a single character: y/n/Y/N/0/1, each
                  corresponds to true or false. If the argument contains any
                  trailing characters, they are ignored.
              *) string: a simple sequence of characters.
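
       The integer and boolean primitives described above are simple enough
       to sketch as parsers.  The functions below are hypothetical helpers
       written for illustration and are not the library's parser.  Mapping
       y/Y/1 to true and n/N/0 to false is the natural reading of the list
       above but is an assumption, and integer overflow is not handled.

```c
#include <stddef.h>

/* Parses a CTL boolean by looking only at the first character, so any
 * trailing characters are ignored.
 * Returns 1 for y/Y/1, 0 for n/N/0, and -1 for anything else. */
int
parse_ctl_boolean(const char *arg)
{
	if (arg == NULL)
		return -1;

	switch (arg[0]) {
	case 'y': case 'Y': case '1':
		return 1;
	case 'n': case 'N': case '0':
		return 0;
	default:
		return -1;
	}
}

/* Parses a CTL integer consisting only of [0-9] characters.
 * Stores the value in *out and returns 0, or returns -1 on bad input. */
int
parse_ctl_integer(const char *arg, long long *out)
{
	if (arg == NULL || out == NULL || *arg == '\0')
		return -1;

	long long v = 0;
	for (const char *p = arg; *p != '\0'; p++) {
		if (*p < '0' || *p > '9')
			return -1;
		v = v * 10 + (*p - '0');
	}
	*out = v;
	return 0;
}
```

       Under these assumptions "yes" and "1" both parse as true, while an
       integer argument such as "12a" is rejected.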

       There are also complex argument types that are formed from the
       primitives separated by a ',':

              first_arg,second_arg

       In summary, a full configuration sequence looks like this:

              (first_entry_point)=(arguments, ...);...;(last_entry_point)=(arguments, ...);

       As an example, to set both prefault at_open and at_create variables:

              PMEMOBJ_CONF="prefault.at_open=1;prefault.at_create=1"
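
       The structure of such a string (queries separated by ';', each split
       into a name and an argument at '=') can be illustrated with a small
       splitter.  This is a sketch with hypothetical names, not the parser
       used by libpmemobj; it splits at the first '=' of each query, modifies
       the buffer in place and skips empty queries such as the one after a
       trailing ';'.

```c
#include <stddef.h>
#include <string.h>

/* Callback invoked once per query with NUL-terminated name and argument. */
typedef void (*ctl_query_cb)(const char *name, const char *arg, void *ctx);

/* Splits "name=arg;name=arg;..." in place, calling cb for each query.
 * Returns the number of queries found, or -1 if a non-empty query has
 * no '='.  The contents of conf are modified. */
int
split_ctl_config(char *conf, ctl_query_cb cb, void *ctx)
{
	int count = 0;
	char *query = conf;

	while (query != NULL && *query != '\0') {
		char *end = strchr(query, ';');
		if (end != NULL)
			*end = '\0'; /* terminate the current query */

		if (*query != '\0') {
			char *eq = strchr(query, '=');
			if (eq == NULL)
				return -1;
			*eq = '\0'; /* separate name from argument */
			if (cb != NULL)
				cb(query, eq + 1, ctx);
			count++;
		}
		query = (end != NULL) ? end + 1 : NULL;
	}
	return count;
}
```

       Applied to the example value above, the splitter reports two queries:
       prefault.at_open with argument 1 and prefault.at_create with
       argument 1.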

       The second method of loading an external configuration is to set the
       PMEMOBJ_CONF_FILE environment variable to point to a file that contains
       a sequence of ctl queries.  The parsing rules are all the same, but the
       file can also contain whitespace and comments.

       To create a comment, simply use '#' anywhere in a line and everything
       afterwards, until a new line '\n', will be ignored.

       An example configuration file:

              #########################
              # My pmemobj configuration
              #########################
              #
              # Global settings:
              prefault. # modify the behavior of pre-faulting
                  at_open = 1; # prefault when the pool is opened

              prefault.
                  at_create = 0; # but don't prefault when it's created

              # Per-pool settings:
              # ...
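
       One way to picture how such a file reduces to a plain query string is
       to drop every comment and every whitespace character.  The function
       below is a hypothetical model of that step, not the library's
       implementation, and it assumes that whitespace is never significant
       inside a query.

```c
#include <ctype.h>
#include <stddef.h>

/* Copies src to dst while dropping '#' comments (up to the end of the
 * line) and all whitespace.  dst must have room for strlen(src) + 1
 * bytes.  Returns the number of characters written, excluding the NUL. */
size_t
normalize_ctl_file(const char *src, char *dst)
{
	size_t n = 0;
	int in_comment = 0;

	for (; *src != '\0'; src++) {
		if (in_comment) {
			if (*src == '\n')
				in_comment = 0; /* comment ends at newline */
			continue;
		}
		if (*src == '#') {
			in_comment = 1;
			continue;
		}
		if (isspace((unsigned char)*src))
			continue;
		dst[n++] = *src;
	}
	dst[n] = '\0';
	return n;
}
```

       Under this model the example file above collapses to
       prefault.at_open=1;prefault.at_create=0; which has the same form as a
       PMEMOBJ_CONF value.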

SEE ALSO

       libpmemobj(7) and <http://pmem.io>

PMDK - pmemobj API version 2.3    2018-03-13                PMEMOBJ_CTL_GET(3)