PMEMOBJ_TX_BEGIN(3)        PMDK Programmer's Manual        PMEMOBJ_TX_BEGIN(3)

NAME
pmemobj_tx_stage(),

pmemobj_tx_begin(), pmemobj_tx_lock(), pmemobj_tx_abort(),
pmemobj_tx_commit(), pmemobj_tx_end(), pmemobj_tx_errno(),
pmemobj_tx_process(),

TX_BEGIN_PARAM(), TX_BEGIN_CB(), TX_BEGIN(), TX_ONABORT, TX_ONCOMMIT,
TX_FINALLY, TX_END -- transactional object manipulation

SYNOPSIS
    #include <libpmemobj.h>

    enum pobj_tx_stage pmemobj_tx_stage(void);

    int pmemobj_tx_begin(PMEMobjpool *pop, jmp_buf *env, enum pobj_tx_param, ...);
    int pmemobj_tx_lock(enum tx_lock lock_type, void *lockp);
    void pmemobj_tx_abort(int errnum);
    void pmemobj_tx_commit(void);
    int pmemobj_tx_end(void);
    int pmemobj_tx_errno(void);
    void pmemobj_tx_process(void);

    TX_BEGIN_PARAM(PMEMobjpool *pop, ...)
    TX_BEGIN_CB(PMEMobjpool *pop, cb, arg, ...)
    TX_BEGIN(PMEMobjpool *pop)
    TX_ONABORT
    TX_ONCOMMIT
    TX_FINALLY
    TX_END

DESCRIPTION
The non-transactional functions and macros described in
pmemobj_alloc(3), pmemobj_list_insert(3) and POBJ_LIST_HEAD(3) only
guarantee the atomicity of a single operation on an object. For more
complex changes involving multiple operations on an object, or
allocation and modification of multiple objects, data consistency and
fail-safety can be provided only by using atomic transactions.

A transaction is defined as a series of operations on persistent memory
objects that either all occur or none occur. In particular, if the
execution of a transaction is interrupted by a power failure or a
system crash, it is guaranteed that after system restart, all the
changes made as part of the uncompleted transaction will be rolled
back, restoring the consistent state of the memory pool from the moment
when the transaction was started.

Note that transactions do not provide atomicity with respect to other
threads. All the modifications performed within a transaction are
immediately visible to other threads. Therefore it is the
responsibility of the application to implement a proper thread
synchronization mechanism.

Each thread may have only one transaction open at a time, but that
transaction may be nested. Nested transactions are flattened.
Committing the nested transaction does not commit the outer
transaction; however, errors in the nested transaction are propagated
up to the outermost level, resulting in the interruption of the entire
transaction.

Each transaction is visible only to the thread that started it. No
other thread can add operations to, commit, or abort a transaction
initiated by another thread. Multiple threads may have transactions
open on a given memory pool at the same time.

Please see the CAVEATS section below for known limitations of the
transactional API.

The pmemobj_tx_stage() function returns the current transaction stage
for a thread. Stages are changed only by the pmemobj_tx_*() functions.
Transaction stages are defined as follows:

· TX_STAGE_NONE - no open transaction in this thread

· TX_STAGE_WORK - transaction in progress

· TX_STAGE_ONCOMMIT - successfully committed

· TX_STAGE_ONABORT - starting the transaction failed or transaction
  aborted

· TX_STAGE_FINALLY - ready for clean up

The pmemobj_tx_begin() function starts a new transaction in the current
thread. If called within an open transaction, it starts a nested
transaction. The caller may use the env argument to provide a pointer
to a calling environment to be restored in case of transaction abort.
This information must be provided by the caller using the setjmp(3)
macro.

A new transaction may be started only if the current stage is
TX_STAGE_NONE or TX_STAGE_WORK. If successful, the transaction stage
changes to TX_STAGE_WORK. Otherwise, the stage is changed to
TX_STAGE_ONABORT.

Optionally, a list of parameters for the transaction may be provided.
Each parameter consists of a type followed by a type-specific number of
values. Currently there are 4 types:

· TX_PARAM_NONE, used as a termination marker. No following value.

· TX_PARAM_MUTEX, followed by one value, a pmem-resident PMEMmutex

· TX_PARAM_RWLOCK, followed by one value, a pmem-resident PMEMrwlock

· TX_PARAM_CB, followed by two values: a callback function of type
  pmemobj_tx_callback, and a void pointer

Using TX_PARAM_MUTEX or TX_PARAM_RWLOCK causes the specified lock to be
acquired at the beginning of the transaction. TX_PARAM_RWLOCK acquires
the lock for writing. It is guaranteed that pmemobj_tx_begin() will
acquire all locks prior to successful completion, and they will be held
by the current thread until the outermost transaction is finished.
Locks are taken in order from left to right. To avoid deadlocks, the
user is responsible for proper lock ordering.

TX_PARAM_CB registers the specified callback function to be executed at
each transaction stage. For TX_STAGE_WORK, the callback is executed
prior to commit. For all other stages, the callback is executed as the
first operation after a stage change. It will also be called after
each transaction; in this case the stage parameter will be set to
TX_STAGE_NONE. pmemobj_tx_callback must be compatible with:

    void func(PMEMobjpool *pop, enum pobj_tx_stage stage, void *arg)

pop is the pool identifier used in pmemobj_tx_begin(), stage is the
current transaction stage and arg is the second parameter of
TX_PARAM_CB. Without considering transaction nesting, this mechanism
can be considered an alternative method for executing code between
stages (instead of TX_ONCOMMIT, TX_ONABORT, etc). However, there are 2
significant differences when nested transactions are used:

· The registered function is executed only in the outermost
  transaction, even if registered in an inner transaction.

· There can be only one callback in the entire transaction, that is,
  the callback cannot be changed in an inner transaction.

Note that TX_PARAM_CB does not replace the TX_ONCOMMIT, TX_ONABORT,
etc. macros. They can be used together: the callback will be executed
before a TX_ONCOMMIT, TX_ONABORT, etc. section.

TX_PARAM_CB can be used when the code dealing with transaction stage
changes is shared between multiple users or when it must be executed
only in the outer transaction. For example, it can be very useful when
the application must synchronize persistent and transient state.

The pmemobj_tx_lock() function acquires the lock lockp of type
lock_type and adds it to the current transaction. lock_type may be
TX_LOCK_MUTEX or TX_LOCK_RWLOCK; lockp must be of type PMEMmutex or
PMEMrwlock, respectively. If lock_type is TX_LOCK_RWLOCK the lock is
acquired for writing. If the lock is not successfully acquired, the
stage is changed to TX_STAGE_ONABORT. This function must be called
during TX_STAGE_WORK.

pmemobj_tx_abort() aborts the current transaction and causes a
transition to TX_STAGE_ONABORT. If errnum is equal to 0, the
transaction error code is set to ECANCELED; otherwise, it is set to
errnum. This function must be called during TX_STAGE_WORK.

The pmemobj_tx_commit() function commits the current open transaction
and causes a transition to TX_STAGE_ONCOMMIT. If called in the context
of the outermost transaction, all the changes may be considered as
durably written upon successful completion. This function must be
called during TX_STAGE_WORK.

The pmemobj_tx_end() function performs a cleanup of the current
transaction. If called in the context of the outermost transaction, it
releases all the locks acquired by pmemobj_tx_begin() for outer and
nested transactions. If called in the context of a nested transaction,
it returns to the context of the outer transaction in TX_STAGE_WORK,
without releasing any locks. The pmemobj_tx_end() function can be
called during TX_STAGE_NONE if transitioned to this stage using
pmemobj_tx_process(). If not already in TX_STAGE_NONE, it causes the
transition to TX_STAGE_NONE. pmemobj_tx_end() must always be called
for each pmemobj_tx_begin(), even if starting the transaction failed.
This function must not be called during TX_STAGE_WORK.

The pmemobj_tx_errno() function returns the error code of the last
transaction.

The pmemobj_tx_process() function performs the actions associated with
the current stage of the transaction, and makes the transition to the
next stage. It must be called in a transaction. The current stage
must always be obtained by a call to pmemobj_tx_stage().
pmemobj_tx_process() performs the following transitions in the
transaction stage flow:

· TX_STAGE_WORK -> TX_STAGE_ONCOMMIT

· TX_STAGE_ONABORT -> TX_STAGE_FINALLY

· TX_STAGE_ONCOMMIT -> TX_STAGE_FINALLY

· TX_STAGE_FINALLY -> TX_STAGE_NONE

· TX_STAGE_NONE -> TX_STAGE_NONE

pmemobj_tx_process() must not be called after calling pmemobj_tx_end()
for the outermost transaction.

In addition to the above API, libpmemobj(7) offers a more intuitive
method of building transactions using the set of macros described
below. When using these macros, the complete transaction flow looks
like this:

    TX_BEGIN(Pop) {
        /* the actual transaction code goes here... */
    } TX_ONCOMMIT {
        /*
         * optional - executed only if the above block
         * successfully completes
         */
    } TX_ONABORT {
        /*
         * optional - executed only if starting the transaction fails,
         * or if transaction is aborted by an error or a call to
         * pmemobj_tx_abort()
         */
    } TX_FINALLY {
        /*
         * optional - if exists, it is executed after
         * TX_ONCOMMIT or TX_ONABORT block
         */
    } TX_END /* mandatory */

    TX_BEGIN_PARAM(PMEMobjpool *pop, ...)
    TX_BEGIN_CB(PMEMobjpool *pop, cb, arg, ...)
    TX_BEGIN(PMEMobjpool *pop)

The TX_BEGIN_PARAM(), TX_BEGIN_CB() and TX_BEGIN() macros start a new
transaction in the same way as pmemobj_tx_begin(), except that instead
of the environment buffer provided by a caller, they set up the local
jmp_buf buffer and use it to catch the transaction abort. The
TX_BEGIN() macro starts a transaction without any options.
TX_BEGIN_PARAM may be used when there is a need to acquire locks prior
to starting a transaction (such as for a multi-threaded program) or set
up a transaction stage callback. TX_BEGIN_CB is just a wrapper around
TX_BEGIN_PARAM that validates the callback signature. (For
compatibility there is also a TX_BEGIN_LOCK macro, which is an alias
for TX_BEGIN_PARAM). Each of these macros must be followed by a block
of code with all the operations that are to be performed atomically.

The TX_ONABORT macro starts a block of code that will be executed only
if starting the transaction fails due to an error in
pmemobj_tx_begin(), or if the transaction is aborted. This block is
optional, but in practice it should not be omitted. If it is desirable
to crash the application when a transaction aborts and there is no
TX_ONABORT section, the application can define the
POBJ_TX_CRASH_ON_NO_ONABORT macro before inclusion of <libpmemobj.h>.
This provides a default TX_ONABORT section which just calls abort(3).

The TX_ONCOMMIT macro starts a block of code that will be executed only
if the transaction is successfully committed, which means that the
execution of code in the TX_BEGIN() block has not been interrupted by
an error or by a call to pmemobj_tx_abort(). This block is optional.

The TX_FINALLY macro starts a block of code that will be executed
regardless of whether the transaction is committed or aborted. This
block is optional.

The TX_END macro cleans up and closes the transaction started by the
TX_BEGIN() / TX_BEGIN_PARAM() / TX_BEGIN_CB() macros. It is mandatory
to terminate each transaction with this macro. If the transaction was
aborted, errno is set appropriately.

RETURN VALUE
The pmemobj_tx_stage() function returns the current transaction stage
for the calling thread.

On success, pmemobj_tx_begin() returns 0. Otherwise, an error number
is returned.

The pmemobj_tx_lock() function returns zero if lockp is successfully
added to the transaction. Otherwise, an error number is returned.

The pmemobj_tx_abort() and pmemobj_tx_commit() functions return no
value.

The pmemobj_tx_end() function returns 0 if the transaction was
successful. Otherwise it returns the error code set by
pmemobj_tx_abort(). Note that pmemobj_tx_abort() can be called
internally by the library.

The pmemobj_tx_errno() function returns the error code of the last
transaction.

The pmemobj_tx_process() function returns no value.

CAVEATS
Transaction flow control is governed by the setjmp(3) macro and the
longjmp(3) function, and they are used in both the macro and function
flavors of the API. The transaction will longjmp on transaction abort.
This has one major drawback, which is described in the ISO C standard
subsection 7.13.2.1. It says that the values of objects of automatic
storage duration that are local to the function containing the setjmp
invocation, that do not have volatile-qualified type, and that have
been changed between the setjmp invocation and the longjmp call are
indeterminate.

The following example illustrates the issue described above.

    int *bad_example_1 = (int *)0xBAADF00D;
    int *bad_example_2 = (int *)0xBAADF00D;
    int *bad_example_3 = (int *)0xBAADF00D;
    int * volatile good_example = (int *)0xBAADF00D;

    TX_BEGIN(pop) {
        bad_example_1 = malloc(sizeof(int));
        bad_example_2 = malloc(sizeof(int));
        bad_example_3 = malloc(sizeof(int));
        good_example = malloc(sizeof(int));

        /* manual or library abort called here */
        pmemobj_tx_abort(EINVAL);
    } TX_ONCOMMIT {
        /*
         * This section is longjmp-safe
         */
    } TX_ONABORT {
        /*
         * This section is not longjmp-safe
         */
        free(good_example); /* OK */
        free(bad_example_1); /* undefined behavior */
    } TX_FINALLY {
        /*
         * This section is not longjmp-safe on transaction abort only
         */
        free(bad_example_2); /* undefined behavior */
    } TX_END

    free(bad_example_3); /* undefined behavior */

Objects which are not volatile-qualified, are of automatic storage
duration and have been changed between the invocations of setjmp(3) and
longjmp(3) (that also means within the work section of the transaction
after TX_BEGIN()) should not be used after a transaction abort, or
should be used with utmost care. This also includes code after the
TX_END macro.
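
The rule can be demonstrated without libpmemobj(7) at all. The sketch
below uses only <setjmp.h>; the call to longjmp(3) stands in for a
transaction abort, and the names fail() and
volatile_survives_longjmp() are hypothetical.

```c
#include <setjmp.h>

static jmp_buf env;

/* Stands in for a transaction abort: jump back to the setjmp point. */
static void
fail(void)
{
	longjmp(env, 1);
}

static int
volatile_survives_longjmp(void)
{
	/*
	 * Volatile-qualified, so the value stored before the jump is
	 * well defined afterwards. Without the qualifier, reading
	 * progress after the longjmp would yield an indeterminate value.
	 */
	volatile int progress = 0;

	if (setjmp(env) == 0) {
		progress = 1; /* changed between setjmp and longjmp */
		fail();
		progress = 2; /* never reached */
	}

	return progress;
}
```

Here volatile_survives_longjmp() returns 1. The same qualifier is what
makes good_example safe to free in the TX_ONABORT block of the example
above.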

libpmemobj(7) is not cancellation-safe. The pool will never be
corrupted because of a canceled thread, but other threads may stall
waiting on locks taken by that thread. If the application wants to use
pthread_cancel(3), it must disable cancellation before calling any
libpmemobj(7) APIs (see pthread_setcancelstate(3) with
PTHREAD_CANCEL_DISABLE), and re-enable it afterwards. Deferring
cancellation (pthread_setcanceltype(3) with PTHREAD_CANCEL_DEFERRED) is
not safe enough, because libpmemobj(7) internally may call functions
that are specified as cancellation points in POSIX.
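
One way to follow this advice is to wrap every piece of code that
calls into libpmemobj(7) in a helper that disables cancellation and
then restores the previous state. The helper below is a sketch using
only POSIX calls; pmem_work_fn and run_with_cancel_disabled() are
hypothetical names, and fn is assumed to contain the libpmemobj(7)
calls.

```c
#include <pthread.h>

/* Hypothetical signature for code that calls into libpmemobj. */
typedef void (*pmem_work_fn)(void *arg);

/*
 * Run fn(arg) with thread cancellation disabled, then restore the
 * cancelability state that was in effect before the call.
 * Returns 0 on success or an error number from pthread_setcancelstate().
 */
static int
run_with_cancel_disabled(pmem_work_fn fn, void *arg)
{
	int old_state;
	int ignored;
	int err;

	err = pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_state);
	if (err != 0)
		return err;

	fn(arg); /* no cancellation request is acted upon in here */

	return pthread_setcancelstate(old_state, &ignored);
}
```

Restoring the saved state, rather than unconditionally enabling
cancellation, keeps the helper safe to use from code that had already
disabled cancellation for its own reasons.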

libpmemobj(7) relies on the library destructor being called from the
main thread. For this reason, all functions that might trigger
destruction (e.g. dlclose(3)) should be called in the main thread.
Otherwise some of the resources associated with that thread might not
be cleaned up properly.

SEE ALSO
dlclose(3), longjmp(3), pmemobj_tx_add_range(3), pmemobj_tx_alloc(3),
pthread_setcancelstate(3), pthread_setcanceltype(3), setjmp(3),
libpmemobj(7) and <http://pmem.io>

PMDK - pmemobj API version 2.3        2018-03-13        PMEMOBJ_TX_BEGIN(3)