LIBUNWIND-DYNAMIC(3) Programming Library LIBUNWIND-DYNAMIC(3)
2
3
4
6 libunwind-dynamic -- libunwind-support for runtime-generated code
7
9 For libunwind to do its job, it needs to be able to reconstruct the
10 frame state of each frame in a call-chain. The frame state describes
11 the subset of the machine-state that consists of the frame registers
12 (typically the instruction-pointer and the stack-pointer) and all
13 callee-saved registers (preserved registers). The frame state
14 describes each register either by providing its current value (for
15 frame registers) or by providing the location at which the current
16 value is stored (callee-saved registers).
17
18 For statically generated code, the compiler normally takes care of
19 emitting unwind-info which provides the minimum amount of information
20 needed to reconstruct the frame-state for each instruction in a proce‐
21 dure. For dynamically generated code, the runtime code generator must
22 use the dynamic unwind-info interface provided by libunwind to supply
23 the equivalent information. This manual page describes the format of
24 this information in detail.
25
26 For the purpose of this discussion, a procedure is defined to be an
27 arbitrary piece of contiguous code. Normally, each procedure directly
28 corresponds to a function in the source-language but this is not
29 strictly required. For example, a runtime code-generator could trans‐
30 late a given function into two separate (discontiguous) procedures: one
31 for frequently-executed (hot) code and one for rarely-executed (cold)
32 code. Similarly, simple source-language functions (usually leaf func‐
33 tions) may get translated into code for which the default unwind-con‐
34 ventions apply and for such code, it is not strictly necessary to reg‐
35 ister dynamic unwind-info.
36
37 A procedure logically consists of a sequence of regions. Regions are
38 nested in the sense that the frame state at the end of one region is,
39 by default, assumed to be the frame state for the next region. Each
40 region is thought of as being divided into a prologue, a body, and an
41 epilogue. Each of them can be empty. If non-empty, the prologue sets
42 up the frame state for the body. For example, the prologue may need to
43 allocate some space on the stack and save certain callee-saved regis‐
44 ters. The body performs the actual work of the procedure but does not
45 change the frame state in any way. If non-empty, the epilogue restores
46 the previous frame state and as such it undoes or cancels the effect of
47 the prologue. In fact, a single epilogue may undo the effect of the
48 prologues of several (nested) regions.
49
50 We should point out that even though the prologue, body, and epilogue
51 are logically separate entities, optimizing code-generators will gener‐
52 ally interleave instructions from all three entities. For this reason,
53 the dynamic unwind-info interface of libunwind makes no distinction
54 whatsoever between prologue and body. Similarly, the exact set of
55 instructions that make up an epilogue is also irrelevant. The only
56 point in the epilogue that needs to be described explicitly by the
57 dynamic unwind-info is the point at which the stack-pointer gets
58 restored. The reason this point needs to be described is that once the
59 stack-pointer is restored, all values saved in the deallocated portion
60 of the stack frame become invalid and hence libunwind needs to know
about it. The portion of the frame state not saved on the stack is
assumed to remain valid through the end of the region. For this reason,
63 there is usually no need to describe instructions which restore the
64 contents of callee-saved registers.
65
66 Within a region, each instruction that affects the frame state in some
67 fashion needs to be described with an operation descriptor. For this
68 purpose, each instruction in the region is assigned a unique index.
Exactly how this index is derived depends on the architecture. For
example, on RISC and EPIC-style architectures, instructions have a fixed
size, so it is possible to simply number the instructions. In contrast,
most CISC architectures use variable-length instruction encodings, so it
is usually necessary to use a byte-offset as the index. Given the
instruction index, the operation descriptor specifies the effect of the
instruction in an abstract manner. For example, it might express that the
instruction stores callee-saved register r1 at offset 16 in the stack frame.
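As a rough illustration of the two indexing schemes, the following sketch
derives an instruction index from an address. The helper names are
hypothetical and not part of libunwind; the instruction size is passed in
by the caller because it depends on the architecture.

```c
#include <stdint.h>

/* Hypothetical helper: index of the instruction at ip on an architecture
   whose instructions all occupy insn_size bytes.  */
static int32_t
insn_index_fixed (uintptr_t region_start, uintptr_t ip, uintptr_t insn_size)
{
  return (int32_t) ((ip - region_start) / insn_size);
}

/* Hypothetical helper: with variable-length encodings, the byte offset
   from the start of the region serves as the index.  */
static int32_t
insn_index_bytes (uintptr_t region_start, uintptr_t ip)
{
  return (int32_t) (ip - region_start);
}
```

With 4-byte instructions, an address 16 bytes into the region maps to
index 4 under the first scheme and to index 16 under the second.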
77
79 A runtime code-generator registers the dynamic unwind-info of a proce‐
80 dure by setting up a structure of type unw_dyn_info_t and calling
81 _U_dyn_register(), passing the address of the structure as the sole
82 argument. The members of the unw_dyn_info_t structure are described
83 below:
84
85 void *next
86 Private to libunwind. Must not be used by the application.
87
88 void *prev
89 Private to libunwind. Must not be used by the application.
90
91 unw_word_t start_ip
The start-address of the instructions of the procedure (remember:
procedures are defined to be contiguous pieces of code, so a
single code-range is sufficient).
95
96 unw_word_t end_ip
97 The end-address of the instructions of the procedure
98 (non-inclusive, that is, end_ip-start_ip is the size of the pro‐
99 cedure in bytes).
100
101 unw_word_t gp
The global-pointer value in use for this procedure. The exact
meaning of the global-pointer is architecture-specific and on
some architectures, it is not used at all.
105
106 int32_t format
107 The format of the unwind-info. This member can be one of
108 UNW_INFO_FORMAT_DYNAMIC, UNW_INFO_FORMAT_TABLE, or UNW_INFO_FOR‐
109 MAT_REMOTE_TABLE.
110
111 union u
112 This union contains one sub-member structure for every possible
113 unwind-info format:
114
115 unw_dyn_proc_info_t pi
116 This member is used for format UNW_INFO_FORMAT_DYNAMIC.
117
118 unw_dyn_table_info_t ti
119 This member is used for format UNW_INFO_FORMAT_TABLE.
120
121 unw_dyn_remote_table_info_t rti
122 This member is used for format UNW_INFO_FOR‐
123 MAT_REMOTE_TABLE.
124
125 The format of these sub-members is described in detail below.
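The members above might be filled in along the following lines. To keep
the sketch self-contained, it uses simplified local stand-ins (word_t,
struct dyn_info, FORMAT_DYNAMIC) rather than the real libunwind
declarations, and the union is reduced to a single placeholder pointer;
describe_code() is a hypothetical helper.

```c
#include <stdint.h>
#include <string.h>

typedef uintptr_t word_t;       /* stand-in for unw_word_t */

/* Stand-ins for the UNW_INFO_FORMAT_* constants; values are arbitrary.  */
enum { FORMAT_DYNAMIC, FORMAT_TABLE, FORMAT_REMOTE_TABLE };

/* Simplified stand-in for unw_dyn_info_t.  */
struct dyn_info
{
  void *next;                   /* private to the library */
  void *prev;                   /* private to the library */
  word_t start_ip;
  word_t end_ip;                /* non-inclusive */
  word_t gp;
  int32_t format;
  void *regions;                /* placeholder for the union member */
};

/* Describe size bytes of generated code starting at code.  */
static void
describe_code (struct dyn_info *di, const void *code, size_t size)
{
  memset (di, 0, sizeof (*di)); /* leaves next, prev and gp zeroed */
  di->start_ip = (word_t) code;
  di->end_ip = di->start_ip + size;
  di->format = FORMAT_DYNAMIC;
}
```

With the real header, the address of the filled-in unw_dyn_info_t would
then be passed to _U_dyn_register().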
126
127 PROC-INFO FORMAT
128 This is the preferred dynamic unwind-info format and it is generally
129 the one used by full-blown runtime code-generators. In this format, the
130 details of a procedure are described by a structure of type
131 unw_dyn_proc_info_t. This structure contains the following members:
132
133 unw_word_t name_ptr
134 The address of a (human-readable) name of the procedure or 0 if
no such name is available. If non-zero, the string stored at
136 this address must be ASCII NUL terminated. For source languages
137 that use name-mangling (such as C++ or Java) the string stored
138 at this address should be the demangled version of the name.
139
140 unw_word_t handler
The address of the personality-routine for this procedure.
Personality-routines are used in conjunction with exception handling.
See the C++ ABI draft (http://www.codesourcery.com/cxx-abi/) for an
overview and a description of the personality routine. If the
procedure has no personality routine, handler must be set to 0.
147
148 uint32_t flags
149 A bitmask of flags. At the moment, no flags have been defined
150 and this member must be set to 0.
151
152 unw_dyn_region_info_t *regions
153 A NULL-terminated linked list of region-descriptors. See sec‐
154 tion ``Region descriptors'' below for more details.
155
156 TABLE-INFO FORMAT
157 This format is generally used when the dynamically generated code was
158 derived from static code and the unwind-info for the dynamic and the
159 static versions is identical. For example, this format can be useful
160 when loading statically-generated code into an address-space in a
161 non-standard fashion (i.e., through some means other than dlopen()).
In this format, the details of a group of procedures are described by a
structure of type unw_dyn_table_info_t. This structure contains the
following members:
165
166 unw_word_t name_ptr
167 The address of a (human-readable) name of the procedure or 0 if
no such name is available. If non-zero, the string stored at
169 this address must be ASCII NUL terminated. For source languages
170 that use name-mangling (such as C++ or Java) the string stored
171 at this address should be the demangled version of the name.
172
173 unw_word_t segbase
174 The segment-base value that needs to be added to the seg‐
175 ment-relative values stored in the unwind-info. The exact mean‐
176 ing of this value is architecture-specific.
177
178 unw_word_t table_len
179 The length of the unwind-info (table_data) counted in units of
180 words (unw_word_t).
181
unw_word_t *table_data
183 A pointer to the actual data encoding the unwind-info. The
184 exact format is architecture-specific (see architecture-specific
185 sections below).
186
187 REMOTE TABLE-INFO FORMAT
188 The remote table-info format has the same basic purpose as the regular
189 table-info format. The only difference is that when libunwind uses the
190 unwind-info, it will keep the table data in the target address-space
191 (which may be remote). Consequently, the type of the table_data member
192 is unw_word_t rather than a pointer. This implies that libunwind will
193 have to access the table-data via the address-space's access_mem()
194 call-back, rather than through a direct memory reference.
195
196 From the point of view of a runtime-code generator, the remote ta‐
197 ble-info format offers no advantage and it is expected that such gener‐
198 ators will describe their procedures either with the proc-info format
199 or the normal table-info format. The main reason that the remote ta‐
200 ble-info format exists is to enable the address-space-specific
201 find_proc_info() callback (see unw_create_addr_space(3)) to return
202 unwind tables whose data remains in remote memory. This can speed up
203 unwinding (e.g., for a debugger) because it reduces the amount of data
204 that needs to be loaded from remote memory.
205
207 A region descriptor is a variable length structure that describes how
208 each instruction in the region affects the frame state. Of course, most
instructions in a region usually do not change the frame state and for
210 those, nothing needs to be recorded in the region descriptor. A region
211 descriptor is a structure of type unw_dyn_region_info_t and has the
212 following members:
213
214 unw_dyn_region_info_t *next
215 A pointer to the next region. If this is the last region, next
216 is NULL.
217
218 int32_t insn_count
219 The length of the region in instructions. Each instruction is
220 assumed to have a fixed size (see architecture-specific sections
221 for details). The value of insn_count may be negative in the
222 last region of a procedure (i.e., it may be negative only if
223 next is NULL). A negative value indicates that the region cov‐
224 ers the last N instructions of the procedure, where N is the
225 absolute value of insn_count.
226
227 uint32_t op_count
The (allocated) length of the op array.
229
230 unw_dyn_op_t op
231 An array of dynamic unwind directives. See Section ``Dynamic
232 unwind directives'' for a description of the directives.
233
234 A region descriptor with an insn_count of zero is an empty region and
235 such regions are perfectly legal. In fact, empty regions can be useful
236 to establish a particular frame state before the start of another
237 region.
238
239 A single region list can be shared across multiple procedures provided
240 those procedures share a common prologue and epilogue (their bodies may
241 differ, of course). Normally, such procedures consist of a canned pro‐
242 logue, the body, and a canned epilogue. This could be described by two
243 regions: one covering the prologue and one covering the epilogue.
244 Since the body length is variable, the latter region would need to
245 specify a negative value in insn_count such that libunwind knows that
246 the region covers the end of the procedure (up to the address specified
247 by end_ip).
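The effect of a negative insn_count can be sketched as follows. The
helper is hypothetical and only illustrates the arithmetic: a region with
a non-negative count starts where the preceding region ended, while a
region with a negative count is anchored to the end of the procedure.

```c
#include <stdint.h>

/* Hypothetical helper: index, relative to the start of the procedure, of
   the first instruction of a region.  prev_end is the index just past
   the preceding region and proc_insns is the total number of
   instructions in the procedure (derived from start_ip and end_ip).  */
static int32_t
region_start (int32_t prev_end, int32_t proc_insns, int32_t insn_count)
{
  if (insn_count < 0)
    return proc_insns + insn_count;     /* covers the last N instructions */
  return prev_end;
}
```

For a canned 3-instruction prologue and a canned 2-instruction epilogue
(insn_count of -2), the epilogue region starts at index 18 in a
20-instruction procedure and at index 48 in a 50-instruction procedure,
so both procedures can share the same region list.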
248
The region descriptor is a variable length structure to make it possible
to allocate all the necessary memory with a single memory-allocation
request. To facilitate the allocation of region descriptors,
libunwind provides a helper routine with the following synopsis:
253
254 size_t _U_dyn_region_size(int op_count);
255
256 This routine returns the number of bytes needed to hold a region
257 descriptor with space for op_count unwind directives. Note that the
258 length of the op array does not have to match exactly with the number
259 of directives in a region. Instead, it is sufficient if the op array
260 contains at least as many entries as there are directives, since the
261 end of the directives can always be indicated with the UNW_DYN_STOP
262 directive.
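The size computation behind such a helper can be sketched as follows. The
structures are local stand-ins modelled on the member lists in this
manual page, not the real libunwind declarations, and region_size() and
region_alloc() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uintptr_t word_t;       /* stand-in for unw_word_t */

struct dyn_op                   /* modelled on unw_dyn_op_t */
{
  int8_t tag;
  int8_t qp;
  int16_t reg;
  int32_t when;
  word_t val;
};

struct region_info              /* modelled on unw_dyn_region_info_t */
{
  struct region_info *next;
  int32_t insn_count;
  uint32_t op_count;
  struct dyn_op op[];           /* C99 flexible array member */
};

/* Bytes needed for a descriptor with room for op_count directives.  */
static size_t
region_size (int op_count)
{
  return offsetof (struct region_info, op)
         + (size_t) op_count * sizeof (struct dyn_op);
}

/* Allocate a zeroed descriptor with a single allocation request.  */
static struct region_info *
region_alloc (int32_t insn_count, int op_count)
{
  struct region_info *r = calloc (1, region_size (op_count));

  if (r)
    {
      r->insn_count = insn_count;
      r->op_count = (uint32_t) op_count;
    }
  return r;
}
```

Because the memory is zeroed and the stop tag has the value 0, every
entry of the freshly allocated op array already reads as a stop
directive until it is overwritten.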
263
265 A dynamic unwind directive describes how the frame state changes at a
266 particular point within a region. The description is in the form of a
267 structure of type unw_dyn_op_t. This structure has the following mem‐
268 bers:
269
270 int8_t tag
271 The operation tag. Must be one of the unw_dyn_operation_t val‐
272 ues described below.
273
274 int8_t qp
The qualifying predicate that controls whether or not this
directive is active. This is useful for predicated architectures
such as IA-64 or ARM, where the contents of another
(callee-saved) register determine whether or not an instruction
is executed (takes effect). If the directive is always active,
this member should be set to the manifest constant _U_QP_TRUE
(this constant is defined for all architectures, predicated or
not).
283
284 int16_t reg
285 The number of the register affected by the instruction.
286
287 int32_t when
288 The region-relative number of the instruction to which this
289 directive applies. For example, a value of 0 means that the
290 effect described by this directive has taken place once the
291 first instruction in the region has executed.
292
293 unw_word_t val
294 The value to be applied by the operation tag. The exact meaning
295 of this value varies by tag. See Section ``Operation tags''
296 below.
297
298 It is perfectly legitimate to specify multiple dynamic unwind direc‐
299 tives with the same when value, if a particular instruction has a com‐
300 plex effect on the frame state.
301
302 Empty regions by definition contain no actual instructions and as such
303 the directives are not tied to a particular instruction. By convention,
304 the when member should be set to 0, however.
305
306 There is no need for the dynamic unwind directives to appear in order
307 of increasing when values. If the directives happen to be sorted in
308 that order, it may result in slightly faster execution, but a runtime
309 code-generator should not go to extra lengths just to ensure that the
310 directives are sorted.
311
312 IMPLEMENTATION NOTE: should libunwind implementations for certain
313 architectures prefer the list of unwind directives to be sorted, it is
314 recommended that such implementations first check whether the list hap‐
315 pens to be sorted already and, if not, sort the directives explicitly
before the first use. With this approach, the overhead of explicit
sorting is only paid when there is a real benefit and if the runtime
code-generator happens to generate sorted lists naturally, the
performance penalty is limited to a simple O(N) check.
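The check-then-sort approach recommended in the note above might look
like the following sketch. It is not taken from any libunwind
implementation; struct dyn_op is a local stand-in for unw_dyn_op_t, the
function names are hypothetical, and n is assumed to count only the
directives that precede the stop directive.

```c
#include <stdint.h>
#include <stdlib.h>

struct dyn_op                   /* stand-in for unw_dyn_op_t */
{
  int8_t tag;
  int8_t qp;
  int16_t reg;
  int32_t when;
  uintptr_t val;
};

/* O(N) check: are the directives in order of increasing when?  */
static int
ops_sorted (const struct dyn_op *op, size_t n)
{
  for (size_t i = 1; i < n; ++i)
    if (op[i - 1].when > op[i].when)
      return 0;
  return 1;
}

/* qsort() comparison callback ordering directives by when.  */
static int
cmp_when (const void *a, const void *b)
{
  int32_t x = ((const struct dyn_op *) a)->when;
  int32_t y = ((const struct dyn_op *) b)->when;

  return (x > y) - (x < y);
}

/* Sort only if the cheap check shows that it is needed.  */
static void
ops_ensure_sorted (struct dyn_op *op, size_t n)
{
  if (!ops_sorted (op, n))
    qsort (op, n, sizeof (op[0]), cmp_when);
}
```

Note that qsort() is not guaranteed to be stable, so directives that
share a when value may change their relative order; an implementation
that cares about that order would need a stable sort instead.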
320
OPERATION TAGS
322 The possible operation tags are defined by enumeration type
323 unw_dyn_operation_t which defines the following values:
324
325 UNW_DYN_STOP
326 Marks the end of the dynamic unwind directive list. All remain‐
327 ing entries in the op array of the region-descriptor are
328 ignored. This tag is guaranteed to have a value of 0.
329
330 UNW_DYN_SAVE_REG
331 Marks an instruction which saves register reg to register val.
332
333 UNW_DYN_SPILL_FP_REL
334 Marks an instruction which spills register reg to a
335 frame-pointer-relative location. The frame-pointer-relative off‐
336 set is given by the value stored in member val. See the archi‐
337 tecture-specific sections for a description of the stack frame
338 layout.
339
340 UNW_DYN_SPILL_SP_REL
341 Marks an instruction which spills register reg to a
342 stack-pointer-relative location. The stack-pointer-relative off‐
343 set is given by the value stored in member val. See the archi‐
344 tecture-specific sections for a description of the stack frame
345 layout.
346
347 UNW_DYN_ADD
Marks an instruction which adds the constant value val to
register reg. To subtract a constant value, store the
two's-complement of the value in val. The set of registers that
can be specified for this tag is described in the
architecture-specific sections below.
353
UNW_DYN_POP_FRAMES

UNW_DYN_LABEL_STATE

UNW_DYN_COPY_STATE

UNW_DYN_ALIAS

unw_dyn_op_t

_U_dyn_op_save_reg();
_U_dyn_op_spill_fp_rel();
_U_dyn_op_spill_sp_rel();
_U_dyn_op_add();
_U_dyn_op_pop_frames();
_U_dyn_op_label_state();
_U_dyn_op_copy_state();
_U_dyn_op_alias();
_U_dyn_op_stop();
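As a sketch of how the stop tag, the when member, and the
two's-complement convention of the add tag fit together, the following
code sums the adjustments made to one register. All names are local
stand-ins rather than libunwind declarations; only the value 0 for the
stop tag is taken from this manual page, the value of OP_ADD is
arbitrary, and the qualifying predicate is ignored for brevity.

```c
#include <stdint.h>

typedef uint64_t word_t;                /* stand-in for unw_word_t */

enum { OP_STOP = 0, OP_ADD = 4 };       /* OP_ADD value is arbitrary */

struct dyn_op                           /* stand-in for unw_dyn_op_t */
{
  int8_t tag;
  int8_t qp;
  int16_t reg;
  int32_t when;
  word_t val;
};

/* Encode "subtract n" as the two's complement of n.  */
static word_t
encode_sub (word_t n)
{
  return (word_t) 0 - n;
}

/* Net adjustment of register reg once instruction number insn has
   executed.  Scanning ends at the first stop directive or after
   op_count entries, whichever comes first.  */
static int64_t
reg_adjustment (const struct dyn_op *op, uint32_t op_count,
                int16_t reg, int32_t insn)
{
  word_t sum = 0;

  for (uint32_t i = 0; i < op_count && op[i].tag != OP_STOP; ++i)
    if (op[i].tag == OP_ADD && op[i].reg == reg && op[i].when <= insn)
      sum += op[i].val;         /* unsigned wrap-around does the subtraction */
  return (int64_t) sum;
}
```

For a region whose instruction 0 subtracts 32 from a register and whose
instruction 5 adds 32 back, the net adjustment is -32 after instruction 0
and 0 after instruction 5.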
370
- meaning of segbase member in table-info/table-remote-info format
- format of table_data in table-info/table-remote-info format
- instruction size: each bundle is counted as 3 instructions,
  regardless of template (MLX)
- describe stack-frame layout, especially with regards to
  sp-relative and fp-relative addressing
- UNW_DYN_ADD can only add to ``sp'' (always a negative value);
  use POP_FRAMES otherwise
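The bundle-counting rule above could translate into an instruction index
as in the following sketch. It assumes that bundles are 16 bytes long,
that the region starts on a bundle boundary, and that the slot number (0
to 2) is carried in the low four bits of the instruction pointer; the
last point is an assumption of this example, not something this manual
page specifies. The helper name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: region-relative instruction index on IA-64,
   counting every bundle as 3 instructions regardless of its template.  */
static int32_t
ia64_insn_index (uint64_t region_start, uint64_t ip)
{
  uint64_t bundle = ((ip & ~(uint64_t) 0xf) - region_start) / 16;

  return (int32_t) (3 * bundle + (ip & 0xf));
}
```

Under these assumptions, slot 2 of the second bundle in a region has
index 5.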
378
380 libunwind(3), _U_dyn_register(3), _U_dyn_cancel(3)
381
383 David Mosberger-Tang
384 Hewlett-Packard Labs
Palo Alto, CA 94304
386 Email: davidm@hpl.hp.com
387 WWW: http://www.hpl.hp.com/research/linux/libunwind/.
388
389
390
Programming Library 05 August 2004 LIBUNWIND-DYNAMIC(3)