YASM_ARCH(7)             Yasm Supported Architectures             YASM_ARCH(7)

NAME

       yasm_arch - Yasm Supported Target Architectures

SYNOPSIS

       yasm -a arch [-m machine] ...

DESCRIPTION

       The standard Yasm distribution includes a number of modules for
       different target architectures. Each target architecture can support
       one or more machine architectures.

       The architecture and machine are selected on the yasm(1) command line
       with the -a arch and -m machine command-line options, respectively.

       The machine architecture may also be selected automatically by certain
       object formats. For example, the “elf32” object format selects the
       “x86” machine architecture by default, while the “elf64” object format
       selects the “amd64” machine architecture by default.

X86 ARCHITECTURE

       The “x86” architecture supports the IA-32 instruction set and
       derivatives and the AMD64 instruction set. It consists of two machines:
       “x86” (for the IA-32 and derivatives) and “amd64” (for the AMD64 and
       derivatives). The default machine for the “x86” architecture is the
       “x86” machine.

   BITS Setting
       The x86 architecture BITS setting tells Yasm the processor mode in
       which the generated code is intended to execute. x86 processors can
       run in three major execution modes: 16-bit, 32-bit, and, on
       AMD64-supporting processors, 64-bit. Because parts of the x86
       instruction set are execution-mode dependent (such as the operand-size
       and address-size override prefixes), Yasm cannot assemble x86
       instructions correctly unless the user tells it in which processor
       mode the code will execute.

       The BITS setting can be changed in a variety of ways. When using the
       NASM-compatible parser, the BITS setting can be changed directly with
       the BITS xx assembler directive. The default BITS setting is
       determined by the object format in use.
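       To illustrate why the same instruction needs different bytes under
       different BITS settings, the following Python sketch models when an
       operand-size override is required. It is a simplified model of the
       x86 rules for typical instructions, not Yasm's internal logic; the
       names DEFAULT_OPERAND_SIZE and size_prefixes are hypothetical.

```python
# Illustrative model only; this is not how Yasm is implemented.
# Default operand size for most instructions under each BITS setting.
DEFAULT_OPERAND_SIZE = {16: 16, 32: 32, 64: 32}

def size_prefixes(bits, operand_size):
    """Return the prefixes needed to use operand_size under a BITS setting."""
    if operand_size == 8:
        return []  # byte operations use their own opcodes, not a prefix
    if operand_size == 64:
        if bits != 64:
            raise ValueError("64-bit operands require BITS 64")
        return ["REX.W"]
    if operand_size == DEFAULT_OPERAND_SIZE[bits]:
        return []
    return ["0x66"]  # operand-size override prefix

print(size_prefixes(16, 16))  # []
print(size_prefixes(32, 16))  # ['0x66']
print(size_prefixes(16, 32))  # ['0x66']
print(size_prefixes(64, 64))  # ['REX.W']
```

       A 16-bit operation such as "mov ax, 1" therefore needs no prefix
       under BITS 16 but needs the 0x66 prefix under BITS 32 or BITS 64.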

   BITS 64 Extensions
       The AMD64 architecture is a 64-bit architecture developed by AMD,
       based on the 32-bit x86 architecture. It extends the original x86
       architecture by doubling the number of general purpose and SIMD
       registers, extending the arithmetic operations and address space to 64
       bits, and adding other features.

       Intel later introduced an essentially identical version of AMD64,
       originally called EM64T and now known as Intel 64.

       When an AMD64-supporting processor is executing in 64-bit mode, a
       number of additional extensions are available, including extra general
       purpose registers, extra SSE2 registers, and RIP-relative addressing.

       Yasm extends the base NASM syntax to support AMD64 as follows. To
       enable assembly of instructions for the 64-bit mode of AMD64
       processors, use the directive BITS 64. As with NASM's BITS directive,
       this does not change the format of the output object file to 64 bits;
       it only changes the assembler mode to assume that the instructions
       being assembled will be run in 64-bit mode. To specify an AMD64 object
       file, use -m amd64 on the Yasm command line, or explicitly target a
       64-bit object format such as -f win64 or -f elf64. -f elfx32 can be
       used to select the 32-bit ELF object format for AMD64 processors.

       Register Changes
           The additional 64-bit general purpose registers are named r8-r15.
           There are also 8-bit (rXb), 16-bit (rXw), and 32-bit (rXd)
           subregisters that map to the least significant 8, 16, or 32 bits of
           the 64-bit register. The original 8 general purpose registers have
           also been extended to 64 bits: eax, edx, ecx, ebx, esi, edi, esp,
           and ebp have new 64-bit versions called rax, rdx, rcx, rbx, rsi,
           rdi, rsp, and rbp respectively. The old 32-bit registers map to the
           least significant 32 bits of the new 64-bit registers.

           New 8-bit registers are also available that map to the 8 least
           significant bits of rsi, rdi, rsp, and rbp. These are called sil,
           dil, spl, and bpl respectively. Unfortunately, due to the way
           instructions are encoded, these new 8-bit registers are encoded the
           same as the old 8-bit registers ah, dh, ch, and bh. The processor
           distinguishes them by the presence of the new REX prefix, which is
           also used to specify the other extended registers. This means it
           is illegal to combine ah, dh, ch, or bh with an instruction that
           requires the REX prefix for other reasons. For instance:

               add ah, [r10]

           (NASM syntax) is not a legal instruction because the use of r10
           requires a REX prefix, making it impossible to use ah.

           In 64-bit mode, an additional 8 SSE2 registers are also available.
           These are named xmm8-xmm15.
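           The REX restriction above can be sketched in Python. This is a
           simplified model for illustration, not Yasm's encoder: rex_byte,
           needs_rex, and is_legal are hypothetical names, and only the
           general purpose registers discussed above are covered.

```python
# Illustrative model of the REX rules described above; not Yasm's encoder.
HIGH_BYTE_REGS = {"ah", "bh", "ch", "dh"}
NEW_BYTE_REGS = {"sil", "dil", "spl", "bpl"}

def rex_byte(w=0, r=0, x=0, b=0):
    """Build a REX prefix byte, whose bit layout is 0100WRXB."""
    return 0x40 | (w << 3) | (r << 2) | (x << 1) | b

def needs_rex(reg):
    """True if naming this register forces a REX prefix."""
    if reg in NEW_BYTE_REGS:
        return True
    # r8-r15 and their rXb/rXw/rXd subregisters
    return reg.startswith("r") and reg[1:].rstrip("bwd").isdigit()

def is_legal(regs):
    """False if a high-byte register is mixed with a REX-requiring register."""
    uses_high = any(r in HIGH_BYTE_REGS for r in regs)
    uses_rex = any(needs_rex(r) for r in regs)
    return not (uses_high and uses_rex)

print(hex(rex_byte(b=1)))        # 0x41
print(hex(rex_byte(w=1)))        # 0x48
print(is_legal(["ah", "r10"]))   # False
print(is_legal(["ah", "rbx"]))   # True
print(is_legal(["sil", "r10"]))  # True
```

           The model only checks register names; a 64-bit operand size also
           sets the W bit of the REX prefix, which this sketch does not track.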

       64 Bit Instructions
           By default, most operations in 64-bit mode remain 32-bit;
           operations that are 64-bit usually require a REX prefix (one bit in
           the REX prefix determines whether an operation is 64-bit or
           32-bit). Thus, essentially all 32-bit instructions have a 64-bit
           version, and the 64-bit versions of instructions can use extended
           registers “for free” (as the REX prefix is already present).
           Examples in NASM syntax:

               mov eax, 1  ; 32-bit instruction
               mov rcx, 1  ; 64-bit instruction

           Instructions that modify the stack (push, pop, call, ret, enter,
           and leave) are implicitly 64-bit. Their 32-bit counterparts are not
           available, but their 16-bit counterparts are. Examples in NASM
           syntax:

               push eax  ; illegal instruction
               push rbx  ; 1-byte instruction
               push r11  ; 2-byte instruction with REX prefix
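           The byte counts in the push examples can be reproduced by hand.
           The Python sketch below encodes "push reg64" using the standard
           one-byte opcode 0x50 plus the register number; encode_push and
           REG64 are hypothetical names, and this is not Yasm's code.

```python
# Illustrative hand encoder for "push reg64" in 64-bit mode.
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]

def encode_push(reg):
    """Return the machine code bytes for push reg (64-bit registers only)."""
    num = REG64.index(reg)  # raises ValueError for names such as "eax"
    opcode = bytes([0x50 + (num & 7)])
    if num >= 8:
        return bytes([0x41]) + opcode  # REX prefix with the B bit set
    return opcode

print(encode_push("rbx").hex())  # 53
print(encode_push("r11").hex())  # 4153
```

           No REX.W bit is needed in either case because push is implicitly
           64-bit; the prefix on push r11 exists only to reach r8-r15.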

       Implicit Zero Extension
           Results of 32-bit operations are implicitly zero-extended into the
           upper 32 bits of the corresponding 64-bit register. 16-bit and
           8-bit operations, on the other hand, do not affect the upper bits
           of the register (just as in 32-bit and 16-bit modes). This can be
           used to generate smaller code in some instances. Examples in NASM
           syntax:

               mov ecx, 1  ; 2 bytes shorter than mov rcx, 1
               and edx, 3  ; equivalent to and rdx, 3
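           As a worked illustration of these rules, the Python sketch below
           models what a register write of each size does to the full 64-bit
           register. The helper write_reg is hypothetical, and 8-bit writes
           here mean the low byte (not ah, bh, ch, or dh).

```python
# Illustrative model of register writes in 64-bit mode.
MASK64 = (1 << 64) - 1

def write_reg(old64, value, size):
    """Return the 64-bit register value after a write of `size` bits."""
    mask = (1 << size) - 1
    if size in (32, 64):
        return value & mask  # a 32-bit write clears bits 63..32
    return (old64 & ~mask & MASK64) | (value & mask)  # 8/16-bit keep the rest

rcx = 0xFFFFFFFFFFFFFFFF
print(hex(write_reg(rcx, 1, 32)))   # 0x1
print(hex(write_reg(rcx, 1, 16)))   # 0xffffffffffff0001
print(hex(write_reg(rcx, 1, 8)))    # 0xffffffffffffff01
print(hex(write_reg(rcx, -1, 32)))  # 0xffffffff
```

           The last line shows why "mov ecx, -1" leaves rcx holding
           0x00000000ffffffff rather than all ones.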

       Immediates
           For most instructions in 64-bit mode, immediate values remain 32
           bits; their value is sign-extended to 64 bits before being used.
           The exception is the mov instruction, which can take a 64-bit
           immediate when the destination is a 64-bit register. Examples in
           NASM syntax:

               add rax, 1           ; optimized down to signed 8-bit
               add rax, dword 1     ; force size to 32-bit
               add rax, 0xffffffff  ; sign-extended 32-bit
               add rax, -1          ; same as above
               add rax, 0xffffffffffffffff ; truncated to 32-bit (warning)
               mov eax, 1           ; 5 byte
               mov rax, 1           ; 7 byte (optimized to signed 32-bit)
               mov rax, qword 1     ; 10 byte (forced 64-bit)
               mov rbx, 0x1234567890abcdef ; 10 byte
               mov rcx, 0xffffffff  ; 10 byte (does not fit in signed 32-bit)
               mov ecx, -1          ; 5 byte, equivalent to above
               mov rcx, sym         ; 7 byte, 32-bit size default for symbols
               mov rcx, qword sym   ; 10 byte, override default size

           The handling of mov reg64, unsized immediate differs between Yasm
           and NASM 2.x; Yasm follows the above behavior, while NASM 2.x does
           the following:

               add rax, 0xffffffff  ; sign-extended 32-bit immediate
               add rax, -1          ; same as above
               add rax, 0xffffffffffffffff ; truncated 32-bit (warning)
               add rax, sym         ; sign-extended 32-bit immediate
               mov eax, 1           ; 5 byte (32-bit immediate)
               mov rax, 1           ; 10 byte (64-bit immediate)
               mov rbx, 0x1234567890abcdef ; 10 byte instruction
               mov rcx, 0xffffffff  ; 10 byte instruction
               mov ecx, -1          ; 5 byte, equivalent to above
               mov ecx, sym         ; 5 byte (32-bit immediate)
               mov rcx, sym         ; 10 byte instruction
               mov rcx, qword sym   ; 10 byte (64-bit immediate)
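           The sign-extension arithmetic behind these examples can be checked
           with a short Python sketch. The helpers sign_extend32 and
           fits_signed32 are hypothetical names used for illustration; they
           model the arithmetic only, not Yasm's size-selection code.

```python
# Illustrative arithmetic for 32-bit immediates in 64-bit mode.
MASK64 = (1 << 64) - 1

def sign_extend32(imm):
    """Sign-extend the low 32 bits of imm to a 64-bit bit pattern."""
    imm &= 0xFFFFFFFF
    return (imm | ~0xFFFFFFFF) & MASK64 if imm & 0x80000000 else imm

def fits_signed32(value):
    """True if value can be stored unchanged as a signed 32-bit immediate."""
    return -2**31 <= value < 2**31

print(hex(sign_extend32(0xffffffff)))               # 0xffffffffffffffff
print(sign_extend32(0xffffffff) == (-1 & MASK64))   # True
print(fits_signed32(1))                             # True
print(fits_signed32(0xffffffff))                    # False
```

           The first two lines show why "add rax, 0xffffffff" behaves like
           "add rax, -1". The last line shows why "mov rcx, 0xffffffff"
           cannot use a sign-extended 32-bit immediate without changing the
           value, so the full 64-bit form is needed.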

       Displacements
           Just like immediates, displacements, for the most part, remain 32
           bits and are sign-extended prior to use. Again, the exception is
           one restricted form of the mov instruction: between the
           al/ax/eax/rax register and a 64-bit absolute address (no registers
           allowed in the effective address). In NASM syntax, use of the
           64-bit absolute form requires [qword]. Examples in NASM syntax:

               mov eax, [1]    ; 32 bit, with sign extension
               mov al, [rax-1] ; 32 bit, with sign extension
               mov al, [qword 0x1122334455667788] ; 64-bit absolute
               mov al, [0x1122334455667788] ; truncated to 32-bit (warning)
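           To make the truncation and sign extension concrete, the Python
           sketch below computes the address a 32-bit displacement refers to.
           The function effective_address is a hypothetical helper that
           models only base plus displacement, with no index or scale.

```python
# Illustrative model of a 32-bit displacement in 64-bit mode.
MASK64 = (1 << 64) - 1

def effective_address(base, disp):
    """Truncate disp to 32 bits, sign-extend it, and add it to base."""
    disp &= 0xFFFFFFFF
    if disp & 0x80000000:
        disp -= 1 << 32
    return (base + disp) & MASK64

print(hex(effective_address(0x1000, -1)))             # 0xfff
print(hex(effective_address(0, 0x1122334455667788)))  # 0x55667788
print(hex(effective_address(0, 0xffffffff)))          # 0xffffffffffffffff
```

           The second line corresponds to the truncation warning above: only
           the low 32 bits of the address survive unless [qword] is used.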

       RIP Relative Addressing
           In 64-bit mode, a new form of effective addressing is available to
           make it easier to write position-independent code. Any memory
           reference may be made RIP relative (RIP is the instruction pointer
           register, which contains the address of the location immediately
           following the current instruction).

           In NASM syntax, there are two ways to specify RIP-relative
           addressing:

               mov dword [rip+10], 1

           stores the value 1 ten bytes after the end of the instruction. 10
           can also be a symbolic constant, and will be treated the same way.
           On the other hand,

               mov dword [symb wrt rip], 1

           stores the value 1 at the address of symbol symb. This is
           distinctly different from the behavior of:

               mov dword [symb+rip], 1

           which takes the address of the end of the instruction, adds the
           address of symb to it, then stores the value 1 there. If symb is a
           variable, this will not store the value 1 into the symb variable!
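           The difference between the three forms is plain address
           arithmetic, sketched here in Python. The instruction address,
           instruction length, and symbol address are made-up values chosen
           for illustration, and rip_target and wrt_rip_disp are hypothetical
           helper names.

```python
# Illustrative arithmetic for RIP-relative operands.
def rip_target(insn_addr, insn_len, disp):
    """Address referenced by [rip+disp]: end of the instruction plus disp."""
    return insn_addr + insn_len + disp

def wrt_rip_disp(symb_addr, insn_addr, insn_len):
    """Displacement stored for [symb wrt rip] so the target is symb itself."""
    return symb_addr - (insn_addr + insn_len)

INSN, LEN, SYMB = 0x1000, 10, 0x2000  # hypothetical addresses and length

print(hex(rip_target(INSN, LEN, 10)))                             # 0x1014
print(hex(rip_target(INSN, LEN, wrt_rip_disp(SYMB, INSN, LEN))))  # 0x2000
print(hex(rip_target(INSN, LEN, SYMB)))                           # 0x300a
```

           The three printed addresses correspond to [rip+10],
           [symb wrt rip], and [symb+rip] respectively; only the second one
           lands on symb.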

           Yasm also supports the following syntax for RIP-relative
           addressing:

               mov [rel sym], rax  ; RIP-relative
               mov [abs sym], rax  ; not RIP-relative

           The behavior of:

               mov [sym], rax

           depends on a mode set by the DEFAULT directive, as follows. The
           default mode is always "abs", and in "rel" mode, use of registers,
           an fs or gs segment override, or an explicit "abs" override will
           result in a non-RIP-relative effective address.

               default rel
               mov [sym], rbx      ; RIP-relative
               mov [abs sym], rbx  ; not RIP-relative (explicit override)
               mov [rbx+1], rbx    ; not RIP-relative (register use)
               mov [fs:sym], rbx   ; not RIP-relative (fs or gs use)
               mov [ds:sym], rbx   ; RIP-relative (segment, but not fs or gs)
               mov [rel sym], rbx  ; RIP-relative (redundant override)

               default abs
               mov [sym], rbx      ; not RIP-relative
               mov [abs sym], rbx  ; not RIP-relative
               mov [rbx+1], rbx    ; not RIP-relative
               mov [fs:sym], rbx   ; not RIP-relative
               mov [ds:sym], rbx   ; not RIP-relative
               mov [rel sym], rbx  ; RIP-relative (explicit override)
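           The DEFAULT rules listed above can be summarized as a small
           decision function. This Python sketch covers only the
           combinations shown in the examples; is_rip_relative and its
           parameters are hypothetical, and combinations the examples do not
           list (such as an explicit "rel" together with a register) are an
           assumption of the sketch.

```python
# Illustrative decision table for the DEFAULT rel/abs rules.
def is_rip_relative(default_mode, override=None, uses_register=False,
                    segment=None):
    """Return True if the memory operand would be RIP-relative.

    default_mode: "rel" or "abs" (set by the DEFAULT directive)
    override:     "rel", "abs", or None for a plain [sym]
    """
    if override is not None:
        return override == "rel"  # an explicit keyword wins
    if default_mode != "rel":
        return False
    return not uses_register and segment not in ("fs", "gs")

print(is_rip_relative("rel"))                      # True
print(is_rip_relative("rel", override="abs"))      # False
print(is_rip_relative("rel", uses_register=True))  # False
print(is_rip_relative("rel", segment="fs"))        # False
print(is_rip_relative("rel", segment="ds"))        # True
print(is_rip_relative("abs", override="rel"))      # True
```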

       Memory references
           Usually the size of a memory reference can be deduced from the
           registers involved; for example, "mov [rax],ecx" is a 32-bit
           move, because ecx is 32 bits. Yasm currently gives the non-obvious
           "invalid combination of opcode and operands" error if it can't
           figure out how much memory you're moving. The fix in this case is
           to add a memory size specifier: qword, dword, word, or byte.

           Here's a 64-bit memory move, which sets 8 bytes starting at the
           address in rax:

               mov qword [rax], 1

           Here's a 32-bit memory move, which sets 4 bytes:

               mov dword [rax], 1

           Here's a 16-bit memory move, which sets 2 bytes:

               mov word [rax], 1

           Here's an 8-bit memory move, which sets 1 byte:

               mov byte [rax], 1
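           To show what each size specifier changes in memory, the Python
           sketch below models a store into a byte buffer. SIZE_BYTES and
           store are hypothetical names; the model assumes the little-endian
           byte order used by x86.

```python
# Illustrative model of how many bytes each size specifier writes.
SIZE_BYTES = {"byte": 1, "word": 2, "dword": 4, "qword": 8}

def store(memory, addr, size, value):
    """Write value at addr as a little-endian integer of the given size."""
    n = SIZE_BYTES[size]
    memory[addr:addr + n] = (value % (1 << (8 * n))).to_bytes(n, "little")

mem = bytearray(b"\xff" * 8)
store(mem, 0, "word", 1)
print(mem.hex())  # 0100ffffffffffff

store(mem, 0, "qword", 1)
print(mem.hex())  # 0100000000000000
```

           Without a specifier, nothing in "mov [rax], 1" tells the assembler
           which of these four stores is meant, which is why the error
           occurs.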

LC3B ARCHITECTURE

       The “lc3b” architecture supports the LC-3b ISA as used in the ECE 312
       (now ECE 411) course at the University of Illinois, Urbana-Champaign,
       as well as other university courses. See
       http://courses.ece.uiuc.edu/ece411/ for more details and example code.
       The “lc3b” architecture consists of only one machine: “lc3b”.

SEE ALSO

       yasm(1)

BUGS

       When using the “x86” architecture, it is all too easy to generate
       AMD64 code (using the BITS 64 directive) yet produce a 32-bit object
       file (by neglecting either to specify -m amd64 on the command line or
       to select a 64-bit object format). Similarly, specifying -m amd64 does
       not default the BITS setting to 64. An easy way to avoid this is to
       specify a 64-bit object format such as -f elf64 directly.
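       One way to catch this mismatch before assembling is a small
       pre-check script. The Python sketch below is a hypothetical helper,
       not part of Yasm: it looks only at the options named on this page
       (-m amd64, -f elf64, -f win64, -f elfx32), expects each option and
       its value as separate arguments, and detects BITS 64 with a simple
       pattern that ignores macros and conditionals.

```python
# Hypothetical pre-check for the BITS 64 / object format mismatch.
import re

# Object formats that this page associates with AMD64 code.
AMD64_FORMATS = {"elf64", "win64", "elfx32"}

def targets_amd64(argv):
    """True if the yasm arguments select an AMD64 object file."""
    for flag, value in zip(argv, argv[1:]):
        if flag == "-m" and value == "amd64":
            return True
        if flag == "-f" and value in AMD64_FORMATS:
            return True
    return False

def bits64_mismatch(source, argv):
    """True if the source uses BITS 64 but argv does not target AMD64."""
    pattern = r"^\s*bits\s+64\b"
    uses_bits64 = re.search(pattern, source, re.I | re.M) is not None
    return uses_bits64 and not targets_amd64(argv)

SRC = "BITS 64\nmov rax, 1\n"
print(bits64_mismatch(SRC, ["yasm", "-f", "elf32", "a.asm"]))  # True
print(bits64_mismatch(SRC, ["yasm", "-f", "elf64", "a.asm"]))  # False
```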

AUTHOR

       Peter Johnson <peter@tortall.net>
           Author.

       Copyright © 2004, 2005, 2006, 2007 Peter Johnson

Yasm                             October 2006                     YASM_ARCH(7)