XA(1)                     General Commands Manual                    XA(1)

NAME
       xa - 6502/R65C02/65816 cross-assembler

SYNOPSIS
       xa [OPTION]... FILE

DESCRIPTION
       xa is a multi-pass cross-assembler for the 8-bit processors in the
       6502 series (such as the 6502, 65C02, 6504, 6507, 6510, 7501, 8500,
       8501 and 8502), the Rockwell R65C02, and the 16-bit 65816 processor.
       For a description of syntax, see ASSEMBLER SYNTAX further in this
       manual page.

OPTIONS
       -v     Verbose output.

       -C     No CMOS opcodes (default is to allow R65C02 opcodes).

       -W     No 65816 opcodes (default).

       -w     Allow 65816 opcodes.

       -B     Show lines with block open/close (see PSEUDO-OPS).

       -c     Produce o65 object files instead of executable files (no
              linking performed); files may contain undefined references.

       -o filename
              Set output filename. The default is a.o65; use the special
              filename - to output to standard output.

       -e filename
              Set errorlog filename, default is none.

       -l filename
              Set labellist filename, default is none. This is the symbol
              table and can be used by disassemblers such as dxa(1) to
              reconstruct source.

       -r     Add cross-reference list to labellist (requires -l).

       -M     Allow colons to appear in comments; for MASM compatibility.
              This does not affect colon interpretation elsewhere.

       -R     Start assembler in relocating mode.

       -Llabel
              Defines label as an absolute (but undefined) label even when
              linking.

       -b? addr
              Set segment base for segment ? to address addr. ? should be
              t, d, b or z for text, data, bss or zero segments,
              respectively.

       -A addr
              Make text segment start at an address such that when the
              file starts at address addr, relocation is not necessary.
              Overrides -bt; other segments still have to be taken care of
              with -b.

       -G     Suppress list of exported globals.

       -DDEF=TEXT
              Define a preprocessor macro on the command line (see
              PREPROCESSOR).

       -I dir Add directory dir to the include path (before XAINPUT; see
              ENVIRONMENT).

       -O charset
              Define the output charset for character strings. Currently
              supported are ASCII (default), PETSCII (Commodore ASCII),
              PETSCREEN (Commodore screen codes) and HIGH (set high bit on
              all characters).

       -p?    Set the alternative preprocessor character to ?. This is
              useful when you wish to use cpp(1) and the built-in
              preprocessor at the same time (see PREPROCESSOR). Characters
              may need to be quoted for your shell (example: -p'~').

       --help Show summary of options.

       --version
              Show version of program.

       The following options are deprecated and will be removed in 2.4 and
       later versions:

       -x     Use old filename behaviour (overrides -o, -e and -l).

       -S     Allow preprocessor substitution within strings (this is now
              disallowed for better cpp(1) compatibility).

ASSEMBLER SYNTAX
       An introduction to 6502 assembly language programming and mnemonics
       is beyond the scope of this manual page. We invite you to
       investigate any number of the excellent books on the subject; one
       useful title is "Machine Language For Beginners" by Richard
       Mansfield (COMPUTE!), which covers the Atari, Commodore and Apple
       8-bit systems and is widely available on the used market.

       xa supports both the standard NMOS 6502 opcodes and the Rockwell
       CMOS opcodes used in the 65C02 (R65C02). With the -w option, xa
       will also accept opcodes for the 65816. NMOS 6502 undocumented
       opcodes are intentionally not supported, and should be entered
       manually using the .byt pseudo-op (see PSEUDO-OPS). Due to
       conflicts between the R65C02 and 65816 instruction sets and
       undocumented instructions on the NMOS 6502, their use is
       discouraged.

       In general, xa accepts the more-or-less standard 6502 assembler
       format as popularised by MASM and TurboAssembler. Values and
       addresses can be expressed either as literals, or as expressions;
       to wit,

       123       decimal value

       $234      hexadecimal value

       &123      octal

       %010110   binary

       *         current value of the program counter

       The ASCII value of any quoted character is inserted directly into
       the program text (example: "A" inserts the byte "A" into the output
       stream); see also the PSEUDO-OPS section. This is affected by the
       currently selected character set, if any.
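
       As an illustrative sketch (not taken from the xa distribution), the
       following lines load the same value, 123, written in each of the
       four numeric notations, followed by a quoted character:

              lda #123        ; decimal
              lda #$7b        ; hexadecimal
              lda #&173       ; octal
              lda #%01111011  ; binary
              lda #"A"        ; quoted character, per the current charset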

       Labels define locations within the program text, just as in other
       multi-pass assemblers. A label is defined by anything that is not
       an opcode; for example, a line such as

              label1 lda #0

       defines label1 to be the current location of the program counter
       (thus the address of the LDA opcode). A label can be explicitly
       defined by assigning it the value of an expression, such as

              label2 = $d000

       which defines label2 to be the address $d000, namely, the start of
       the VIC-II register block on Commodore 64 computers. The program
       counter * is considered to be a special kind of label, and can be
       assigned to with statements such as

              * = $c000

       which sets the program counter to decimal location 49152. With the
       exception of the program counter, labels cannot be assigned
       multiple times. To explicitly declare redefinition of a label,
       place a - (dash) before it, e.g.,

              -label2 = $d020

       which sets label2 to the Commodore 64 border colour register. The
       scope of a label is affected by the block it resides within (see
       PSEUDO-OPS for block instructions). A label may also be
       hard-specified with the -L command line option.
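
       Putting these forms together, a minimal sketch might read as
       follows. The label names vic and start are hypothetical, and the
       indentation is for display only:

                      * = $c000       ; set the program counter
              vic     = $d000         ; explicit definition
              start   lda #0          ; start is $c000, the address of this lda
                      sta vic+$20     ; stores to $d020
              -vic    = $d400         ; redefinition, marked with the dash

       The sta instruction above was assembled with the earlier value of
       vic; only code following the redefinition sees $d400.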

       Redefining a label does not change previously assembled code that
       used the earlier value. Therefore, because the program counter is a
       special type of label, changing the program counter to a lower
       value does not reorder code assembled previously and changing it to
       a higher value does not issue padding to put subsequent code at the
       new location. This is intentional behaviour to facilitate
       generating relocatable and position-independent code, but can
       differ from other assemblers which use this behaviour for linking.
       However, it is possible to use pseudo-ops to simulate other
       assemblers' behaviour and use xa as a linker; see PSEUDO-OPS and
       LINKING.

       For those instructions where the accumulator is the implied
       argument (such as asl and lsr; inc and dec on R65C02; etc.), the
       idiom of explicitly specifying the accumulator with a is
       unnecessary as the proper form will be selected if there is no
       explicit argument. In fact, for consistency with label handling, if
       there is a label named a, this will actually generate code
       referencing that label as a memory location and not the
       accumulator. Otherwise, the assembler will complain.

       Labels and opcodes may take expressions as their arguments to allow
       computed values, and may themselves reference other labels and/or
       the program counter. An expression such as lab1+1 (which operates
       on the current value of label lab1 and increments it by one) may
       use the following operators, given from highest to lowest priority:

       *      multiplication (priority 10)

       /      integer division (priority 10)

       +      addition (9)

       -      subtraction (9)

       <<     shift left (8)

       >>     shift right (8)

       >= =>  greater than or equal to (7)

       >      greater than (7)

       <= =<  less than or equal to (7)

       <      less than (7)

       =      equal to (6)

       <> ><  does not equal (6)

       &      bitwise AND (5)

       ^      bitwise XOR (4)

       |      bitwise OR (3)

       &&     logical AND (2)

       ||     logical OR (1)

       Parentheses are valid. When redefining a label, combining an
       arithmetic or bitwise operator with the = (equals) operator, such
       as +=, is valid, e.g.,

              -redeflabel += (label12/4)
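
       The priorities above can be checked with a few small definitions.
       This is only a sketch; the label names are hypothetical and the
       values in the comments follow from the table:

              count   = 2+3*4         ; 14: * binds tighter than +
              mask    = 1<<4|1        ; 17: << binds tighter than |
              total   = (2+3)*4       ; 20: parentheses override priority
              -total  += count/4      ; 23: 20 plus 14/4, which is 3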

       Normally, xa attempts to ascertain the value of the operand and
       (when referring to a memory location) use zero page, 16-bit or (for
       65816) 24-bit addressing where appropriate and where supported by
       the particular opcode. This generates smaller and faster code, and
       is almost always preferable.

       Nevertheless, you can use these prefix operators to force a
       particular rendering of the operand. Those that generate an 8-bit
       result can also be used in 8-bit addressing modes, such as
       immediate and zero page.

       <      low byte of expression, e.g., lda #<vector

       >      high byte of expression

       !      in situations where the expression could be understood as
              either an absolute or zero page value, do not attempt to
              optimize to a zero page argument for those opcodes that
              support it (i.e., keep as 16-bit word)

       @      render as 24-bit quantity for 65816 (must specify -w
              command-line option). This is required to specify any 24-bit
              quantity!

       `      force further optimization, even if the length of the
              instruction cannot be reliably determined (see NOTES'N'BUGS)

       Expressions can occur as arguments to opcodes or within the
       preprocessor (see PREPROCESSOR for syntax). For example,

              lda label2+1

       takes the value at label2+1 (using our previous label's value, this
       would be $d021), and will be assembled as $ad $21 $d0 to disk.
       Similarly,

              lda #<label2

       will take the lowest 8 bits of label2 (i.e., $20), and assign them
       to the accumulator (assembling the instruction as $a9 $20 to disk).
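
       As a further sketch of the <, > and ! prefixes (the label vector is
       hypothetical; the bytes in the comments are the standard 6502
       encodings of each addressing mode):

              vector  = $c123
                      lda #<vector    ; low byte:  $a9 $23
                      lda #>vector    ; high byte: $a9 $c1
                      lda $10         ; zero page form: $a5 $10
                      lda !$10        ; forced 16-bit form: $ad $10 $00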

       Comments are specified with a semicolon (;), such as

              ;this is a comment

       They can also be specified in the C language style, using /* */ and
       // which are understood at the PREPROCESSOR level (q.v.).

       Normally, the colon (:) separates statements, such as

              label4 lda #0:sta $d020

       or

              label2: lda #2

       (note the use of a colon for specifying a label, similar to some
       other assemblers, which xa also understands with or without the
       colon). This also applies to semicolon comments, such that

              ; a comment:lda #0

       is understood as a comment followed by an opcode. To defeat this,
       use the -M command line option to allow colons within comments.
       This does not apply to /* */ and // comments, which are dealt with
       at the preprocessor level (q.v.).

PSEUDO-OPS
       Pseudo-ops are false opcodes used by the assembler to denote meta-
       or inlined commands. Like most assemblers, xa has a rich set.

       .byt value1,value2,value3,...
              Specifies a string of bytes to be directly placed into the
              assembled object. The arguments may be expressions. Any
              number of bytes can be specified.

       .asc "text1","text2",...
              Specifies a character string which will be inserted into the
              assembled object. Strings are understood according to the
              currently specified character set; for example, if ASCII is
              specified, they will be rendered as ASCII, and if PETSCII is
              specified, they will be translated into the equivalent
              Commodore ASCII. Other non-standard ASCIIs such as ATASCII
              for Atari computers should use the ASCII equivalent
              characters; graphic and control characters should be
              specified explicitly using .byt for the precise character
              you want. Note that when specifying the argument of an
              opcode, .asc is not necessary; the quoted character can
              simply be inserted (e.g., lda #"A"), and is also affected by
              the current character set. Any number of character strings
              can be specified.

       .byt and .asc are synonymous, so you can mix things such as .byt
       $43, 22, "a character string" and get the expected result. The
       string is subject to the current character set, but the remaining
       bytes are inserted without modification.

       .aasc "text1","text2",...
              Specifies a character string that is always rendered in true
              ASCII regardless of the current character set. Like .asc, it
              is synonymous with .byt.

       .word value1,value2,value3...
              Specifies a string of 16-bit words to be placed into the
              assembled object in 6502 little-endian format (that is,
              low-byte/high-byte). The arguments may be expressions. Any
              number of words can be specified.

       .dsb length,fillbyte
              Specifies a data block; a total of length repetitions of
              fillbyte will be inserted into the assembled object. For
              example, .dsb 5,$10 will insert five bytes, each being 16
              decimal, into the object. The arguments may be expressions.
              See LINKING for how to use this pseudo-op to link multiple
              objects.

       .bin offset,length,"filename"
              Inlines, without further interpretation, the binary file
              specified by filename, starting at offset offset and
              continuing for length length. This allows you to insert data
              such as a previously assembled object file or an image or
              other binary data structure, inlined directly into this
              file's object. If length is zero, then the length of
              filename, minus the offset, is used instead. The arguments
              may be expressions. See LINKING for how to use this
              pseudo-op to link multiple objects.

       .(     Opens a new block for scoping. Within a block, all labels
              defined are local to that block and any sub-blocks, and go
              out of scope as soon as the enclosing block is closed (i.e.,
              lexically scoped). All labels defined outside of the block
              are still visible within it. To explicitly declare a global
              label within a block, precede the label with + or precede it
              with & to declare it within the previous level only (or
              globally if you are only one level deep). Sixteen levels of
              scoping are permitted.

       .)     Closes a block.

       .as .al .xs .xl
              Only relevant in 65816 mode (with the -w option specified).
              These pseudo-ops set what size accumulator and X/Y-register
              should be used for future instructions; .as and .xs set
              8-bit operands for the accumulator and index registers,
              respectively, and .al and .xl set 16-bit operands. These
              pseudo-ops deliberately do not issue sep and rep
              instructions to set the specified width in the CPU; set the
              processor bits as you need, or consider constructing a
              macro. .al and .xl generate errors if -w is not specified.
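
       A short sketch of the data and block pseudo-ops described above.
       The byte values in the comments assume the default ASCII character
       set, and the label loop is hypothetical:

                      .byt $01,$02,2+1        ; $01 $02 $03
                      .word $1234,$c000       ; $34 $12 $00 $c0
                      .dsb 3,$ff              ; $ff $ff $ff
                      .asc "HI",0             ; $48 $49 $00

                      .(
              loop    dex                     ; loop is local to this block
                      bne loop
                      .)
                      .(
              loop    dey                     ; a separate label, also named loop
                      bne loop
                      .)

       Because each loop goes out of scope when its block closes, the two
       definitions do not collide.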

       The following pseudo-ops apply primarily to relocatable .o65
       objects. A full discussion of the relocatable format is beyond the
       scope of this manpage, as it is currently a format in flux.
       Documentation on the proposed v1.2 format is in doc/fileformat.txt
       within the xa installation directory.

       .text .data .bss .zero
              These pseudo-ops switch between the different segments,
              .text being the actual code section, .data being the data
              segment, .bss being uninitialized label space for allocation
              and .zero being uninitialized zero page space for
              allocation. In .bss and .zero, only labels are evaluated.
              These pseudo-ops are valid in relative and absolute modes.

       .align value
              Aligns the current segment to a byte boundary (2, 4 or 256)
              as specified by value (and places it in the header when
              relative mode is enabled). Other values generate an error.

       .fopt type,value1,value2,value3,...
              Acts like .byt/.asc except that the values are embedded into
              the object file as file options. The argument type is used
              to specify the file option being referenced. A table of
              these options is in the relocatable o65 file format
              description. The remainder of the options are interpreted as
              values to insert. Any number of values may be specified, and
              may also be strings.
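
       A source file using all four segments might be laid out along these
       lines. This is a sketch only: the label names are hypothetical, and
       it assumes the file is assembled in relocating mode (-R) so that
       the segment bases are assigned by xa or by the -b options:

                      .zero
              ptr     .dsb 2,0        ; reserves two bytes of zero page
                      .bss
              buffer  .dsb 256,0      ; reserves 256 bytes of uninitialized space
                      .data
              msg     .asc "hi",0
                      .text
              start   lda #<msg       ; store the address of msg in ptr
                      sta ptr
                      lda #>msg
                      sta ptr+1
                      rts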

PREPROCESSOR
       xa implements a preprocessor very similar to that of the C-language
       preprocessor cpp(1) and many oddiments apply to both. For example,
       as in C, the use of /* */ for comment delimiters is also supported
       in xa, and so are comments using the double slash //. The
       preprocessor also supports continuation lines, i.e., lines ending
       with a backslash (\); the following line is then appended to it as
       if there were no dividing newline. This too is handled at the
       preprocessor level.

       For reasons of memory and complexity, the full breadth of the
       cpp(1) syntax is not supported. In particular, macro definitions
       may not be forward-defined (i.e., a macro definition can only
       reference a previously defined macro definition), except for macro
       functions, where recursive evaluation is supported; e.g., to
       #define WW AA, AA must have already been defined. Certain other
       directives are not supported, nor are most standard pre-defined
       macros, and there are other limits on evaluation and line length.
       Because the maintainers of xa recognize that some files will
       require more complicated preparsing than the built-in preprocessor
       can supply, the preprocessor will accept cpp(1)-style
       line/filename/flags output. When these lines are seen in the input
       file, xa will treat them as cc would, except that flags are
       ignored. xa does not accept files on standard input for parsing
       reasons, so you should dump your cpp(1) output to an intermediate
       temporary file, such as

              cc -E test.s > test.xa
              xa test.xa

       No special arguments need to be passed to xa; the presence of
       cpp(1) output is detected automatically.

       Note that passing your file through cpp(1) may interfere with xa's
       own preprocessor directives. In this case, to mask directives from
       cpp(1), use the -p option to specify an alternative character
       instead of #, such as the tilde (e.g., -p'~'). With this option and
       argument specified, you can write ~include, for example, as well as
       #include (which will still be accepted by the xa preprocessor,
       assuming any survive cpp(1)). Any character can be used, although
       frankly pathological choices may lead to amusing and frustrating
       glitches during parsing. You can also use this option to defer
       preprocessor directives that cpp(1) may interpret too early until
       the file actually gets to xa itself for processing.

       The following preprocessor directives are supported.

       #include "filename"
              Inserts the contents of file filename at this position. If
              the file is not found, it is searched using paths specified
              by the -I command line option or the environment variable
              XAINPUT (q.v.). When inserted, the file will also be parsed
              for preprocessor directives.

       #echo comment
              Inserts comment comment into the errorlog file, specified
              with the -e command line option.

       #print expression
              Computes the value of expression expression and prints it
              into the errorlog file.

       #define DEFINE text
              Equates macro DEFINE with text text such that wherever
              DEFINE appears in the assembly source, text is substituted
              in its place (just like cpp(1) would do). In addition,
              #define can specify macro functions like cpp(1) such that a
              directive like #define mult(a,b) ((a)*(b)) would generate
              the expected result wherever an expression of the form
              mult(a,b) appears in the source. This can also be specified
              on the command line with the -D option. The arguments of a
              macro function may be recursively evaluated, unlike other
              #defines; the preprocessor will attempt to re-evaluate any
              argument referencing another preprocessor definition up to
              ten times before complaining.
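
       As a sketch of both kinds of #define in use (the macro name BORDER
       is hypothetical; mult is the macro function from the description
       above, and the indentation is for display only):

              #define BORDER $d020
              #define mult(a,b) ((a)*(b))

                      lda #mult(2,3)  ; expands to lda #((2)*(3)), i.e. lda #6
                      sta BORDER      ; expands to sta $d020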

       The following directives are conditionals. If the conditional is
       not satisfied, then the source code between the directive and its
       terminating #endif is expunged and not assembled. Up to fifteen
       levels of nesting are supported.

       #endif Closes a conditional block.

       #else  Implements alternate path for a conditional block.

       #ifdef DEFINE
              True only if macro DEFINE is defined.

       #ifndef DEFINE
              The opposite; true only if macro DEFINE has not been
              previously defined.

       #if expression
              True if expression expression evaluates to non-zero.
              expression may reference other macros.

       #iflused label
              True if label label has been used (but not necessarily
              instantiated with a value). This works on labels, not
              macros!

       #ifldef label
              True if label label is defined and assigned with a value.
              This works on labels, not macros!

       Unclosed conditional blocks at the end of included files generate
       warnings; unclosed conditional blocks at the end of assembly
       generate an error.
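
       For instance, a build-time switch might be sketched as follows,
       where DEBUG is a hypothetical macro name that could be supplied on
       the command line with -DDEBUG=1:

              #ifdef DEBUG
                      lda #2          ; assembled only when DEBUG is defined
              #else
                      lda #0          ; assembled otherwise
              #endif
                      sta $d020       ; assembled in either case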

       #iflused and #ifldef are useful for building up a library based on
       labels. For example, you might use something like this in your
       library's code:

              #iflused label
              #ifldef label
              #echo label already defined, library function label cannot be inserted
              #else
              label /* your code */
              #endif
              #endif

LINKING
       xa is oriented towards generating sequential binaries. Code is
       strictly emitted in order even if the program counter is set to a
       lower location than previously assembled code, and padding is not
       automatically emitted if the program counter is set to a higher
       location. Changing the program location only changes new labels for
       code that is subsequently emitted; previously emitted code remains
       unchanged. Fortunately, for many object files these conventions
       have no effect on their generation.

       However, some applications may require generating an object file
       built from several previously generated components, and/or
       submodules which may need to be present at specific memory
       locations. With a minor amount of additional specification, it is
       possible to use xa for this purpose as well.

       The first means of doing so uses the o65 format to make relocatable
       objects that in turn can be linked by ldo65(1) (q.v.).

       The second means involves either assembled code, or insertion of
       previously built object or data files with .bin, using .dsb
       pseudo-ops with computed expression arguments to insert any
       necessary padding between them, in the sequential order they are to
       reside in memory. Consider this example:

                  .word $1000
                  * = $1000

                  ; this is your code at $1000
              part1       rts
                  ; this label marks the end of code
              endofpart1

                  ; DON'T PUT A NEW .word HERE!
                  * = $2000
                  .dsb (*-endofpart1), 0
                  ; yes, set it again
                  * = $2000

                  ; this is your code at $2000
              part2       rts

       This example, written for Commodore microcomputers using a 16-bit
       starting address, has two "modules" in it: one block of code at
       $1000 (4096), indicated by the code between labels part1 and
       endofpart1, and a second block at $2000 (8192) starting at label
       part2.

       The padding is computed by the .dsb pseudo-op between the two
       modules. Note that the program counter is set to the new address
       and then a computed expression inserts the proper number of fill
       bytes from the end of the assembled code in part 1 up to the new
       program counter address. Since this itself advances the program
       counter, the program counter is reset again, and assembly
       continues.

       When the object this source file generates is loaded, there will be
       an rts instruction at address 4096 and another at address 8192,
       with null bytes between them.

       Should one of these areas need to contain a pre-built file, instead
       of assembly code, simply use a .bin pseudo-op to load whatever
       portions of the file are required into the output. The computation
       of addresses and number of necessary fill bytes is done in the same
       fashion.
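
       As a sketch of that variation, the second module below is taken
       from a pre-built file rather than assembled in place. The filename
       part2.bin is hypothetical, and the file is assumed to contain code
       built to run at $2000:

                      .word $1000
                      * = $1000
              part1   rts
              endofpart1

                      * = $2000
                      .dsb (*-endofpart1), 0  ; $2000-$1001 = $0fff fill bytes
                      * = $2000
                      .bin 0,0,"part2.bin"    ; whole file: offset 0, length 0

       Here part1 occupies the single byte at $1000, so endofpart1 is
       $1001 and the padding runs from $1001 up to, but not including,
       $2000.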

       Although this example used the program counter itself to compute
       the difference between addresses, you can use any label for this
       purpose, keeping in mind that only the program counter determines
       where relative addresses within assembled code are resolved.

ENVIRONMENT
       xa utilises the following environment variables, if they exist:

       XAINPUT
              Include file path; components should be separated by `,'.

       XAOUTPUT
              Output file path.

NOTES'N'BUGS
       The R65C02 instructions ina (often rendered inc a) and dea (dec a)
       must be rendered as bare inc and dec instructions respectively.

       The 65816 instructions mvn and mvp use two 8-bit parameters, the
       only instructions in the entire instruction set to do so. Older
       versions of xa took a single 16-bit absolute value. Since 2.3.7,
       the standard syntax is now accepted and the old syntax is
       deprecated (a warning will be generated).

       Forward-defined labels -- that is, labels that are defined after
       the current instruction is processed -- cannot be optimized into
       zero page instructions even if the label does end up being defined
       as a zero page location, because the assembler does not know the
       value of the label in advance during the first pass when the length
       of an instruction is computed. On the second pass, a warning will
       be issued when an instruction that could have been optimized can't
       be because of this limitation. (Obviously, this does not apply to
       branching or jumping instructions because they're not optimizable
       anyhow, and those instructions that can only take an 8-bit
       parameter will always be cast to an 8-bit quantity.) If the label
       cannot otherwise be defined ahead of the instruction, the backtick
       prefix ` may be used to force further optimization no matter where
       the label is defined as long as the instruction supports it.
       Indiscriminately forcing the issue can be fraught with peril,
       however, and is not recommended; to discourage this, the assembler
       will complain about its use in addressing mode situations where no
       ambiguity exists, such as indirect indexed, branching and so on.
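
       The effect of definition order can be sketched as follows; the
       label names early and late are hypothetical, and the instruction
       lengths in the comments follow from the description above:

              early   = $10
                      lda early       ; defined first: 2-byte zero page form
                      lda late        ; forward reference: 3-byte absolute form
              late    = $20

       Moving the definition of late above its first use avoids both the
       longer instruction and the second-pass warning, without resorting
       to the backtick prefix.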

       Also, as a further consequence of the way optimization is managed,
       we repeat that all 24-bit quantities and labels that reference a
       24-bit quantity in 65816 mode, anteriorly declared or otherwise,
       MUST be prepended with the @ prefix. Otherwise, the assembler will
       attempt to optimize to 16 bits, which may be undesirable.
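
       A minimal sketch of the difference, assuming assembly with the -w
       option (the bytes in the comments are the standard 65816 encodings
       for lda in long and absolute addressing):

              lda @$012345    ; 24-bit long form: $af $45 $23 $01
              lda $2345       ; 16-bit absolute form: $ad $45 $23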

IMMINENT DEPRECATION
       The following options and modes will be REMOVED in 2.4 and later
       versions of xa:

       -x

       -S

       the original mvn $xxxx syntax

SEE ALSO
       file65(1), ldo65(1), printcbm(1), reloc65(1), uncpk(1), dxa(1)

AUTHOR
       This manual page was written by David Weinehall <tao@acc.umu.se>,
       Andre Fachat <fachat@web.de> and Cameron Kaiser
       <ckaiser@floodgap.com>. Original xa package (C)1989-1997 Andre
       Fachat. Additional changes (C)1989-2023 Andre Fachat, Jolse
       Maginnis, David Weinehall, Cameron Kaiser. The official maintainer
       is Cameron Kaiser.

BUGS
       Yay us?

WEBSITE
       http://www.floodgap.com/retrotech/xa/

                              24 November 2021                       XA(1)