summaryrefslogtreecommitdiff
path: root/sys/doc/asm.ms
diff options
context:
space:
mode:
authoraiju <aiju@phicode.de>2011-07-18 11:01:22 +0200
committeraiju <aiju@phicode.de>2011-07-18 11:01:22 +0200
commit8c4c1f39f4e369d7c590c9d119f1150a2215e56d (patch)
treecd430740860183fc01de1bc1ddb216ceff1f7173 /sys/doc/asm.ms
parent11bf57fb2ceb999e314cfbe27a4e123bf846d2c8 (diff)
added /sys/doc
Diffstat (limited to 'sys/doc/asm.ms')
-rw-r--r--sys/doc/asm.ms1431
1 files changed, 1431 insertions, 0 deletions
diff --git a/sys/doc/asm.ms b/sys/doc/asm.ms
new file mode 100644
index 000000000..655dbb93b
--- /dev/null
+++ b/sys/doc/asm.ms
@@ -0,0 +1,1431 @@
+.HTML "A Manual for the Plan 9 assembler
+.ft CW
+.ta 8n +8n +8n +8n +8n +8n +8n
+.ft
+.TL
+A Manual for the Plan 9 assembler
+.AU
+Rob Pike
+rob@plan9.bell-labs.com
+.SH
+Machines
+.PP
+There is an assembler for each of the MIPS, SPARC, Intel 386,
+Intel 960, AMD 29000, Motorola 68020 and 68000, Motorola Power PC,
+AMD64, DEC Alpha, and Acorn ARM.
+The 68020 assembler,
+.CW 2a ,
+is the oldest and in many ways the prototype.
+The assemblers are really just variations of a single program:
+they share many properties such as left-to-right assignment order for
+instruction operands and the synthesis of macro instructions
+such as
+.CW MOVE
+to hide the peculiarities of the load and store structure of the machines.
+To keep things concrete, the first part of this manual is
+specifically about the 68020.
+At the end is a description of the differences among
+the other assemblers.
+.PP
+The document, ``How to Use the Plan 9 C Compiler'', by Rob Pike,
+is a prerequisite for this manual.
+.SH
+Registers
+.PP
+All pre-defined symbols in the assembler are upper-case.
+Data registers are
+.CW R0
+through
+.CW R7 ;
+address registers are
+.CW A0
+through
+.CW A7 ;
+floating-point registers are
+.CW F0
+through
+.CW F7 .
+.PP
+A pointer in
+.CW A6
+is used by the C compiler to point to data, enabling short addresses to
+be used more often.
+The value of
+.CW A6
+is constant and must be set during C program initialization
+to the address of the externally-defined symbol
+.CW a6base .
+.PP
+The following hardware registers are defined in the assembler; their
+meaning should be obvious given a 68020 manual:
+.CW CAAR ,
+.CW CACR ,
+.CW CCR ,
+.CW DFC ,
+.CW ISP ,
+.CW MSP ,
+.CW SFC ,
+.CW SR ,
+.CW USP ,
+and
+.CW VBR .
+.PP
+The assembler also defines several pseudo-registers that
+manipulate the stack:
+.CW FP ,
+.CW SP ,
+and
+.CW TOS .
+.CW FP
+is the frame pointer, so
+.CW 0(FP)
+is the first argument,
+.CW 4(FP)
+is the second, and so on.
+.CW SP
+is the local stack pointer, where automatic variables are held
+(SP is a pseudo-register only on the 68020);
+.CW 0(SP)
+is the first automatic, and so on as with
+.CW FP .
+Finally,
+.CW TOS
+is the top-of-stack register, used for pushing parameters to procedures,
+saving temporary values, and so on.
+.PP
+The assembler and loader track these pseudo-registers so
+the above statements are true regardless of what has been
+pushed on the hardware stack, pointed to by
+.CW A7 .
+The name
+.CW A7
+refers to the hardware stack pointer, but beware of mixed use of
+.CW A7
+and the above stack-related pseudo-registers, which will cause trouble.
+Note, too, that the
+.CW PEA
+instruction is observed by the loader to
+alter SP and thus will insert a corresponding pop before all returns.
+The assembler accepts a label-like name to be attached to
+.CW FP
+and
+.CW SP
+uses, such as
+.CW p+0(FP) ,
+to help document that
+.CW p
+is the first argument to a routine.
+The name goes in the symbol table but has no significance to the result
+of the program.
+.SH
+Referring to data
+.PP
+All external references must be made relative to some pseudo-register,
+either
+.CW PC
+(the virtual program counter) or
+.CW SB
+(the ``static base'' register).
+.CW PC
+counts instructions, not bytes of data.
+For example, to branch to the second following instruction, that is,
+to skip one instruction, one may write
+.P1
+ BRA 2(PC)
+.P2
+Labels are also allowed, as in
+.P1
+ BRA return
+ NOP
+return:
+ RTS
+.P2
+When using labels, there is no
+.CW (PC)
+annotation.
+.PP
+The pseudo-register
+.CW SB
+refers to the beginning of the address space of the program.
+Thus, references to global data and procedures are written as
+offsets to
+.CW SB ,
+as in
+.P1
+ MOVL $array(SB), TOS
+.P2
+to push the address of a global array on the stack, or
+.P1
+ MOVL array+4(SB), TOS
+.P2
+to push the second (4-byte) element of the array.
+Note the use of an offset; the complete list of addressing modes is given below.
+Similarly, subroutine calls must use
+.CW SB :
+.P1
+ BSR exit(SB)
+.P2
+File-static variables have syntax
+.P1
+ local<>+4(SB)
+.P2
+The
+.CW <>
+will be filled in at load time by a unique integer.
+.PP
+When a program starts, it must execute
+.P1
+ MOVL $a6base(SB), A6
+.P2
+before accessing any global data.
+(On machines such as the MIPS and SPARC that cannot load a register
+in a single instruction, constants are loaded through the static base
+register. The loader recognizes code that initializes the static
+base register and treats it specially. You must be careful, however,
+not to load large constants on such machines when the static base
+register is not set up, such as early in interrupt routines.)
+.SH
+Expressions
+.PP
+Expressions are mostly what one might expect.
+Where an offset or a constant is expected,
+a primary expression with unary operators is allowed.
+A general C constant expression is allowed in parentheses.
+.PP
+Source files are preprocessed exactly as in the C compiler, so
+.CW #define
+and
+.CW #include
+work.
+.SH
+Addressing modes
+.PP
+The simple addressing modes are shared by all the assemblers.
+Here, for completeness, follows a table of all the 68020 addressing modes,
+since that machine has the richest set.
+In the table,
+.CW o
+is an offset, which if zero may be elided, and
+.CW d
+is a displacement, which is a constant between -128 and 127 inclusive.
+Many of the modes listed have the same name;
+scrutiny of the format will show what default is being applied.
+For instance, indexed mode with no address register supplied operates
+as though a zero-valued register were used.
+For "offset" read "displacement."
+For "\f(CW.s\fP" read one of
+.CW .L ,
+or
+.CW .W
+followed by
+.CW *1 ,
+.CW *2 ,
+.CW *4 ,
+or
+.CW *8
+to indicate the size and scaling of the data.
+.IP
+.TS
+l lfCW.
+data register R0
+address register A0
+floating-point register F0
+special names CAAR, CACR, etc.
+constant $con
+floating point constant $fcon
+external symbol name+o(SB)
+local symbol name<>+o(SB)
+automatic symbol name+o(SP)
+argument name+o(FP)
+address of external $name+o(SB)
+address of local $name<>+o(SB)
+indirect post-increment (A0)+
+indirect pre-decrement -(A0)
+indirect with offset o(A0)
+indexed with offset o()(R0.s)
+indexed with offset o(A0)(R0.s)
+external indexed name+o(SB)(R0.s)
+local indexed name<>+o(SB)(R0.s)
+automatic indexed name+o(SP)(R0.s)
+parameter indexed name+o(FP)(R0.s)
+offset indirect post-indexed d(o())(R0.s)
+offset indirect post-indexed d(o(A0))(R0.s)
+external indirect post-indexed d(name+o(SB))(R0.s)
+local indirect post-indexed d(name<>+o(SB))(R0.s)
+automatic indirect post-indexed d(name+o(SP))(R0.s)
+parameter indirect post-indexed d(name+o(FP))(R0.s)
+offset indirect pre-indexed d(o()(R0.s))
+offset indirect pre-indexed d(o(A0))
+offset indirect pre-indexed d(o(A0)(R0.s))
+external indirect pre-indexed d(name+o(SB))
+external indirect pre-indexed d(name+o(SB)(R0.s))
+local indirect pre-indexed d(name<>+o(SB))
+local indirect pre-indexed d(name<>+o(SB)(R0.s))
+automatic indirect pre-indexed d(name+o(SP))
+automatic indirect pre-indexed d(name+o(SP)(R0.s))
+parameter indirect pre-indexed d(name+o(FP))
+parameter indirect pre-indexed d(name+o(FP)(R0.s))
+.TE
+.in
+.SH
+Laying down data
+.PP
+Placing data in the instruction stream, say for interrupt vectors, is easy:
+the pseudo-instructions
+.CW LONG
+and
+.CW WORD
+(but not
+.CW BYTE )
+lay down the value of their single argument, of the appropriate size,
+as if it were an instruction:
+.P1
+ LONG $12345
+.P2
+places the long 12345 (base 10)
+in the instruction stream.
+(On most machines,
+the only such operator is
+.CW WORD
+and it lays down 32-bit quantities.
+The 386 has all three:
+.CW LONG ,
+.CW WORD ,
+and
+.CW BYTE .
+The AMD64 adds
+.CW QUAD
+to that for 64-bit values.
+The 960 has only one,
+.CW LONG .)
+.PP
+Placing information in the data section is more painful.
+The pseudo-instruction
+.CW DATA
+does the work, given two arguments: an address at which to place the item,
+including its size,
+and the value to place there. For example, to define a character array
+.CW array
+containing the characters
+.CW abc
+and a terminating null:
+.P1
+ DATA array+0(SB)/1, $'a'
+ DATA array+1(SB)/1, $'b'
+ DATA array+2(SB)/1, $'c'
+ GLOBL array(SB), $4
+.P2
+or
+.P1
+ DATA array+0(SB)/4, $"abc\ez"
+ GLOBL array(SB), $4
+.P2
+The
+.CW /1
+defines the number of bytes to define,
+.CW GLOBL
+makes the symbol global, and the
+.CW $4
+says how many bytes the symbol occupies.
+Uninitialized data is zeroed automatically.
+The character
+.CW \ez
+is equivalent to the C
+.CW \e0.
+The string in a
+.CW DATA
+statement may contain a maximum of eight bytes;
+build larger strings piecewise.
+Two pseudo-instructions,
+.CW DYNT
+and
+.CW INIT ,
+allow the (obsolete) Alef compilers to build dynamic type information during the load
+phase.
+The
+.CW DYNT
+pseudo-instruction has two forms:
+.P1
+ DYNT , ALEF_SI_5+0(SB)
+ DYNT ALEF_AS+0(SB), ALEF_SI_5+0(SB)
+.P2
+In the first form,
+.CW DYNT
+defines the symbol to be a small unique integer constant, chosen by the loader,
+which is some multiple of the word size. In the second form,
+.CW DYNT
+defines the second symbol in the same way,
+places the address of the most recently
+defined text symbol in the array specified by the first symbol at the
+index defined by the value of the second symbol,
+and then adjusts the size of the array accordingly.
+.PP
+The
+.CW INIT
+pseudo-instruction takes the same parameters as a
+.CW DATA
+statement. Its symbol is used as the base of an array and the
+data item is installed in the array at the offset specified by the most recent
+.CW DYNT
+pseudo-instruction.
+The size of the array is adjusted accordingly.
+The
+.CW DYNT
+and
+.CW INIT
+pseudo-instructions are not implemented on the 68020.
+.SH
+Defining a procedure
+.PP
+Entry points are defined by the pseudo-operation
+.CW TEXT ,
+which takes as arguments the name of the procedure (including the ubiquitous
+.CW (SB) )
+and the number of bytes of automatic storage to pre-allocate on the stack,
+which will usually be zero when writing assembly language programs.
+On machines with a link register, such as the MIPS and SPARC,
+the special value -4 instructs the loader to generate no PC save
+and restore instructions, even if the function is not a leaf.
+Here is a complete procedure that returns the sum
+of its two arguments:
+.P1
+TEXT sum(SB), $0
+ MOVL arg1+0(FP), R0
+ ADDL arg2+4(FP), R0
+ RTS
+.P2
+An optional middle argument
+to the
+.CW TEXT
+pseudo-op is a bit field of options to the loader.
+Setting the 1 bit suspends profiling the function when profiling is enabled for the rest of
+the program.
+For example,
+.P1
+TEXT sum(SB), 1, $0
+ MOVL arg1+0(FP), R0
+ ADDL arg2+4(FP), R0
+ RTS
+.P2
+will not be profiled; the first version above would be.
+Subroutines with peculiar state, such as system call routines,
+should not be profiled.
+.PP
+Setting the 2 bit allows multiple definitions of the same
+.CW TEXT
+symbol in a program; the loader will place only one such function in the image.
+It was emitted only by the Alef compilers.
+.PP
+Subroutines to be called from C should place their result in
+.CW R0 ,
+even if it is an address.
+Floating point values are returned in
+.CW F0 .
+Functions that return a structure to a C program
+receive as their first argument the address of the location to
+store the result;
+.CW R0
+is unused in the calling protocol for such procedures.
+A subroutine is responsible for saving its own registers,
+and therefore is free to use any registers without saving them (``caller saves'').
+.CW A6
+and
+.CW A7
+are the exceptions as described above.
+.SH
+When in doubt
+.PP
+If you get confused, try using the
+.CW -S
+option to
+.CW 2c
+and compiling a sample program.
+The standard output is valid input to the assembler.
+.SH
+Instructions
+.PP
+The instruction set of the assembler is not identical to that
+of the machine.
+It is chosen to match what the compiler generates, augmented
+slightly by specific needs of the operating system.
+For example,
+.CW 2a
+does not distinguish between the various forms of
+.CW MOVE
+instruction: move quick, move address, etc. Instead the context
+does the job. For example,
+.P1
+ MOVL $1, R1
+ MOVL A0, R2
+ MOVW SR, R3
+.P2
+generates official
+.CW MOVEQ ,
+.CW MOVEA ,
+and
+.CW MOVESR
+instructions.
+A number of instructions do not have the syntax necessary to specify
+their entire capabilities. Notable examples are the bitfield
+instructions, the
+multiply and divide instructions, etc.
+For a complete set of generated instruction names (in
+.CW 2a
+notation, not Motorola's) see the file
+.CW /sys/src/cmd/2c/2.out.h .
+Despite its name, this file contains an enumeration of the
+instructions that appear in the intermediate files generated
+by the compiler, which correspond exactly to lines of assembly language.
+.PP
+The MC68000 assembler,
+.CW 1a ,
+is essentially the same, honoring the appropriate subset of the instructions
+and addressing modes.
+The definitions of these are, nonetheless, part of
+.CW 2.out.h .
+.SH
+Laying down instructions
+.PP
+The loader modifies the code produced by the assembler and compiler.
+It folds branches,
+copies short sequences of code to eliminate branches,
+and discards unreachable code.
+The first instruction of every function is assumed to be reachable.
+The pseudo-instruction
+.CW NOP ,
+which you may see in compiler output,
+means no instruction at all, rather than an instruction that does nothing.
+The loader discards all
+.CW NOP 's.
+.PP
+To generate a true
+.CW NOP
+instruction, or any other instruction not known to the assembler, use a
+.CW WORD
+pseudo-instruction.
+Such instructions on RISCs are not scheduled by the loader and must have
+their delay slots filled manually.
+.SH
+MIPS
+.PP
+The registers are only addressed by number:
+.CW R0
+through
+.CW R31 .
+.CW R29
+is the stack pointer;
+.CW R30
+is used as the static base pointer, the analogue of
+.CW A6
+on the 68020.
+Its value is the address of the global symbol
+.CW setR30(SB) .
+The register holding returned values from subroutines is
+.CW R1 .
+When a function is called, space for the first argument
+is reserved at
+.CW 0(FP)
+but in C (not Alef) the value is passed in
+.CW R1
+instead.
+.PP
+The loader uses
+.CW R28
+as a temporary. The system uses
+.CW R26
+and
+.CW R27
+as interrupt-time temporaries. Therefore none of these registers
+should be used in user code.
+.PP
+The control registers are not known to the assembler.
+Instead they are numbered registers
+.CW M0 ,
+.CW M1 ,
+etc.
+Use this trick to access, say,
+.CW STATUS :
+.P1
+#define STATUS 12
+ MOVW M(STATUS), R1
+.P2
+.PP
+Floating point registers are called
+.CW F0
+through
+.CW F31 .
+By convention,
+.CW F24
+must be initialized to the value 0.0,
+.CW F26
+to 0.5,
+.CW F28
+to 1.0, and
+.CW F30
+to 2.0;
+this is done by the operating system.
+.PP
+The instructions and their syntax are different from those of the manufacturer's
+manual.
+There are no
+.CW lui
+and kin; instead there are
+.CW MOVW
+(move word),
+.CW MOVH
+(move halfword),
+and
+.CW MOVB
+(move byte) pseudo-instructions. If the operand is unsigned, the instructions
+are
+.CW MOVHU
+and
+.CW MOVBU .
+The order of operands is from left to right in dataflow order, just as
+on the 68020 but not as in MIPS documentation.
+This means that the
+.CW Bcond
+instructions are reversed with respect to the book; for example, a
+.CW va
+.CW BGTZ
+generates a MIPS
+.CW bltz
+instruction.
+.PP
+The assembler is for the R2000, R3000, and most of the R4000 and R6000 architectures.
+It understands the 64-bit instructions
+.CW MOVV ,
+.CW MOVVL ,
+.CW ADDV ,
+.CW ADDVU ,
+.CW SUBV ,
+.CW SUBVU ,
+.CW MULV ,
+.CW MULVU ,
+.CW DIVV ,
+.CW DIVVU ,
+.CW SLLV ,
+.CW SRLV ,
+and
+.CW SRAV .
+The assembler does not have any cache, load-linked, or store-conditional instructions.
+.PP
+Some assembler instructions are expanded into multiple instructions by the loader.
+For example the loader may convert the load of a 32 bit constant into an
+.CW lui
+followed by an
+.CW ori .
+.PP
+Assembler instructions should be laid out as if there
+were no load, branch, or floating point compare delay slots;
+the loader will rearrange\(em\f2schedule\f1\(emthe instructions
+to guarantee correctness and improve performance.
+The only exception is that the correct scheduling of instructions
+that use control registers varies from model to model of machine
+(and is often undocumented) so you should schedule such instructions
+by hand to guarantee correct behavior.
+The loader generates
+.P1
+ NOR R0, R0, R0
+.P2
+when it needs a true no-op instruction.
+Use exactly this instruction when scheduling code manually;
+the loader recognizes it and schedules the code before it and after it independently. Also,
+.CW WORD
+pseudo-ops are scheduled like no-ops.
+.PP
+The
+.CW NOSCHED
+pseudo-op disables instruction scheduling
+(scheduling is enabled by default);
+.CW SCHED
+re-enables it.
+Branch folding, code copying, and dead code elimination are
+disabled for instructions that are not scheduled.
+.SH
+SPARC
+.PP
+Once you understand the Plan 9 model for the MIPS, the SPARC is familiar.
+Registers have numerical names only:
+.CW R0
+through
+.CW R31 .
+Forget about register windows: Plan 9 doesn't use them at all.
+The machine has 32 global registers, period.
+.CW R1
+[sic] is the stack pointer.
+.CW R2
+is the static base register, with value the address of
+.CW setSB(SB) .
+.CW R7
+is the return register and also the register holding the first
+argument to a C (not Alef) function, again with space reserved at
+.CW 0(FP) .
+.CW R14
+is the loader temporary.
+.PP
+Floating-point registers are exactly as on the MIPS.
+.PP
+The control registers are known by names such as
+.CW FSR .
+The instructions to access these registers are
+.CW MOVW
+instructions, for example
+.P1
+ MOVW Y, R8
+.P2
+for the SPARC instruction
+.P1
+ rdy %r8
+.P2
+.PP
+Move instructions are similar to those on the MIPS: pseudo-operations
+that turn into appropriate sequences of
+.CW sethi
+instructions, adds, etc.
+Instructions read from left to right. Because the arguments are
+flipped to
+.CW SUBCC ,
+the condition codes are not inverted as on the MIPS.
+.PP
+The syntax for the ASI stuff is, for example to move a word from ASI 2:
+.P1
+ MOVW (R7, 2), R8
+.P2
+The syntax for double indexing is
+.P1
+ MOVW (R7+R8), R9
+.P2
+.PP
+The SPARC's instruction scheduling is similar to the MIPS's.
+The official no-op instruction is:
+.P1
+ ORN R0, R0, R0
+.P2
+.SH
+i960
+.PP
+Registers are numbered
+.CW R0
+through
+.CW R31 .
+Stack pointer is
+.CW R29 ;
+return register is
+.CW R4 ;
+static base is
+.CW R28 ;
+it is initialized to the address of
+.CW setSB(SB) .
+.CW R3
+must be zero; this should be done manually early in execution by
+.P1
+ SUBO R3, R3
+.P2
+.CW R27
+is the loader temporary.
+.PP
+There is no support for floating point.
+.PP
+The Intel calling convention is not supported and cannot be used; use
+.CW BAL
+instead.
+Instructions are mostly as in the book. The major change is that
+.CW LOAD
+and
+.CW STORE
+are both called
+.CW MOV .
+The extension character for
+.CW MOV
+is as in the manual:
+.CW O
+for ordinal,
+.CW W
+for signed, etc.
+.SH
+i386
+.PP
+The assembler assumes 32-bit protected mode.
+The register names are
+.CW SP ,
+.CW AX ,
+.CW BX ,
+.CW CX ,
+.CW DX ,
+.CW BP ,
+.CW DI ,
+and
+.CW SI .
+The stack pointer (not a pseudo-register) is
+.CW SP
+and the return register is
+.CW AX .
+There is no physical frame pointer but, as for the MIPS,
+.CW FP
+is a pseudo-register that acts as
+a frame pointer.
+.PP
+Opcode names are mostly the same as those listed in the Intel manual
+with an
+.CW L ,
+.CW W ,
+or
+.CW B
+appended to identify 32-bit,
+16-bit, and 8-bit operations.
+The exceptions are loads, stores, and conditionals.
+All load and store opcodes to and from general registers, special registers
+(such as
+.CW CR0,
+.CW CR3,
+.CW GDTR,
+.CW IDTR,
+.CW SS,
+.CW CS,
+.CW DS,
+.CW ES,
+.CW FS,
+and
+.CW GS )
+or memory are written
+as
+.P1
+ MOV\f2x\fP src,dst
+.P2
+where
+.I x
+is
+.CW L ,
+.CW W ,
+or
+.CW B .
+Thus to get
+.CW AL
+use a
+.CW MOVB
+instruction. If you need to access
+.CW AH ,
+you must mention it explicitly in a
+.CW MOVB :
+.P1
+ MOVB AH, BX
+.P2
+There are many examples of illegal moves, for example,
+.P1
+ MOVB BP, DI
+.P2
+that the loader actually implements as pseudo-operations.
+.PP
+The names of conditions in all conditional instructions
+.CW J , (
+.CW SET )
+follow the conventions of the 68020 instead of those of the Intel
+assembler:
+.CW JOS ,
+.CW JOC ,
+.CW JCS ,
+.CW JCC ,
+.CW JEQ ,
+.CW JNE ,
+.CW JLS ,
+.CW JHI ,
+.CW JMI ,
+.CW JPL ,
+.CW JPS ,
+.CW JPC ,
+.CW JLT ,
+.CW JGE ,
+.CW JLE ,
+and
+.CW JGT
+instead of
+.CW JO ,
+.CW JNO ,
+.CW JB ,
+.CW JNB ,
+.CW JZ ,
+.CW JNZ ,
+.CW JBE ,
+.CW JNBE ,
+.CW JS ,
+.CW JNS ,
+.CW JP ,
+.CW JNP ,
+.CW JL ,
+.CW JNL ,
+.CW JLE ,
+and
+.CW JNLE .
+.PP
+The addressing modes have syntax like
+.CW AX ,
+.CW (AX) ,
+.CW (AX)(BX*4) ,
+.CW 10(AX) ,
+and
+.CW 10(AX)(BX*4) .
+The offsets from
+.CW AX
+can be replaced by offsets from
+.CW FP
+or
+.CW SB
+to access names, for example
+.CW extern+5(SB)(AX*2) .
+.PP
+Other notes: Non-relative
+.CW JMP
+and
+.CW CALL
+have a
+.CW *
+added to the syntax.
+Only
+.CW LOOP ,
+.CW LOOPEQ ,
+and
+.CW LOOPNE
+are legal loop instructions. Only
+.CW REP
+and
+.CW REPN
+are recognized repeaters. These are not prefixes, but rather
+stand-alone opcodes that precede the strings, for example
+.P1
+ CLD; REP; MOVSL
+.P2
+Segment override prefixes in
+.CW MOD/RM
+fields are not supported.
+.SH
+AMD64
+.PP
+The assembler assumes 64-bit mode unless a
+.CW MODE
+pseudo-operation is given:
+.P1
+ MODE $32
+.P2
+to change to 32-bit mode.
+The effect is mainly to diagnose instructions that are illegal in
+the given mode, but the loader will also assume 32-bit operands and addresses,
+and 32-bit PC values for call and return.
+The assembler's conventions are similar to those for the 386, above.
+The architecture provides extra fixed-point registers
+.CW R8
+to
+.CW R15 .
+All registers are 64 bit, but instructions access low-order 8, 16 and 32 bits
+as described in the processor handbook.
+For example,
+.CW MOVL
+to
+.CW AX
+puts a value in the low-order 32 bits and clears the top 32 bits to zero.
+Literal operands are limited to signed 32 bit values, which are sign-extended
+to 64 bits in 64 bit operations; the exception is
+.CW MOVQ ,
+which allows 64-bit literals.
+The external registers in Plan 9's C are allocated from
+.CW R15
+down.
+There are many new instructions, including the MMX and XMM media instructions,
+and conditional move instructions.
+MMX registers are
+.CW M0
+to
+.CW M7 ,
+and
+XMM registers are
+.CW X0
+to
+.CW X15 .
+As with the 386 instruction names,
+all new 64-bit integer instructions, and the MMX and XMM instructions
+uniformly use
+.CW L
+for `long word' (32 bits) and
+.CW Q
+for `quad word' (64 bits).
+Some instructions use
+.CW O
+(`octword') for 128-bit values, where the processor handbook
+variously uses
+.CW O
+or
+.CW DQ .
+The assembler also consistently uses
+.CW PL
+for `packed long' in
+XMM instructions, instead of
+.CW Q ,
+.CW DQ
+or
+.CW PI .
+Either
+.CW MOVL
+or
+.CW MOVQ
+can be used to move values to and from control registers, even when
+the registers might be 64 bits.
+The assembler often accepts the handbook's name to ease conversion
+of existing code (but remember that the operand order is uniformly
+source then destination).
+C's
+.CW "long long"
+type is 64 bits, but passed and returned by value, not by reference.
+More notably, C pointer values are 64 bits, and thus
+.CW "long long"
+and
+.CW "unsigned long long"
+are the only integer types wide enough to hold a pointer value.
+The C compiler and library use the XMM floating-point instructions, not
+the old 387 ones, although the latter are implemented by assembler and loader.
+Unlike the 386, the first integer or pointer argument is passed in a register, which is
+.CW BP
+for an integer or pointer (it can be referred to in assembly code by the pseudonym
+.CW RARG ).
+.CW AX
+holds the return value from subroutines as before.
+Floating-point results are returned in
+.CW X0 ,
+although currently the first floating-point parameter is not passed in a register.
+All parameters less than 8 bytes in length have 8 byte slots reserved on the stack
+to preserve alignment and simplify variable-length argument list access,
+including the first parameter when passed in a register,
+even though bytes 4 to 7 are not initialized.
+.SH
+Alpha
+.PP
+On the Alpha, all registers are 64 bits. The architecture handles 32-bit values
+by giving them a canonical format (sign extension in the case of integer registers).
+Registers are numbered
+.CW R0
+through
+.CW R31 .
+.CW R0
+holds the return value from subroutines, and also the first parameter.
+.CW R30
+is the stack pointer,
+.CW R29
+is the static base,
+.CW R26
+is the link register, and
+.CW R27
+and
+.CW R28
+are linker temporaries.
+.PP
+Floating point registers are numbered
+.CW F0
+to
+.CW F31 .
+.CW F28
+contains
+.CW 0.5 ,
+.CW F29
+contains
+.CW 1.0 ,
+and
+.CW F30
+contains
+.CW 2.0 .
+.CW F31
+is always
+.CW 0.0
+on the Alpha.
+.PP
+The extension character for
+.CW MOV
+follows DEC's notation:
+.CW B
+for byte (8 bits),
+.CW W
+for word (16 bits),
+.CW L
+for long (32 bits),
+and
+.CW Q
+for quadword (64 bits).
+Byte and ``word'' loads and stores may be made unsigned
+by appending a
+.CW U .
+.CW S
+and
+.CW T
+refer to IEEE floating point single precision (32 bits) and double precision (64 bits), respectively.
+.SH
+Power PC
+.PP
+The Power PC follows the Plan 9 model set by the MIPS and SPARC,
+not the elaborate ABIs.
+The 32-bit instructions of the 60x and 8xx PowerPC architectures are supported;
+there is no support for the older POWER instructions.
+Registers are
+.CW R0
+through
+.CW R31 .
+.CW R0
+is initialized to zero; this is done by C start up code
+and assumed by the compiler and loader.
+.CW R1
+is the stack pointer.
+.CW R2
+is the static base register, with value the address of
+.CW setSB(SB) .
+.CW R3
+is the return register and also the register holding the first
+argument to a C function, with space reserved at
+.CW 0(FP)
+as on the MIPS.
+.CW R31
+is the loader temporary.
+The external registers in Plan 9's C are allocated from
+.CW R30
+down.
+.PP
+Floating point registers are called
+.CW F0
+through
+.CW F31 .
+By convention, several registers are initialized
+to specific values; this is done by the operating system.
+.CW F27
+must be initialized to the value
+.CW 0x4330000080000000
+(used by float-to-int conversion),
+.CW F28
+to the value 0.0,
+.CW F29
+to 0.5,
+.CW F30
+to 1.0, and
+.CW F31
+to 2.0.
+.PP
+As on the MIPS and SPARC, the assembler accepts arbitrary literals
+as operands to
+.CW MOVW ,
+and also to
+.CW ADD
+and others where `immediate' variants exist,
+and the loader generates sequences
+of
+.CW addi ,
+.CW addis ,
+.CW oris ,
+etc. as required.
+The register indirect addressing modes use the same syntax as the SPARC,
+including double indexing when allowed.
+.PP
+The instruction names are generally derived from the Motorola ones,
+subject to slight transformation:
+the
+.CW . ' `
+marking the setting of condition codes is replaced by
+.CW CC ,
+and when the letter
+.CW o ' `
+represents `OE=1' it is replaced by
+.CW V .
+Thus
+.CW add ,
+.CW addo.
+and
+.CW subfzeo.
+become
+.CW ADD ,
+.CW ADDVCC
+and
+.CW SUBFZEVCC .
+As well as the three-operand conditional branch instruction
+.CW BC ,
+the assembler provides pseudo-instructions for the common cases:
+.CW BEQ ,
+.CW BNE ,
+.CW BGT ,
+.CW BGE ,
+.CW BLT ,
+.CW BLE ,
+.CW BVC ,
+and
+.CW BVS .
+The unconditional branch instruction is
+.CW BR .
+Indirect branches use
+.CW "(CTR)"
+or
+.CW "(LR)"
+as target.
+.PP
+Load or store operations are replaced by
+.CW MOV
+variants in the usual way:
+.CW MOVW
+(move word),
+.CW MOVH
+(move halfword with sign extension), and
+.CW MOVB
+(move byte with sign extension, a pseudo-instruction),
+with unsigned variants
+.CW MOVHZ
+and
+.CW MOVBZ ,
+and byte-reversing
+.CW MOVWBR
+and
+.CW MOVHBR .
+`Load or store with update' versions are
+.CW MOVWU ,
+.CW MOVHU ,
+and
+.CW MOVBZU .
+Load or store multiple is
+.CW MOVMW .
+The exceptions are the string instructions, which are
+.CW LSW
+and
+.CW STSW ,
+and the reservation instructions
+.CW lwarx
+and
+.CW stwcx. ,
+which are
+.CW LWAR
+and
+.CW STWCCC ,
+all with operands in the usual data-flow order.
+Floating-point load or store instructions are
+.CW FMOVD ,
+.CW FMOVDU ,
+.CW FMOVS ,
+and
+.CW FMOVSU .
+The register to register move instructions
+.CW fmr
+and
+.CW fmr.
+are written
+.CW FMOVD
+and
+.CW FMOVDCC .
+.PP
+The assembler knows the commonly used special purpose registers:
+.CW CR ,
+.CW CTR ,
+.CW DEC ,
+.CW LR ,
+.CW MSR ,
+and
+.CW XER .
+The rest, which are often architecture-dependent, are referenced as
+.CW SPR(n) .
+The segment registers of the 60x series are similarly
+.CW SEG(n) ,
+but
+.I n
+can also be a register name, as in
+.CW SEG(R3) .
+Moves between special purpose registers and general purpose ones,
+when allowed by the architecture,
+are written as
+.CW MOVW ,
+replacing
+.CW mfcr ,
+.CW mtcr ,
+.CW mfmsr ,
+.CW mtmsr ,
+.CW mtspr ,
+.CW mfspr ,
+.CW mftb ,
+and many others.
+.PP
+The fields of the condition register
+.CW CR
+are referenced as
+.CW CR(0)
+through
+.CW CR(7) .
+They are used by the
+.CW MOVFL
+(move field) pseudo-instruction,
+which produces
+.CW mcrf
+or
+.CW mtcrf .
+For example:
+.P1
+ MOVFL CR(3), CR(0)
+ MOVFL R3, CR(1)
+ MOVFL R3, $7, CR
+.P2
+They are also accepted in
+the conditional branch instruction, for example
+.P1
+ BEQ CR(7), label
+.P2
+Fields of the
+.CW FPSCR
+are accessed using
+.CW MOVFL
+in a similar way:
+.P1
+ MOVFL FPSCR, F0
+ MOVFL F0, FPSCR
+ MOVFL F0, $7, FPSCR
+ MOVFL $0, FPSCR(3)
+.P2
+producing
+.CW mffs ,
+.CW mtfsf
+or
+.CW mtfsfi ,
+as appropriate.
+.SH
+ARM
+.PP
+The assembler provides access to
+.CW R0
+through
+.CW R14
+and the
+.CW PC .
+The stack pointer is
+.CW R13 ,
+the link register is
+.CW R14 ,
+and the static base register is
+.CW R12 .
+.CW R0
+is the return register and also the register holding
+the first argument to a subroutine.
+The assembler supports the
+.CW CPSR
+and
+.CW SPSR
+registers.
+It also knows about coprocessor registers
+.CW C0
+through
+.CW C15 .
+Floating registers are
+.CW F0
+through
+.CW F7 ,
+.CW FPSR
+and
+.CW FPCR .
+.PP
+As with the other architectures, loads and stores are called
+.CW MOV ,
+e.g.
+.CW MOVW
+for load word or store word, and
+.CW MOVM
+for
+load or store multiple,
+depending on the operands.
+.PP
+Addressing modes are supported by suffixes to the instructions:
+.CW .IA
+(increment after),
+.CW .IB
+(increment before),
+.CW .DA
+(decrement after), and
+.CW .DB
+(decrement before).
+These can only be used with the
+.CW MOV
+instructions.
+The move multiple instruction,
+.CW MOVM ,
+defines a range of registers using brackets, e.g.
+.CW [R0-R12] .
+The special
+.CW MOVM
+addressing mode bits
+.CW W ,
+.CW U ,
+and
+.CW P
+are written in the same manner, for example,
+.CW MOVM.DB.W .
+A
+.CW .S
+suffix allows a
+.CW MOVM
+instruction to access user
+.CW R13
+and
+.CW R14
+when in another processor mode.
+Shifts and rotates in addressing modes are supported by binary operators
+.CW <<
+(logical left shift),
+.CW >>
+(logical right shift),
+.CW ->
+(arithmetic right shift), and
+.CW @>
+(rotate right); for example
+.CW "R7>>R2" or
+.CW "R2@>2" .
+The assembler does not support indexing by a shifted expression;
+only names can be doubly indexed.
+.PP
+Any instruction can be followed by a suffix that makes the instruction conditional:
+.CW .EQ ,
+.CW .NE ,
+and so on, as in the ARM manual, with synonyms
+.CW .HS
+(for
+.CW .CS )
+and
+.CW .LO
+(for
+.CW .CC ),
+for example
+.CW ADD.NE .
+Arithmetic
+and logical instructions
+can have a
+.CW .S
+suffix, as ARM allows, to set condition codes.
+.PP
+The syntax of the
+.CW MCR
+and
+.CW MRC
+coprocessor instructions is largely as in the manual, with the usual adjustments.
+The assembler directly supports only the ARM floating-point coprocessor
+operations used by the compiler:
+.CW CMP ,
+.CW ADD ,
+.CW SUB ,
+.CW MUL ,
+and
+.CW DIV ,
+all with
+.CW F
+or
+.CW D
+suffix selecting single or double precision.
+Floating-point load or store become
+.CW MOVF
+and
+.CW MOVD .
+Conversion instructions are also specified by moves:
+.CW MOVWD ,
+.CW MOVWF ,
+.CW MOVDW ,
+.CW MOVWD ,
+.CW MOVFD ,
+and
+.CW MOVDF .
+.SH
+AMD 29000
+.PP
+For details about this assembly language, which was built for the AMD 29240,
+look at the sources or examine compiler output.