# -------------------------------------------------------------------
# INSTRUCTION SET                       (c) Copyright 1996 Nat! & KKP
# -------------------------------------------------------------------
# These are some of the results/guesses that Klaus and Nat! found
# out about the Jaguar. Since we are not under NDA or anything from
# Atari we feel free to give this to you for educational purposes
# only.
# Thanks to NEUROMANCER for most of the information contained in
# here.
#
# Please note, that this is not official documentation from Atari
# or derived work thereof (both of us have never seen the Atari docs)
# and Atari isn't connected with this in any way.
#
# Please use this informationphile as a starting point for your own
# exploration and not as a reference. If you find anything inaccurate,
# missing, needing more explanation etc. by all means please write
# to us:
#    nat@zumdick.ruhr.de
# or
#    kp@eegholm.dk
#
# If you could do us a small favor, don't use this information for
# those lame flamewars on r.g.v.a or the mailing list.
#
# HTML soon ?
# -------------------------------------------------------------------
# $Id: table.html,v 1.9 1997/11/16 18:14:42 nat Exp $                
# -------------------------------------------------------------------
This is not complete!!

Assume for the following code that n, c and z are the GPU flags, which we'll just give the type 'flag', an unsigned kind of integer (1 bit long). Rn will be of type 'slword', and is assumed to be 32 bits long. The instruction operations are given in "C-code" (hopefully correct). For people who don't know C:

   x & y       x bitwise and y
   x | y       x bitwise or y
   x ^ y       x bitwise exclusive or (eor) y
   x != y      x not equal to y   Result:  0 for equal, 1 for not equal
   x == y      x equal to y
   x << y      shift x left by y bits
   x >> y      shift x right by y bits
   x ? y : z   if x then y else z
   (lword) x   x is treated as an lword
the rest should be straightforward.
slword: signed 32 bit
lword:  unsigned 32 bit
flag:   unsigned 1 bit

flag     z, n, c;    /* three status flags            */
slword   Rn, Rm;     /* two general purpose registers */
slword   acc;        /* internal accumulator register */
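
   Every instruction below is drawn as one 16 bit word: a 6 bit opcode
   on top and two 5 bit fields underneath. A minimal sketch of pulling
   those fields apart, in case you want to write a disassembler. The
   names insn_fields, decode_insn, reg1 and reg2 are ours and purely
   hypothetical, they do not come from any Atari material.

```c
#include <stdint.h>

/* Hypothetical helper type: the three fields of one instruction word. */
typedef struct {
    unsigned opcode;  /* bits 15..10, the number in parentheses below   */
    unsigned reg1;    /* bits  9..5,  drawn as mmmmm, iiiii or ccccc    */
    unsigned reg2;    /* bits  4..0,  drawn as nnnnn                    */
} insn_fields;

/* Split a 16 bit instruction word into its fields. */
static insn_fields decode_insn(uint16_t word)
{
    insn_fields f;
    f.opcode = (word >> 10) & 0x3F;
    f.reg1   = (word >> 5) & 0x1F;
    f.reg2   = word & 0x1F;
    return f;
}
```

   For example the word 0x5803 comes apart as opcode 22 with the second
   field set to 3, which the table below lists as ABS on register 3.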

Instructions


ABS   Rn:  
~~~~~~~~~  
   16     10    5     0
   +------+-----+-----+
   |010110|00000|nnnnn|
   +------+-----+-----+
     (22)
   n  = 0
   c  = Rn < 0
   Rn = (lword) Rn > 0x80000000 ? -Rn : Rn;
   z  = Rn == 0

   Takes the absolute value of the 32 bit twos-complement value. There's
   a bug: 0x80000000 is not handled correctly and comes out unchanged.
   
   Examples:
      0xFFFFFFFF -> 0x00000001
      0x7FFFFFFF -> 0x7FFFFFFF
      0x80000000 -> 0x80000000  (special case)
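
   The same thing as compilable C, as a sketch only. It follows the
   pseudocode above line by line; the struct abs_result and the
   function name jag_abs are made up for this illustration.

```c
#include <stdint.h>

/* Hypothetical result type: new register value plus the three flags. */
typedef struct { uint32_t rn; int z, n, c; } abs_result;

static abs_result jag_abs(uint32_t rn)
{
    abs_result r;
    r.n  = 0;
    r.c  = (int) ((rn >> 31) & 1);              /* c = Rn < 0, before the change */
    r.rn = (rn > 0x80000000u) ? (0u - rn) : rn; /* 0x80000000 is left alone      */
    r.z  = (r.rn == 0);
    return r;
}
```

   Feeding it 0x80000000 gives back 0x80000000 with c set, which is
   the special case listed in the examples.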


ADD   Rm,Rn:
~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |000000|mmmmm|nnnnn|
   +------+-----+-----+
     (0)
   c   = (lword) (Rn + Rm) < (lword) Rn
   Rn += Rm
   z   = Rn == 0
   n   = Rn < 0

   Just adds both registers. 

   

ADDC  Rm,Rn:
~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |000001|mmmmm|nnnnn|
   +------+-----+-----+
      (1)

   tmp = Rn + Rm + c
   c   = (lword) tmp < (lword) Rn || (c && tmp == Rn)
   Rn  = tmp
   z   = Rn == 0
   n   = Rn < 0

   Just adds both registers plus the carry flag, which may be left
   over from a previous addition.

   Example:    
   ;  64 bit add:  R0/R1 LSL/MSL of x
   ;               R2/R3 LSL/MSL of y
   ; x += y
       add  r2,r0
       addc r3,r1
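
   What the add/addc pair computes, written out in C as a sketch. The
   function name add64_via_addc is hypothetical; the parameters are
   named after the registers in the example above.

```c
#include <stdint.h>

/* r0/r1 = low/high long of x, r2/r3 = low/high long of y; returns x + y. */
static uint64_t add64_via_addc(uint32_t r0, uint32_t r1,
                               uint32_t r2, uint32_t r3)
{
    uint32_t sum_lo = r0 + r2;            /* add  r2,r0                    */
    uint32_t carry  = sum_lo < r0;        /* c flag left over from the add */
    uint32_t sum_hi = r1 + r3 + carry;    /* addc r3,r1                    */
    return ((uint64_t) sum_hi << 32) | sum_lo;
}
```

   The carry out of the low long is the only thing that links the two
   halves, which is why the order add first, addc second matters.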




ADDQ  #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |000010|iiiii|nnnnn|
   +------+-----+-----+
      (2)
   c   = (lword) (immediate ? immediate : 32) + Rn < (lword) Rn
   Rn += immediate ? immediate : 32
   z   = Rn == 0
   n   = Rn < 0

   Add with immediate data value contained in the instruction. The 
   immediate value can be in the range from 1 to 32.



ADDQMOD  #immediate,Rn:    DSP ONLY
~~~~~~~~~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |111111|iiiii|nnnnn|
   +------+-----+-----+
     (63)
   c   = (lword) (immediate ? immediate : 32) + Rn < (lword) Rn
   Rn  = (Rn & MODULO) | ((Rn + (immediate ? immediate : 32)) & ~MODULO);
   z   = Rn == 0
   n   = Rn < 0

   Add with immediate data value contained in the instruction. The 
   immediate value can be in the range from 1 to 32. The bits that are
   set in the modulo register appear to stay as they were in Rn, only
   the cleared bits take the result of the addition. You can easily
   set up a circular buffer this way:

      movei    #D_MODULO,r10     ;; address of MODULO register
      movei    #buffer,r11       ;; address of our circular buffer
      movei    #buffer_len,r12   ;; size in bytes of our buffer
      subq     #1,r12            ;; prepare for register
      not      r12               
      store    r12,(r10)         ;; set it up

   loop:
      addqmod  #2,r11            ;; go round in circles
      jr       t,loop
      nop                        ;; delay slot, see JR
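
   A C sketch of the wrap-around, under two assumptions of ours: the
   buffer length is a power of two and the buffer starts on a multiple
   of its length, and the bits set in MODULO are the ones that are kept
   from the old pointer (that is what the NOT in the setup above
   suggests). The function name addqmod_model is hypothetical.

```c
#include <stdint.h>

/* rn: current pointer, imm: 1..32, modulo: contents of the MODULO register. */
static uint32_t addqmod_model(uint32_t rn, uint32_t imm, uint32_t modulo)
{
    uint32_t sum = rn + imm;
    /* Assumed: set mask bits keep the old pointer bits, the rest wrap. */
    return (rn & modulo) | (sum & ~modulo);
}
```

   With a 16 byte buffer the mask is ~(16 - 1) = 0xFFFFFFF0, so a
   pointer ending in 0xE steps back to the buffer start instead of
   running off the end.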




ADDQT #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |000011|iiiii|nnnnn|
   +------+-----+-----+
      (3)
   Rn += immediate ? immediate : 32

   Like ADDQ except that the status flags aren't
   affected.



AND   Rm,Rn:
~~~~~~~~~~~~
   16     10     5     0
   +------+-----+-----+
   |001001|mmmmm|nnnnn|
   +------+-----+-----+
      (9)
   Rn &= Rm
   z   = Rn == 0
   n   = Rn < 0

   Bitwise AND of the two registers. 

   Examples:
      0xAACC3355 & 0xFFFFFFFF -> 0xAACC3355
      0xAACC3355 & 0x00000000 -> 0x00000000
      0xAACC3355 & 0xFF00FF00 -> 0xAA003300

   

BCLR  #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |001111|iiiii|nnnnn|
   +------+-----+-----+
     (15)
   Rn &= ~(1UL << immediate)
   z   = Rn == 0
   n   = Rn < 0
   c   = ?

   Clears the specified bit in the designated register. Bits are numbered
   from least significant bit to most significant bit.

   Example:
      MOVEI #$FFFFFFFF,r0     ; R0: 0xFFFFFFFF
      BCLR  #0,r0             ; R0: 0xFFFFFFFE
      BCLR  #31,r0            ; R0: 0x7FFFFFFE



BSET  #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |001110|iiiii|nnnnn|
   +------+-----+-----+
     (14)
   Rn |= (1UL << immediate)
   z   = Rn == 0
   n   = Rn < 0
   c   = ?

   Sets the specified bit in the designated register. Bits are numbered
   from least significant bit to most significant bit.

   Example:
      SUB   r0,r0             ; R0: 0x00000000
      BSET  #0,r0             ; R0: 0x00000001
      BSET  #31,r0            ; R0: 0x80000001



BTST  #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |001101|iiiii|nnnnn|
   +------+-----+-----+
     (13)
   z  = ! (Rn & (1UL << immediate))
   n  = Rn < 0
   c  = ?

   Sets the status register flags according to the state of the specified bit 
   in the register. 



CMP   Rm,Rn:
~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |011110|mmmmm|nnnnn|
   +------+-----+-----+
     (30)
   c   = (lword) (Rn - Rm) > (lword) Rn
   z   = (Rn - Rm) == 0
   n   = (Rn - Rm) < 0

   Compares Rm to Rn. This is just the same as a subtraction
   except that the registers aren't modified.



CMPQ  #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |011111|iiiii|nnnnn|
   +------+-----+-----+
     (31)
   immediate is sign-extended!

   simm = immediate >= 16 ? immediate - 32 : immediate
   c   = (lword) (Rn - simm) > (lword) Rn
   z   = (Rn - simm) == 0
   n   = (Rn - simm) < 0

   Compares with an immediate value in the range of -16 to +15. You can compare
   with zero.
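
   The sign extension of the 5 bit field is easy to get wrong, so here
   is a small C sketch of it. The helper name sext5 is ours.

```c
#include <stdint.h>

/* Turn the 5 bit immediate field (0..31) into a value from -16 to +15. */
static int32_t sext5(uint32_t field)
{
    field &= 0x1F;
    return (field >= 16) ? (int32_t) field - 32 : (int32_t) field;
}
```

   Field values 0 to 15 stay as they are, 16 to 31 map to -16 to -1.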



DIV  Rm,Rn:
~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |010101|mmmmm|nnnnn|
   +------+-----+-----+
     (21)
   REMAIN = Rn - ((Rn / Rm) * Rm)
   Rn    /= Rm

   Divide two 32 bit registers. The remainder will be available in the
   REMAIN register. 

   

IMACN Rm,Rn:
~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |010100|mmmmm|nnnnn|
   +------+-----+-----+
     (20)
   ACC += ((Rm & 0xFFFF) | ((Rm & 0x8000) ? 0xFFFF0000 : 0)) *
          ((Rn & 0xFFFF) | ((Rn & 0x8000) ? 0xFFFF0000 : 0))
   z    = Rn == 0
   n    = Rn < 0

   Only the bottom 16 bits of both registers are used in the
   multiplication. This is a signed multiply and add. The result of the
   multiplication is added to an internal register, which is only
   accessible via RESMAC.



IMULT Rm,Rn:
~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |010001|mmmmm|nnnnn|
   +------+-----+-----+
     (17)

   Rn = ((Rm & 0xFFFF) | ((Rm & 0x8000) ? 0xFFFF0000 : 0)) *
        ((Rn & 0xFFFF) | ((Rn & 0x8000) ? 0xFFFF0000 : 0))
   z  = Rn == 0
   n  = Rn < 0

   Only the bottom 16 bits of both registers are used in the
   multiplication. This is a signed multiply.
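
   A C sketch of the signed 16 x 16 multiply, assuming the upper halves
   of both registers are simply ignored as described. The names sext16
   and imult_model are hypothetical.

```c
#include <stdint.h>

/* Sign-extend the bottom 16 bits of a register to 32 bits. */
static int32_t sext16(uint32_t x)
{
    return (int32_t) ((x & 0xFFFF) ^ 0x8000) - 0x8000;
}

/* Signed multiply of the bottom 16 bits of rm and rn. */
static int32_t imult_model(uint32_t rm, uint32_t rn)
{
    return sext16(rm) * sext16(rn);   /* at most 2^30, always fits */
}
```

   Note that 0xFFFF counts as -1 here, so 0xFFFF times 2 is -2 and not
   131070 as it would be with the unsigned MULT.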



IMULTN Rm,Rn:
~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |010010|mmmmm|nnnnn|
   +------+-----+-----+
     (18)

   ACC = ((Rm & 0xFFFF) | ((Rm & 0x8000) ? 0xFFFF0000 : 0)) *
         ((Rn & 0xFFFF) | ((Rn & 0x8000) ? 0xFFFF0000 : 0))
   z   = Rn == 0
   n   = Rn < 0

   Only the bottom 16 bits of both registers are used in the
   multiplication. This is a signed multiply. The result is stored in
   an internal register. (see RESMAC)



JR    cc,relative:
~~~~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |110101|ccccc|nnnnn|
   +------+-----+-----+
     (53)
   c0:   z == 0
   c1:   z == 1
   c2:   c4 ? (n == 0) : (c == 0)
   c3:   c4 ? (n == 1) : (c == 1)
   
   if( (! c0 || z == 0) &&
       (! c1 || z == 1) &&
       (! c2 || (c4 ? (n == 0) : (c == 0))) &&
       (! c3 || (c4 ? (n == 1) : (c == 1))))
   {
      PC = PC + 2 + (immediate >= 16 ? immediate - 32 : immediate) * 2;
   }

   Because of the pipelined architecture the CPU will execute the
   instruction following the jump instruction before the jump has an effect.
   Therefore:

      sub      r0,r0       ;; clear r0
      jr       t,foo
      addqt    #1,r0

foo:
      ;; r0 will be 1

   Branch relative to the current program counter. There are a few condition
   code patterns that are of more use than others, namely:

   %00000:   T    always

   %00100:   CC   carry clear (greater or equal, unsigned)
   %01000:   CS   carry set   (less than, unsigned)
   %00010:   EQ   zero set (equal)
   %00001:   NE   zero clear (not equal) 
   %11000:   MI   negative set
   %10100:   PL   negative clear

   %00101:   HI   greater than
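
   The condition test from the pseudocode above as a C function, a
   sketch only. The name cc_true is ours; cc is the 5 bit condition
   field and z, n, c are the flags as 0 or 1.

```c
#include <stdint.h>

/* Returns 1 if the jump would be taken for this condition field and flags. */
static int cc_true(unsigned cc, int z, int n, int c)
{
    int sel = (cc & 0x10) ? n : c;           /* c4 selects n instead of c  */
    if ((cc & 0x01) && z != 0)   return 0;   /* c0 wants z == 0            */
    if ((cc & 0x02) && z != 1)   return 0;   /* c1 wants z == 1            */
    if ((cc & 0x04) && sel != 0) return 0;   /* c2 wants n or c clear      */
    if ((cc & 0x08) && sel != 1) return 0;   /* c3 wants n or c set        */
    return 1;
}
```

   Every set bit adds one more requirement, so %00000 is always true
   and %00101 (HI) is just NE and CC at the same time.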

   


JUMP  cc,(Rn):
~~~~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |110100|ccccc|nnnnn|
   +------+-----+-----+
     (52)
   c0:   z == 0
   c1:   z == 1
   c2:   c4 ? (n == 0) : (c == 0)
   c3:   c4 ? (n == 1) : (c == 1)
   
   if( (! c0 || z == 0) &&
       (! c1 || z == 1) &&
       (! c2 || (c4 ? (n == 0) : (c == 0))) &&
       (! c3 || (c4 ? (n == 1) : (c == 1))))
   {
      PC = Rn
   }

   Jumps to the address contained in the Register.
   See JR for more details.
   
   Example:
   ;; endless loop
      movei    #routine,r0
routine:
      nop
      jump     t,(r0)
      nop




LOAD  (Rm),Rn:
~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |101001|mmmmm|nnnnn|
   +------+-----+-----+
     (41)
   if( Rm & 0x3)
      error( "bug");
   Rn = MEMORY[ Rm]

   Just fetches a long word from memory. The address should be long word
   aligned.



LOAD  (R14+m),Rn:
~~~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |101011|mmmmm|nnnnn|
   +------+-----+-----+
     (43)
   if( R14 + ((m ? m : 32) << 2) & 0x3)
      error( "bug");
   Rn = MEMORY[ R14 + ((m ? m : 32) << 2)]



LOAD  (R15+m),Rn:
~~~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |101100|mmmmm|nnnnn|
   +------+-----+-----+
     (44)
   if( R15 + ((m ? m : 32) << 2) & 0x3)
      error( "bug");
   Rn = MEMORY[ R15 + ((m ? m : 32) << 2)]



LOAD  (R14+Rm),Rn:
~~~~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |111010|mmmmm|nnnnn|
   +------+-----+-----+
     (58)
   if( R14 + Rm & 0x3)
      error( "bug");
   Rn = MEMORY[ R14 + Rm]



LOAD  (R15+Rm),Rn:
~~~~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |111011|mmmmm|nnnnn|
   +------+-----+-----+
     (59)

   if( R15 + Rm & 0x3)
      error( "bug");
   Rn = MEMORY[ R15 + Rm]



LOADB  (Rm),Rn:
~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |100111|mmmmm|nnnnn|
   +------+-----+-----+
     (39)
   if( Rm >= INTERNAL_RAM_START && Rm < INTERNAL_RAM_END)
      Rn = MEMORY[ Rm];
   else
      Rn = ((byte *) MEMORY)[ Rm];



LOADW  (Rm),Rn:
~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |101000|mmmmm|nnnnn|
   +------+-----+-----+
     (40)
   if( Rm >= INTERNAL_RAM_START && Rm < INTERNAL_RAM_END)
      Rn = MEMORY[ Rm];
   else
      Rn = ((word *) MEMORY)[ Rm];



LOADP  (Rm),Rn:      GPU ONLY
~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |101010|mmmmm|nnnnn|
   +------+-----+-----+
     (42)
   if( Rm >= INTERNAL_RAM_START && Rm < INTERNAL_RAM_END)
      Rn = MEMORY[ Rm];
   else
   {
      Rn     = MEMORY[ Rm];
      HIDATA = MEMORY[ Rm + 4];
   }



MIRROR    Rn:     DSP ONLY
~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |110000|00000|nnnnn|
   +------+-----+-----+
     (48)
     
   z  = Rn == 0
   n  = Rn < 0 
   c  = ?

   Just flips all the bits around. bit 0 goes to bit 31, bit 1 to bit 30 etc.
   I am too lazy to figure out the C code at the moment.

   Supposedly not only useful for simple graphic tricks, but also for
   doing FFT operations (butterfly addressing I believe).

   Example:
      movei    #$A0000010,r0     
      movei    #$08000005,r1
      mirror   r0
      sub      r1,r0             ; result will be zero
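
   For reference, a straightforward C sketch of the bit reversal
   described above. It is the slow loop version, not a guess at how the
   hardware does it, and the name mirror32 is ours.

```c
#include <stdint.h>

/* Reverse the order of all 32 bits: bit 0 <-> bit 31, bit 1 <-> bit 30 ... */
static uint32_t mirror32(uint32_t x)
{
    uint32_t r = 0;
    int i;
    for (i = 0; i < 32; i++) {
        r = (r << 1) | (x & 1);   /* lowest bit of x becomes next bit of r */
        x >>= 1;
    }
    return r;
}
```

   Mirroring twice gives the original value back, and a lone bit 0
   ends up as 0x80000000.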


MMULT  Rm,Rn:     GPU ONLY
~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |110110|mmmmm|nnnnn|
   +------+-----+-----+
     (54)

   Matrix multiplication



MOVE   Rm,Rn:
~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |100010|mmmmm|nnnnn|
   +------+-----+-----+
     (34)
   Rn = Rm



MOVE  PC,Rn:
~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |110011|00000|nnnnn|
   +------+-----+-----+
     (51)
   This supposedly takes prefetching and pipelining into account
   to give the 'true' PC for this instruction.

   Rn = PC



MOVEFA   Rm,Rn:
~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |100101|mmmmm|nnnnn|
   +------+-----+-----+
     (37)
   Rn = OTHERBANK[ Rm]



MOVEI   #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~~~
   16     10    5     0  16                  0  16                  0
   +------+-----+-----+  +-------------------+  +-------------------+
   |100110|00000|nnnnn|  |       LSW         |  |        MSW        |
   +------+-----+-----+  +-------------------+  +-------------------+
     (38)
   Rn = LSW + ((lword) MSW << 16)



MOVEQ   #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~~~
   16     10    5     0  
   +------+-----+-----+  
   |100011|iiiii|nnnnn|  
   +------+-----+-----+
     (35)              
   Rn = immediate    

   Immediate values can range from 0 to 31. To zero a register you can
   also use the ever popular SUB Rn,Rn or XOR Rn,Rn



MOVETA   Rm,Rn:
~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |100100|mmmmm|nnnnn|
   +------+-----+-----+
     (36)
   OTHERBANK[ Rn] = Rm



MTOI  Rm,Rn:
~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |110111|mmmmm|nnnnn|
   +------+-----+-----+
     (55)
   ???

   z  = Rn == 0
   n  = Rn < 0
   c  = ?



MULT  Rm,Rn:
~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |010000|mmmmm|nnnnn|
   +------+-----+-----+
     (16)
   Rn = (lword) (Rm & 0xFFFF) * (lword) (Rn & 0xFFFF)
   z  = Rn == 0
   n  = (Rn & 0x80000000) != 0



NEG   Rn:
~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |001000|00000|nnnnn|
   +------+-----+-----+
     (8)

   Rn = 0 - Rn
   z  = Rn == 0
   n  = Rn < 0



NOP:
~~~
   16     10    5     0
   +------+-----+-----+
   |111001|00000|00000|
   +------+-----+-----+
     (57)



NORMI Rm,Rn:
~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |111000|mmmmm|nnnnn|
   +------+-----+-----+
     (56)

   Rn = 0;
   for( i = 31; i >= 0; i--)
      if( Rm & (1UL << i))
      {
         Rn = i;
         break;
      }
   z  = Rn == 0
   n  = Rn < 0

   This works by returning a number whose value is the position of the 
   most significant bit. Apparently useful for handling 32bit IEEE
   FP numbers. (hmm)
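
   Going by that description only (which is a guess like the rest of
   this file), the operation would look like this in C. The function
   name msb_position is hypothetical.

```c
#include <stdint.h>

/* Position (0..31) of the highest set bit of rm, or 0 if rm is zero. */
static uint32_t msb_position(uint32_t rm)
{
    int i;
    for (i = 31; i >= 0; i--)
        if (rm & (1UL << i))
            return (uint32_t) i;
    return 0;   /* no bit set at all */
}
```

   Note that an input of 0 and an input of 1 both give 0 here, so the
   result alone cannot tell them apart.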



NOT   Rn:
~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |001100|00000|nnnnn|
   +------+-----+-----+
     (12)

   Rn = ~Rn
   z  = Rn == 0
   n  = Rn < 0



OR    Rm,Rn:
~~~~~~~~~~~~
   16     10     5     0
   +------+-----+-----+
   |001010|mmmmm|nnnnn|
   +------+-----+-----+
     (10)

   Rn |= Rm
   z   = Rn == 0
   n   = Rn < 0



PACK  Rn:      GPU ONLY
~~~~~~~~
   16     10     5     0
   +------+-----+-----+
   |111111|00000|nnnnn|
   +------+-----+-----+
     (63)   (0)

   Rn = ((Rn & 0x03C00000) >> 10) |
        ((Rn & 0x0001E000) >> 5)  |
        ((Rn & 0x000000FF))

   The idea behind this instruction and its companion UNPACK seems
   to be that you can do something like this:

         movei    #bitmap-2,r0            ; get a 256x256 Cry pixmap
         movei    #destination-2,r1       ; reduce it to 128x128

loop:
         addqt    #2,r0 
         loadw    (r0),r2                 ; fetch a pixel
         addqt    #2,r0
         loadw    (r0),r3                 ; and a second pixel
         unpack   r2                      ; get into "addable" form
         unpack   r3                      ; both
         add      r2,r3    
         shrq     #1,r3                   ; adjust back
         pack     r3                      ; back into CrY form
         addqt    #2,r1
         storew   r3,(r1)
         
   Also I am not quite sure what this will do to your colors. Probably
   this will work out nicely though.
   See UNPACK for some more details...
         


RESMAC  Rn:
~~~~~~~~~~~
   16     10     5     0
   +------+-----+-----+
   |010011|00000|nnnnn|
   +------+-----+-----+
     (19)   

   Rn = ACC



ROR   Rm,Rn:
~~~~~~~~~~~~
   16     10     5     0
   +------+-----+-----+
   |011100|mmmmm|nnnnn|
   +------+-----+-----+
     (28)   

   c  = (Rn & 0x80000000) != 0
   Rn = ((lword) Rn >> (Rm & 0x1F)) | ((lword) Rn << (32 - (Rm & 0x1F)))
   z  = Rn == 0
   n  = Rn < 0

   Since you can rotate to the left by using a rotation count of
   32 - left-rotates, there is no ROL or ROLQ opcode. Rotations are cheap,
   unlike on the 68000. 
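
   The "rotate left by rotating right" trick in C, as a sketch. The
   names ror32 and rol32 are ours; the count of zero is special-cased
   because shifting a 32 bit value by 32 is not defined in C.

```c
#include <stdint.h>

/* Rotate rn right by the low 5 bits of count. */
static uint32_t ror32(uint32_t rn, uint32_t count)
{
    count &= 0x1F;
    if (count == 0)
        return rn;                          /* avoid a C shift by 32 */
    return (rn >> count) | (rn << (32 - count));
}

/* Rotate left by n is the same as rotate right by 32 - n. */
static uint32_t rol32(uint32_t rn, uint32_t count)
{
    return ror32(rn, 32 - (count & 0x1F));
}
```

   So a left rotate by 4 is written as a right rotate by 28.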



RORQ   #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~~
   16     10     5     0
   +------+-----+-----+
   |011101|iiiii|nnnnn|
   +------+-----+-----+
     (29)   

   c  = (Rn & 0x80000000) != 0
   Rn = ((lword) Rn >> immediate) | ((lword) Rn << (32 - immediate))
   z  = Rn == 0
   n  = Rn < 0



SAT8  Rn:         GPU ONLY
~~~~~~~~~
   16     10     5     0
   +------+-----+-----+
   |100000|00000|nnnnn|
   +------+-----+-----+
     (32)   

   Rn = Rn < 0 ? 0 : (Rn > 0xFF ? 0xFF : Rn)
   z  = Rn == 0
   n  = 0
   c  = ?



SAT16  Rn:     GPU ONLY
~~~~~~~~~~
   16     10     5     0
   +------+-----+-----+
   |100001|00000|nnnnn|
   +------+-----+-----+
     (33)   

   Rn = (Rn < 0) ? 0 : (Rn > 0xFFFF ? 0xFFFF : Rn)
   z  = Rn == 0
   n  = 0
   c  = ?



SAT24  Rn:     GPU ONLY
~~~~~~~~~~
   16     10     5     0
   +------+-----+-----+
   |111110|00000|nnnnn|
   +------+-----+-----+
     (62)   

   Rn = (Rn < 0) ? 0 : (Rn > 0xFFFFFF ? 0xFFFFFF : Rn)
   z  = Rn == 0
   n  = 0
   c  = ?



SAT16S Rn:     DSP ONLY
~~~~~~~~~~
   16     10     5     0
   +------+-----+-----+
   |100001|00000|nnnnn|
   +------+-----+-----+
     (33)   

   Rn = (Rn < -0x7FFF) ? 0xFFFF8000 : (Rn > 0x7FFF ? 0x7FFF : Rn)
   z  = Rn == 0
   n  = 0
   c  = ?

   Force the register value to lie between -0x8000 and +0x7FFF
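
   The clamping spelled out in C, as a sketch; sat16s_model is a
   hypothetical name and only the register value is modelled, not the
   flags.

```c
#include <stdint.h>

/* Clamp a signed 32 bit value into the signed 16 bit range. */
static int32_t sat16s_model(int32_t rn)
{
    if (rn < -0x8000) return -0x8000;
    if (rn >  0x7FFF) return  0x7FFF;
    return rn;
}
```

   Values that already fit pass through untouched, everything else is
   pinned to the nearest end of the range.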



SAT32S Rn:     DSP ONLY
~~~~~~~~~~
   16     10     5     0
   +------+-----+-----+
   |101010|00000|nnnnn|
   +------+-----+-----+
     (42)   

   if( HIDATA != 0 && HIDATA != 0xFF)
   {
      if( HIDATA & 0x80)
         Rn = 0x80000000;
      else
         Rn = 0x7FFFFFFF;
   }

   z  = Rn == 0
   n  = 0
   c  = ?

   Use the lower 8 bits of the internal register HIDATA 
   together with Rn to form a 40 bit value. This is then saturated to a 32bit value.
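
   A C sketch of what saturating a 40 bit value to 32 bits would mean
   arithmetically. This is our reading of the description, not a
   verified model of the chip: unlike the pseudocode above it also
   looks at the sign bit of Rn. The name sat32s_model is hypothetical.

```c
#include <stdint.h>

/* hidata: only the lower 8 bits are used, rn: lower 32 bits of the value. */
static uint32_t sat32s_model(uint32_t hidata, uint32_t rn)
{
    uint32_t hi = hidata & 0xFF;
    /* The 40 bit value fits if the top 9 bits are all equal. */
    int fits = (hi == 0x00 && !(rn & 0x80000000u)) ||
               (hi == 0xFF &&  (rn & 0x80000000u));
    if (fits)
        return rn;
    return (hi & 0x80) ? 0x80000000u : 0x7FFFFFFFu;
}
```

   A negative 40 bit value that is too large pins to 0x80000000, a
   positive one to 0x7FFFFFFF.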



SH    Rm,Rn:
~~~~~~~~~~~~
   16     10     5     0
   +------+-----+-----+
   |010111|mmmmm|nnnnn|
   +------+-----+-----+
     (23)   

   c  = Rm < 0 ? ((Rn & 0x80000000) != 0) : ((Rn & 0x1) != 0)
   Rn = Rm < 0 ? ((lword) Rn << -Rm) : ((lword) Rn >> Rm)
   z  = Rn == 0
   n  = Rn < 0



SHA   Rm,Rn:
~~~~~~~~~~~~
   16     10     5     0
   +------+-----+-----+
   |011010|mmmmm|nnnnn|
   +------+-----+-----+
     (26)   

   c  = Rm < 0 ? ((Rn & 0x80000000) != 0) : ((Rn & 0x1) != 0)
   Rn = Rm < 0 ? (Rn << -Rm) : (Rn >> Rm)
   z  = Rn == 0
   n  = Rn < 0



SHARQ   #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~~~
   16     10     5     0
   +------+-----+-----+
   |011011|iiiii|nnnnn|
   +------+-----+-----+
     (27)   

   c  = (Rn & 0x1) != 0
   Rn = Rn >> (immediate ? immediate : 32)
   z  = Rn == 0
   n  = Rn < 0

   Arithmetic shift right. The highest bit propagates down as you would
   expect in an integer shift. I.e. 0x80000000 shifted down 15 times will
   result in 0xFFFF0000. Values from 1 to 32 are OK.



SHLQ  #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~
   16     10     5     0
   +------+-----+-----+
   |011000|iiiii|nnnnn|
   +------+-----+-----+
     (24)   

   c  = (Rn & 0x80000000) != 0
   Rn = Rn << (32-immediate)
   z  = Rn == 0
   n  = Rn < 0



SHRQ    #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~~~
   16     10     5     0
   +------+-----+-----+
   |011001|iiiii|nnnnn|
   +------+-----+-----+
     (25)   

   c  = (Rn & 0x1) != 0
   Rn = (lword) Rn >> (immediate ? immediate : 32)
   z  = Rn == 0
   n  = Rn < 0

   The logical shift right. 0x80000000 shifted right 16 times will
   yield 0x00008000.
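
   The difference between the logical and the arithmetic shift right,
   as a portable C sketch (C itself leaves >> on negative values up to
   the compiler, so the sign fill is done by hand). The names
   shrq_model and sharq_model are ours; a count of 0 is taken to mean
   32, like the immediate field.

```c
#include <stdint.h>

/* Logical shift right: zeros come in from the top. */
static uint32_t shrq_model(uint32_t rn, unsigned count)
{
    if (count == 0) count = 32;             /* a zero field means 32 */
    return (count >= 32) ? 0 : rn >> count;
}

/* Arithmetic shift right: copies of the sign bit come in from the top. */
static uint32_t sharq_model(uint32_t rn, unsigned count)
{
    uint32_t fill = (rn & 0x80000000u) ? 0xFFFFFFFFu : 0;
    if (count == 0) count = 32;
    if (count >= 32) return fill;
    return (rn >> count) | (fill << (32 - count));
}
```

   For positive values both give the same result, they only differ
   when the top bit is set.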



STORE  Rn,(Rm):
~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |101111|mmmmm|nnnnn|
   +------+-----+-----+
     (47)

   if( Rm & 0x3)
      error( "bug");
   MEMORY[ Rm] = Rn

   Store a register value into memory.



STORE  Rn,(R14+m):
~~~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |110001|mmmmm|nnnnn|
   +------+-----+-----+
      (49)
   if( R14 + ((m ? m : 32) << 2) & 0x3)
      error( "bug");
   MEMORY[ R14 + ((m ? m : 32) << 2)] = Rn



STORE  Rn,(R15+m):
~~~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |110010|mmmmm|nnnnn|
   +------+-----+-----+
     (50)

   if( R15 + ((m ? m : 32) << 2) & 0x3)
      error( "bug");
   MEMORY[ R15 + ((m ? m : 32) << 2)] = Rn



STORE  Rn,(R14+Rm):
~~~~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |111100|mmmmm|nnnnn|
   +------+-----+-----+
     (60)

   if( R14 + Rm & 0x3)
      error( "bug");
   MEMORY[ R14 + Rm] = Rn



STORE  Rn,(R15+Rm):
~~~~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |111101|mmmmm|nnnnn|
   +------+-----+-----+
     (61)

   if( R15 + Rm & 0x3)
      error( "bug");
   MEMORY[ R15 + Rm] = Rn



STOREB  Rn,(Rm):
~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |101101|mmmmm|nnnnn|
   +------+-----+-----+
     (45)

   if( Rm >= INTERNAL_RAM_START && Rm < INTERNAL_RAM_END)
      MEMORY[ Rm] = Rn
   else
      ((byte *) MEMORY)[ Rm] = Rn



STOREW  Rn,(Rm):
~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |101110|mmmmm|nnnnn|
   +------+-----+-----+
     (46)
   if( Rm >= INTERNAL_RAM_START && Rm < INTERNAL_RAM_END)
      MEMORY[ Rm] = Rn
   else
      ((word *) MEMORY)[ Rm] = Rn



STOREP  Rn,(Rm):        GPU ONLY
~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |110000|mmmmm|nnnnn|
   +------+-----+-----+
     (48)

   if( Rm >= INTERNAL_RAM_START && Rm < INTERNAL_RAM_END)
      MEMORY[ Rm] = Rn;
   else
   {
      MEMORY[ Rm] = Rn;
      MEMORY[ Rm + 4] = HIDATA;
   }

   Store 64 bit wholesale into memory. If you hit internal memory,
   this will be just a 32bit save. The upper 32 bits are solely
   put on the bus by the HIDATA register.



SUB   Rm,Rn:
~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |000100|mmmmm|nnnnn|
   +------+-----+-----+
      (4)

   c   = (lword) (Rn - Rm) > (lword) Rn
   Rn -= Rm
   z   = Rn == 0
   n   = Rn < 0



SUBC  Rm,Rn:
~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |000101|mmmmm|nnnnn|
   +------+-----+-----+
      (5)

   tmp = Rn - Rm - c
   c   = (lword) tmp > (lword) Rn || (c && tmp == Rn)
   Rn  = tmp
   z   = Rn == 0
   n   = Rn < 0

   Subtract with carry.
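
   Like ADDC this is mostly useful for 64 bit arithmetic. A C sketch of
   a sub/subc pair, assuming the carry flag acts as a borrow the same
   way it does for CMP. The function name sub64_via_subc is
   hypothetical; R0/R1 hold the low/high long of x and R2/R3 those of
   y.

```c
#include <stdint.h>

/* Returns x - y, computed the way a sub/subc pair would do it. */
static uint64_t sub64_via_subc(uint32_t r0, uint32_t r1,
                               uint32_t r2, uint32_t r3)
{
    uint32_t diff_lo = r0 - r2;             /* sub  r2,r0                */
    uint32_t borrow  = diff_lo > r0;        /* c flag: set when r2 > r0  */
    uint32_t diff_hi = r1 - r3 - borrow;    /* subc r3,r1                */
    return ((uint64_t) diff_hi << 32) | diff_lo;
}
```

   Subtracting 1 from 0x100000000 borrows from the high long and gives
   0xFFFFFFFF.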



SUBQ  #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |000110|iiiii|nnnnn|
   +------+-----+-----+
     (6)
   c   = (lword) (Rn - (immediate ? immediate : 32)) > (lword) Rn
   Rn -= immediate ? immediate : 32
   z   = Rn == 0
   n   = Rn < 0



SUBQMOD  #immediate,Rn:    DSP ONLY
~~~~~~~~~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |100000|iiiii|nnnnn|
   +------+-----+-----+
     (32)
   c   = (lword) (Rn - (immediate ? immediate : 32)) > (lword) Rn
   Rn  = (Rn & MODULO) | ((Rn - (immediate ? immediate : 32)) & ~MODULO);
   z   = Rn == 0
   n   = Rn < 0

   Subtraction with the modulo register. Convenient for circular buffers.



SUBQT #immediate,Rn:
~~~~~~~~~~~~~~~~~~~~
   16     10    5     0
   +------+-----+-----+
   |000111|iiiii|nnnnn|
   +------+-----+-----+
     (7)
   Rn -= immediate ? immediate : 32


   Like SUBQ except that the status flags aren't affected.



UNPACK  Rn:    GPU ONLY
~~~~~~~~~~~
   16     10     5     0
   +------+-----+-----+
   |111111|00001|nnnnn|
   +------+-----+-----+
     (63)   (1)

   Rn = ((Rn & 0x0000F000) << 10) |
        ((Rn & 0x00000F00) << 5)  |
        ((Rn & 0x000000FF))

   Unpacks CrY pixels.
      
 32       28        24        20       16       12        8        4        0
  +--------^---------^---------^--------^--------^--------^--------^--------+
  |                unused               | ColMSB | ColLSN |    luminance    |
  +-------------------------------------+--------+--------+-----------------+
   31................................16  15...12   11...8   7.............0 

  unpacks to

 32       28        24        20       16       12        8        4        0
  +--------^---+-----^----+----^-----+--^------+--^-------^--------^--------+
  | 0 0 0 0 0 0|  ColMSB  | 0 0 0 0 0| ColLSN  | 0 0 0 0 0|    luminance    |
  +------------+----------+----------+---------+----------+-----------------+
   31.......26   25....22   21....17   16...13   12.....8   7.............0 

   See PACK for some usage info for this instruction.
   Look into CrY for some general info about
   CrY pixels.
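
   The two bit shuffles and the averaging idea from the PACK entry as a
   C sketch. unpack_cry and pack_cry follow the masks given in this
   file; average_cry is a hypothetical helper of ours to show why the
   zero gaps in the unpacked form are handy.

```c
#include <stdint.h>

/* Spread the three CrY fields apart, leaving zero bits between them. */
static uint32_t unpack_cry(uint32_t rn)
{
    return ((rn & 0x0000F000u) << 10) |
           ((rn & 0x00000F00u) << 5)  |
            (rn & 0x000000FFu);
}

/* Squeeze the three fields back together into a 16 bit CrY pixel. */
static uint32_t pack_cry(uint32_t rn)
{
    return ((rn & 0x03C00000u) >> 10) |
           ((rn & 0x0001E000u) >> 5)  |
            (rn & 0x000000FFu);
}

/* Hypothetical helper: average two CrY pixels field by field. */
static uint32_t average_cry(uint32_t a, uint32_t b)
{
    uint32_t sum = unpack_cry(a) + unpack_cry(b);  /* carries land in the gaps */
    return pack_cry(sum >> 1);                     /* halve, then drop the gaps */
}
```

   Whether averaging the two colour nibbles separately gives a
   sensible colour is a different question; the sketch only shows that
   the fields do not spill into each other.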
   


XOR   Rm,Rn:
~~~~~~~~~~~~
   16     10     5     0
   +------+-----+-----+
   |001011|mmmmm|nnnnn|
   +------+-----+-----+
     (11)

   Rn ^= Rm
   z   = Rn == 0
   n   = Rn < 0



Nat! (nat@zumdick.ruhr.de)
Klaus (kp@eegholm.dk)
