Computer Architecture Lab/Winter2006/PolzerJahn/Assembler

The SISP Assembler

The assembler for the SIS Processor is written in Python and uses the SPARK framework by John Aycock.

Assembly is done in four stages:

  1. Lexical analysis (scanning): checks whether the input format is valid and breaks the input stream into a list of tokens
  2. Syntax analysis (parsing): checks whether the list of tokens has valid syntax according to a grammar
  3. Preprocessing: determines the address of each label and replaces label references (targets) with those addresses
  4. Code generation: generates bytecode
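
The scanning stage can be illustrated with a small sketch. It does not use SPARK and is not the assembler's actual scanner: the regular expressions, the `scan_line` helper, and the one-statement-per-line rule for telling an OP from a TGT are all assumptions made for illustration. Only the token names are taken from the grammar on this page.

```python
import re

# Token names follow the terminals of the grammar; the patterns are assumed.
TOKEN_SPEC = [
    ("LBL", r"[A-Za-z_][A-Za-z0-9_]*:"),    # label definition, e.g. MAIN_LOOP:
    ("CMD", r"\.[A-Za-z]+"),                # command, e.g. .WORD
    ("REG", r"[Rr](?:1[0-5]|[0-9])\b"),     # register R0 to R15
    ("NUM", r"0[xX][0-9A-Fa-f]+|[0-9]+"),   # hexadecimal or decimal number
    ("SEP", r","),                          # operand separator
    ("WORD", r"[A-Za-z_][A-Za-z0-9_]*"),    # mnemonic or label reference
    ("SKIP", r"\s+"),                       # whitespace is ignored
    ("ERR", r"."),                          # anything else is invalid input
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def scan_line(line):
    """Break one source line into (kind, text) tokens, upper-casing the text."""
    tokens = []
    seen_op = False
    for match in MASTER.finditer(line):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "ERR":
            raise ValueError(f"unexpected character {text!r}")
        if kind == "WORD":
            # Assumption: the first bare word on a line is the mnemonic,
            # any later bare word is a label reference (target).
            kind = "TGT" if seen_op else "OP"
            seen_op = True
        tokens.append((kind, text.upper()))
    return tokens
```

Upper-casing every token is one simple way to obtain case-insensitive input. For example, `scan_line("JZ loop")` yields an OP token followed by a TGT token.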

Usage

To assemble the file test.asm, use the following command:

   python sispasm.py < test.asm > test.bin

Note: The assembler does not support comments or macros. If your source uses either of them (in C style), run it through the C preprocessor with the command

   gcc -x c -E -P test.asm > test.asm.ao

before invoking the assembler. Since version 0.4, the archive includes a simple shell script that runs both steps in sequence.
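
For readers who prefer to drive both steps from Python, the following is a minimal sketch of the same idea. It is not the shell script shipped in the archive. The function names, the `interpreter` and `assembler` parameters, and the choice to pipe the preprocessed text directly into the assembler's standard input are assumptions.

```python
import subprocess


def preprocess_command(source):
    """Return the gcc invocation that strips C-style comments and expands macros."""
    return ["gcc", "-x", "c", "-E", "-P", source]


def assemble(source, output, interpreter="python", assembler="sispasm.py"):
    """Preprocess `source` with gcc, then feed the result to the assembler."""
    # Step 1: run the C preprocessor and capture its output in memory.
    preprocessed = subprocess.run(
        preprocess_command(source), check=True, capture_output=True
    )
    # Step 2: pass the preprocessed text on stdin and write the binary to `output`,
    # mirroring "python sispasm.py < input > output".
    with open(output, "wb") as out:
        subprocess.run(
            [interpreter, assembler],
            input=preprocessed.stdout,
            stdout=out,
            check=True,
        )
```

Calling `assemble("test.asm", "test.bin")` would then replace the two manual commands, provided gcc and the assembler script are available in the current directory and on the search path.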

Syntax

The parsing stage checks the token list against the following grammar:

   program ::= program instruction
   program ::= program label
   program ::= program command
   instruction ::= OP
   instruction ::= OP REG
   instruction ::= OP REG SEP REG
   instruction ::= OP REG SEP NUM
   instruction ::= OP NUM
   instruction ::= OP TGT
   label ::= LBL
   command ::= CMD
   command ::= CMD NUM

The assembler is case-insensitive. There are three types of input: instructions, labels, and commands.
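
As an illustration of what the grammar accepts, the sketch below checks a single statement's token kinds against the productions listed above. It uses a plain lookup table rather than a SPARK parser, and the names `PRODUCTIONS` and `classify` are hypothetical.

```python
# Right-hand sides copied from the grammar above, grouped by nonterminal.
PRODUCTIONS = {
    "instruction": {
        ("OP",),
        ("OP", "REG"),
        ("OP", "REG", "SEP", "REG"),
        ("OP", "REG", "SEP", "NUM"),
        ("OP", "NUM"),
        ("OP", "TGT"),
    },
    "label": {("LBL",)},
    "command": {("CMD",), ("CMD", "NUM")},
}


def classify(kinds):
    """Return the nonterminal whose production matches the token kinds."""
    kinds = tuple(kinds)
    for nonterminal, forms in PRODUCTIONS.items():
        if kinds in forms:
            return nonterminal
    raise SyntaxError(f"no production matches: {' '.join(kinds)}")
```

A statement such as `ADD r1, r2` corresponds to the kinds `OP REG SEP REG` and is classified as an instruction, while a sequence such as `OP SEP` matches no production and is rejected.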

Instructions

OP [REG [, REG | IMM]]

Instructions are the operations executed by the processor. An instruction takes zero, one, or two operands, separated by a comma. The first operand serves as both source and destination register. The second operand can be another source register or an immediate value. For the list of available instructions, refer to the instruction set. Some example instructions:

   ADD r1, r2
   LDC r2, text
   LD  r1, r1
   INP r4, 2
   AND r4, r5
   JZ  loop

Labels

LABELNAME:

Labels can be placed anywhere in the program to name an address. They can then be used in instructions (e.g. JMP, LDI) in place of hard-coded addresses or values. During the preprocessing stage, the assembler replaces every occurrence of a label with its assigned address. Label names may consist of letters, digits, and underscores, but must start with a letter or an underscore. Some examples:

   TEXTLENGTH:
   MAIN_LOOP:
   ELSE_2:
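
The label replacement described above is commonly done in two passes, which the following sketch illustrates. It works on plain text lines rather than on the assembler's token list, and it assumes that every instruction or command occupies exactly one address unit and that each label stands on its own line. Both are simplifications, and `resolve_labels` is a hypothetical helper.

```python
import re

# Identifier rule from the text: letters, digits, underscores, no leading digit.
IDENT = re.compile(r"\b[A-Za-z_][A-Za-z0-9_]*\b")


def resolve_labels(lines):
    """Return (statements with labels replaced by addresses, label table)."""
    labels = {}
    body = []
    address = 0
    # Pass 1: record the address of every label definition.
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):
            name = line[:-1].upper()  # labels are case-insensitive
            if not IDENT.fullmatch(name):
                raise ValueError(f"invalid label name: {name!r}")
            if name in labels:
                raise ValueError(f"label defined twice: {name}")
            labels[name] = address
        else:
            body.append(line)
            address += 1  # assumption: one address unit per statement
    # Pass 2: replace label references in the operands, never in the mnemonic.
    resolved = []
    for line in body:
        mnemonic, _, operands = line.partition(" ")
        operands = IDENT.sub(
            lambda m: str(labels.get(m.group().upper(), m.group())), operands
        )
        resolved.append(f"{mnemonic} {operands}".strip())
    return resolved, labels
```

Two passes are needed because a jump may refer to a label that is defined later in the program, so all addresses must be known before any reference is replaced.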

Commands

.COMMANDNAME [PARAM]

The assembler currently processes three commands:

  • .WORD <VALUE>

Inserts a 16-bit value at the position of this command. Values can be written in decimal or hexadecimal number format.

  • .DWORD <VALUE>

Inserts a 32-bit value at the position of this command. Values can be written in decimal or hexadecimal number format.

  • .END

Marks the end of the program. Commands and instructions after this command are not processed.

An example of how commands might be used:

   // The string "SISP!!" followed by LF and CR, one character per 16-bit word
   text: 
       .word 0x53
       .word 0x49
       .word 0x53
       .word 0x50
       .word 0x21
       .word 0x21
       .word 0x0A
       .word 0x0D
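
To make the effect of .WORD and .DWORD concrete, the sketch below turns a command into raw bytes with Python's `struct` module. The byte order of the real assembler's output is not stated on this page, so the big-endian default is an assumption, as are the helper names `parse_number` and `emit_command`.

```python
import struct


def parse_number(text):
    """Parse a decimal or 0x-prefixed hexadecimal literal."""
    text = text.strip().lower()
    return int(text, 16) if text.startswith("0x") else int(text, 10)


def emit_command(name, value_text=None, byteorder=">"):
    """Return the bytes for a command, or None for .END.

    byteorder is a struct prefix: ">" is big-endian (assumed), "<" little-endian.
    """
    name = name.upper()  # commands are case-insensitive
    if name == ".END":
        return None
    if name not in (".WORD", ".DWORD"):
        raise ValueError(f"unknown command: {name}")
    value = parse_number(value_text)
    limit = 0xFFFF if name == ".WORD" else 0xFFFFFFFF
    if not 0 <= value <= limit:
        raise ValueError(f"value out of range for {name}: {value}")
    # "H" is an unsigned 16-bit field, "I" an unsigned 32-bit field.
    return struct.pack(byteorder + ("H" if name == ".WORD" else "I"), value)
```

Under these assumptions, `emit_command(".word", "0x53")` produces the two bytes 0x00 0x53, so each character of the example string above takes up one full 16-bit word.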

Registers

SISP provides 16 registers, each 16 bits wide, named R0 to R15. When our stack macros are used, R15 is reserved for the stack pointer.
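
A register operand can be validated and converted to its number as in the following sketch. The helper `register_number` and the constant `STACK_POINTER` are hypothetical names. How the number is packed into an instruction word is not covered here.

```python
import re

# Matches R0 to R15 in either case, and nothing else.
REGISTER = re.compile(r"R(1[0-5]|[0-9])", re.IGNORECASE)

# R15 is reserved for the stack pointer when the stack macros are used.
STACK_POINTER = 15


def register_number(name):
    """Return the register index (0 to 15) for a name such as 'r4' or 'R15'."""
    match = REGISTER.fullmatch(name.strip())
    if match is None:
        raise ValueError(f"not a valid register: {name!r}")
    return int(match.group(1))
```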

Download

You can download the current version of our assembler as an archive here.

Version history

0.7: 18-01-2007

  • Fixed a bug that caused commands to have no effect

0.6: 13-01-2007

  • Fixed an error in the grammar that caused very long execution times
  • Made the script independent of the caller's current working directory
  • Made the assembler case-insensitive
  • Allowed the underscore character in labels
  • Added instructions (JMPR, LDIP)

0.5: 14-12-2006

  • Fixed enumeration error

0.4: 2-11-2006

  • Added operations ADC, SBB, NEG, ASL, ASR
  • Changed JE, JNE to JZ, JNZ
  • Added simple shell script for assembling
  • Fixed minor bugs

0.3: 27-10-2006

  • Labels can now be used as constants in LDC
  • Changed operation numbers
  • Added I/O operation type
  • Added check for multiple occurrences of the same label

0.2: 26-10-2006

  • Added support for hexadecimal and octal numbers
  • Added instruction format to the optype description (makes the assembler more general and flexible)

0.1: initial version