Computer Architecture Lab/WS2007/OhHaHeTa/ThreeISAs

Intel i960

The i960 (a.k.a. 80960) architecture originates from a joint venture of Intel and Siemens (1989-1990), named BiiN, to design a fast, fault-tolerant, object-oriented computer system. Because no market could be found for it, the project was abandoned. Later Intel built various versions of the CPU for (high-end) embedded applications, which were quite successful until about 2000.

The i960 is a RISC-based design with some special extensions not found in many other embedded processors (see #Other below). The later embedded versions don't share all of the instructions implemented in the original BiiN processor (i960MX). Some don't have an MMU or FPU, or they lack several instructions (e.g. atmod), depending on the target market.

Registers and Memory

32b/4GB physical address space
32b/4GB virtual address space for each process, which can be partitioned further into domains with enforced object-based protection.

Data in memory and in registers is stored in little-endian order (least significant byte at the base address of an object/register), although there are big-endian versions as well (i960CA and CF).
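As a small illustration of the byte order described above, the following Python sketch packs one 32-bit value both ways. It only uses the standard struct module and does not model any i960 specifics; the example value is arbitrary.

```python
import struct

value = 0x12345678

# "<I" packs a 32-bit unsigned integer least significant byte first
# (the little-endian layout described in the text).
little = struct.pack("<I", value)

# ">I" packs the same value most significant byte first (big endian).
big = struct.pack(">I", value)

print(little.hex())  # 78563412
print(big.hex())     # 12345678
```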

There are 16 general-purpose 32b registers (g0...g15, with g15 reserved for the frame pointer), four 80b floating-point registers (fp0...fp3) and three 32b control registers (arithmetic (condition code etc.), process and trace). The processor provides a fresh set of 16 local registers (r0...r15) after every call, if possible without spilling the old register values to main memory. This method of increasing performance is known as register windowing. The number of register sets depends on the implementation; the i960MX processor has 4.
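The behaviour of the local register sets can be sketched with a small Python model. This is only an illustration under simple assumptions: the class name RegisterCache, the oldest-first spill policy and the spill/fill counters are hypothetical and not taken from the reference manual.

```python
class RegisterCache:
    """Toy model of on-chip local register sets (r0...r15 per frame)."""

    def __init__(self, num_sets=4):
        self.num_sets = num_sets      # 4 on the i960MX according to the text
        self.on_chip = [[0] * 16]     # frames held in register sets, oldest first
        self.memory = []              # frames spilled to the stack, oldest first
        self.spills = 0
        self.fills = 0

    @property
    def locals(self):
        """Local registers of the currently running procedure."""
        return self.on_chip[-1]

    def call(self):
        # If every register set is in use, spill the oldest frame (assumed policy).
        if len(self.on_chip) == self.num_sets:
            self.memory.append(self.on_chip.pop(0))
            self.spills += 1
        self.on_chip.append([0] * 16)  # fresh local registers for the callee

    def ret(self):
        self.on_chip.pop()
        if not self.on_chip:
            if not self.memory:
                raise RuntimeError("return from outermost frame")
            # The caller's frame was spilled earlier; reload it from memory.
            self.on_chip.append(self.memory.pop())
            self.fills += 1


cache = RegisterCache(num_sets=4)
cache.locals[0] = 1
for _ in range(3):
    cache.call()          # three nested calls fit into the four sets
print(cache.spills)       # 0
cache.call()              # a fourth nested call forces one spill
print(cache.spills)       # 1
```

As long as the call depth stays within the number of register sets, calls and returns in this model never touch memory, which is the performance benefit the text refers to.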

Data Types

signed and unsigned integer (8, 16, 32, 64 bits)
real (32, 64, 80 bits; if FPU present)
ASCII encoded decimal digits(!)
bits, bit-strings (consecutive bits) (in a register only)
byte strings (contiguous sequence of bytes (in memory only))
triple and quadwords (96 and 128 bits)
literals (0-31: 5b; +0.0, +1.0 in FPU instructions)

Register Addressing

Mode	Description	ASM syntax	Comment
Absolute	offset	exp
Register Indirect	abase	(reg)
Reg. Ind. w/ offset	abase + offset	exp(reg)
Reg. Ind. w/ index and displacement	abase + (index * scale) + disp.	exp(reg)[reg * scale]	scale: 1, 2, 4, 8, 16
Index w/ displacement	(index * scale) + displacement	exp[reg * scale]
IP with displacement	IP + displacement + 8	exp(IP)	used for IP-relative addresses
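The address computations in the table can be expressed as one function. The following Python sketch is an illustration only; the function name and its keyword parameters are hypothetical, and addresses are assumed to wrap around at 32 bits.

```python
MASK32 = 0xFFFFFFFF
VALID_SCALES = (1, 2, 4, 8, 16)


def effective_address(abase=0, index=0, scale=1, disp=0, ip=None):
    """Compute an effective address following the table above.

    Every mode is a special case of abase + index * scale + disp;
    passing ip selects the IP-relative mode (IP + displacement + 8).
    """
    if scale not in VALID_SCALES:
        raise ValueError("scale must be 1, 2, 4, 8 or 16")
    if ip is not None:
        return (ip + disp + 8) & MASK32
    return (abase + index * scale + disp) & MASK32


# Absolute: only an offset
print(hex(effective_address(disp=0x1000)))                                   # 0x1000
# Register indirect with offset
print(hex(effective_address(abase=0x2000, disp=8)))                          # 0x2008
# Register indirect with index and displacement
print(hex(effective_address(abase=0x2000, index=3, scale=4, disp=0x10)))     # 0x201c
# IP with displacement
print(hex(effective_address(ip=0x100, disp=0x20)))                           # 0x128
```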

Other Integral Parts/Features

instruction cache (512B in BiiN/i960MX)

Pipelining is aided by:

Register Scoreboarding
write buffering

Instruction Set

There are four instruction formats (REG, COBR, CTRL and MEM); the MEM format exists in two variants (MEMA and MEMB). All instructions are word aligned (on 32b/4B boundaries), and all except the second MEM variant (MEMB) are 4B long.

As the diagram shows, all instructions can be easily distinguished by their opcode, located in the highest byte, except for the REG instructions. These make up the majority of i960 instructions, which is why they need a second byte to be differentiated. They use only values from registers or literals (the corresponding m1/m2 bit is then set) as operands. Instructions in the COBR format are primarily the compare-and-branch instructions. Source_1 can be a literal or a register; source_2 is a register. When the branch is taken, the displacement is used to jump to IP + 4*displ. CTRL instructions comprise the branches for which only a target address is needed. MEMA and MEMB operations are distinguished by bit 12, where 0 encodes the MEMA format. They compute memory addresses and cover load, store and lda as well as some other instructions.
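Two of the decoding rules above can be sketched in Python. This is an illustration under stated assumptions: the MEMA/MEMB selector is taken to be bit 12 of the instruction word (counting from bit 0 as the least significant bit), the displacement is treated as an already sign-extended word count, and both function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def mem_variant(word):
    """Return the MEM variant of a 32-bit instruction word (0 encodes MEMA)."""
    return "MEMB" if (word >> 12) & 1 else "MEMA"


def cobr_target(ip, displacement):
    """Target of a taken compare-and-branch: IP + 4 * displacement."""
    return (ip + 4 * displacement) & MASK32


print(mem_variant(0x00000000))        # MEMA
print(mem_variant(0x00001000))        # MEMB
print(hex(cobr_target(0x1000, 3)))    # 0x100c
print(hex(cobr_target(0x1000, -2)))   # 0xff8
```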

Data Movement

loading and storing bytes, shorts, (double-, triple-, quad-)words from/to memory with automatic sign and zero extension. Certain register alignment rules need to be followed. Real numbers need to be transferred to integer registers before loading/storing them.
moving (double-, triple-, quad-)words between registers
special commands for all the above operations to be used with virtual addressing.
lda to load big constants immediately or from memory.

Arithmetic

add, subtract, multiply, divide, remainder with signed and unsigned integers.
add, subtract w/ carry with unsigned integers.
extended multiply and divide with unsigned longs
modulo with signed integer
shift left, right with signed and unsigned integers
rotate left with unsigned
shift right dividing integer (equivalent to division by a power of two, even for negative values)
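The difference between a plain arithmetic shift right and the "shift right dividing integer" operation, and between remainder and modulo, only shows up for negative operands. The Python sketch below illustrates it. The function names mirror the i960 mnemonics for readability, but the argument order (value first, then shift count or divisor) is chosen for the example, and the sign rules for remainder and modulo are stated as assumptions: remainder takes the sign of the dividend, modulo the sign of the divisor.

```python
def shri(value, count):
    """Plain arithmetic shift right: rounds toward negative infinity."""
    return value >> count


def shrdi(value, count):
    """Shift right dividing integer: same result as truncating division by 2**count."""
    if value < 0:
        return -((-value) >> count)
    return value >> count


def remi(dividend, divisor):
    """Remainder of truncating division (assumed: sign follows the dividend)."""
    quotient = abs(dividend) // abs(divisor)
    if (dividend < 0) != (divisor < 0):
        quotient = -quotient
    return dividend - quotient * divisor


def modi(dividend, divisor):
    """Modulo (assumed: a nonzero result takes the sign of the divisor)."""
    return dividend % divisor


print(shri(-7, 1), shrdi(-7, 1))   # -4 -3
print(remi(-7, 3), modi(-7, 3))    # -1 2
```

For non-negative values the two shifts agree, as do remainder and modulo when both operands have the same sign.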

Logical

and	A and B
notand	!A and B
xor	!(A==B)
or	A or B
nor	!(A or B)
xnor	A==B
not	!A
notor	!A or B
ornot	A or !B
nand	!(A and B)
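The table can be turned directly into 32-bit bitwise operations. In this Python sketch, A and B are used exactly as in the table (not in i960 operand order), and the dictionary and helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

# One entry per row of the table; each works bitwise on 32-bit values.
LOGICAL_OPS = {
    "and":    lambda a, b: a & b,
    "notand": lambda a, b: ~a & b,
    "xor":    lambda a, b: a ^ b,
    "or":     lambda a, b: a | b,
    "nor":    lambda a, b: ~(a | b),
    "xnor":   lambda a, b: ~(a ^ b),
    "not":    lambda a, b: ~a,          # B is ignored
    "notor":  lambda a, b: ~a | b,
    "ornot":  lambda a, b: a | ~b,
    "nand":   lambda a, b: ~(a & b),
}


def logical(op, a, b=0):
    """Apply one of the logical operations and truncate to 32 bits."""
    return LOGICAL_OPS[op](a, b) & MASK32


print(hex(logical("notand", 0x0F, 0xFF)))              # 0xf0
print(hex(logical("nand", 0xF0F0F0F0, 0xFF00FF00)))    # 0xfff0fff
print(hex(logical("xnor", 0x1234, 0x1234)))            # 0xffffffff
```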

Bit, Bit Field and Byte String

set, clear, toggle(notbit) a bit
chkbit sets the condition code according to the bit
alterbit sets the bit according to the condition code
find the most significant set or clear bit
extract converts a bit field into an unsigned integer (== shift + zero fill)
modify copies the masked contents of one register into another
movstr moves a byte string in memory (in a fast mode or in a non-overwriting mode for the case that the locations overlap).
fill copies an ordinal repeatedly into a byte string.
cmpstr checks if two strings are equal
scanbyte checks if any two corresponding bytes are equal.
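Several of the bit and bit-field operations listed above are easy to state as expressions on 32-bit values. The following Python sketch is an illustration only: the function names mirror the mnemonics where possible, the argument order is chosen for readability rather than taken from the manual, and condition-code effects are modelled as plain return values.

```python
MASK32 = 0xFFFFFFFF


def chkbit(value, bitpos):
    """Return the selected bit (stands in for setting the condition code)."""
    return (value >> bitpos) & 1


def most_significant_set_bit(value):
    """Index of the highest set bit, or None if no bit is set."""
    return value.bit_length() - 1 if value else None


def extract(value, bitpos, length):
    """Bit field as an unsigned integer: shift right, then zero fill."""
    return (value >> bitpos) & ((1 << length) - 1)


def modify(mask, src, dst):
    """Copy the bits of src selected by mask into dst."""
    return (src & mask) | (dst & ~mask & MASK32)


def scanbyte(a, b):
    """True if any of the four corresponding bytes of a and b are equal."""
    return any(((a >> s) & 0xFF) == ((b >> s) & 0xFF) for s in (0, 8, 16, 24))


print(hex(extract(0xABCD1234, 8, 8)))                     # 0x12
print(hex(modify(0x0000FF00, 0x12345678, 0xAAAAAAAA)))    # 0xaaaa56aa
print(scanbyte(0x11223344, 0x55663377))                   # True
```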

Comparison

cmpi, cmpo compare two signed or unsigned integers
concmpi, concmpo are similar to the instructions above, but check the condition code before comparing. They can be used to optimize two-sided checks (A >= x >= B).
compare and in/decrement instructions are designed for checks in loops.
test instructions match the condition code against one of several masks (see #conditional branches) and store 0 or 1 in the destination register.
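The two-sided check mentioned for concmpi/concmpo can be sketched in Python. The details below are assumptions made for the illustration and are not stated in the text: the 3-bit condition code uses 100 for less, 010 for equal and 001 for greater, and the conditional compare only runs when the "less" bit is clear, in which case it yields "equal" for src1 <= src2 and "greater" otherwise.

```python
LESS, EQUAL, GREATER = 0b100, 0b010, 0b001   # assumed condition-code encoding


def cmpo(src1, src2):
    """Unconditional compare of two unsigned values."""
    if src1 < src2:
        return LESS
    if src1 == src2:
        return EQUAL
    return GREATER


def concmpo(cc, src1, src2):
    """Conditional compare: keep cc if its 'less' bit is already set (assumed)."""
    if cc & LESS:
        return cc
    return EQUAL if src1 <= src2 else GREATER


def in_range(low, x, high):
    """Check low <= x <= high with one compare and one conditional compare."""
    cc = cmpo(high, x)          # LESS means x is above the upper bound
    cc = concmpo(cc, low, x)    # only evaluated if the upper bound held
    return cc == EQUAL


print(in_range(10, 15, 20))   # True
print(in_range(10, 25, 20))   # False
print(in_range(10, 5, 20))    # False
```

In this model a single final condition code distinguishes all three cases (in range, above, below), so one conditional branch after the pair of compares is enough.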

Branches

Most of the branch instructions specify the target IP with a signed displacement to be added to the current IP. Extended branch instructions specify a memory address which contains the target IP using one of the addressing modes described above.

unconditional branches

b, bx jump to the specified IP
bal, balx "branch and link", used for an alternative implementation of procedure calls.

conditional branches

These instructions test the condition code and jump iff it "matches" according to the instruction. The following matches are possible: [not] equal, less [or equal], greater [or equal]
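The "matching" of the condition code can be modelled as a bitwise AND with a per-instruction mask. The Python sketch below is an illustration under assumptions not stated in the text: the condition code is encoded as 100 for less, 010 for equal and 001 for greater, and the mnemonics and mask values in the table are given from memory of the i960 instruction set rather than from the source.

```python
LESS, EQUAL, GREATER = 0b100, 0b010, 0b001   # assumed condition-code encoding

# Assumed mask per conditional branch: a branch is taken if any masked bit is set.
BRANCH_MASKS = {
    "be":  0b010,   # equal
    "bne": 0b101,   # not equal (less or greater)
    "bl":  0b100,   # less
    "ble": 0b110,   # less or equal
    "bg":  0b001,   # greater
    "bge": 0b011,   # greater or equal
}


def branch_taken(mnemonic, cc):
    """True if the condition code matches the mask of the given branch."""
    return (BRANCH_MASKS[mnemonic] & cc) != 0


print(branch_taken("ble", EQUAL))     # True
print(branch_taken("bne", EQUAL))     # False
print(branch_taken("bge", GREATER))   # True
```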

compare and branch

These instructions compare two (un)signed integers and branch only if they "match" (same matches as above).
bbs, bbc check bit and branch if it is set/clear.

Call/Return

Calls and returns automatically save and restore the local registers and set up the stack frames.

FPU

Typical floating-point operations such as add, subtract, multiply, divide, remainder, square root, comparison and conversion to/from integer and between different floating-point types are supported in some incarnations of the i960.

Other

Debugging

To aid debugging, exceptions can be triggered with explicit commands (mark, fmark) or by enabling generic trace faults per process, which fire, for example, when a return instruction is executed. The latter are controlled through the trace-control word, which can be modified with modtc. Another way to trigger faults is to use conditional fault instructions such as "fault if equal".

Atomic

There are three instructions to do an atomic read-modify-write operation:

atadd: adds the operand to the value in memory
atmod: replaces the value at the destination with the source value where the mask bits are set.
atrep: like atmod but without a mask.
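The effect of the three read-modify-write operations can be sketched in Python. This is only a software model: a threading.Lock stands in for the hardware's atomicity, the Memory class and its load/store helpers are hypothetical, and returning the old memory value from each operation is an assumption made for the example.

```python
import threading

MASK32 = 0xFFFFFFFF


class Memory:
    """Word-addressed toy memory with atomic read-modify-write helpers."""

    def __init__(self):
        self._words = {}
        self._lock = threading.Lock()   # models the atomic bus access

    def load(self, addr):
        return self._words.get(addr, 0)

    def store(self, addr, value):
        self._words[addr] = value & MASK32

    def atadd(self, addr, value):
        """Atomically add value to the word at addr; return the old word."""
        with self._lock:
            old = self.load(addr)
            self.store(addr, old + value)
            return old

    def atmod(self, addr, mask, src):
        """Atomically replace the bits selected by mask with those of src."""
        with self._lock:
            old = self.load(addr)
            self.store(addr, (src & mask) | (old & ~mask))
            return old

    def atrep(self, addr, src):
        """Atomically replace the whole word (atmod without a mask)."""
        return self.atmod(addr, MASK32, src)


mem = Memory()
mem.store(0x100, 0xAAAAAAAA)
mem.atmod(0x100, 0x0000FFFF, 0x12345678)
print(hex(mem.load(0x100)))   # 0xaaaa5678
```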

Process Management

The ISA supports...

hardware scheduling, where a process can be added to a dispatching queue with 32 priority levels and round-robin decisions within one priority class (schedprcs)

saving the current state of a process to memory (saveprcs)

switching over to another process (resumprcs)

interprocess communication in the form of semaphores (wait, condwait, signal) and message passing similar to (but simpler than) Unix message queues (receive, condrec, send, sendserv)
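The dispatching policy described for schedprcs (32 priority levels, round robin within a level) can be modelled in a few lines of Python. This is a software sketch, not the hardware mechanism: the Dispatcher class and its method names are hypothetical, and treating a larger number as a higher priority is an assumption.

```python
from collections import deque


class Dispatcher:
    """Toy dispatch queue: 32 priority levels, FIFO (round robin) per level."""

    LEVELS = 32

    def __init__(self):
        self.queues = [deque() for _ in range(self.LEVELS)]

    def schedule(self, process, priority):
        """Append a process to the tail of its priority class."""
        if not 0 <= priority < self.LEVELS:
            raise ValueError("priority must be in 0..31")
        self.queues[priority].append(process)

    def dispatch(self):
        """Remove and return the next process, highest priority first."""
        for priority in range(self.LEVELS - 1, -1, -1):
            if self.queues[priority]:
                return self.queues[priority].popleft()
        return None   # nothing is ready to run


d = Dispatcher()
d.schedule("A", 5)
d.schedule("B", 5)
d.schedule("C", 9)
print(d.dispatch())   # C
print(d.dispatch())   # A
d.schedule("A", 5)    # A's time slice ended: back to the tail of its class
print(d.dispatch())   # B
print(d.dispatch())   # A
```

Re-queuing a process at the tail of its own level after its time slice is what produces the round-robin order within one priority class.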

Decimal

dmovt: moves a decimal from one register to another and checks if it really is a decimal

daddc, dsubc: add/subtract decimals w/ carry.
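How digit-wise decimal addition with carry works on ASCII-encoded digits can be sketched in Python. The details are assumptions made for the illustration: a decimal is taken to be an ASCII digit in the low byte, only the low four bits take part in the addition, the upper bits of the result are copied from the second operand, and the carry is returned as a separate value instead of being placed in the condition code.

```python
MASK32 = 0xFFFFFFFF


def is_decimal(value):
    """The check attributed to dmovt: is the low byte an ASCII digit '0'..'9'?"""
    return 0x30 <= (value & 0xFF) <= 0x39


def daddc(a, b, carry):
    """Add two decimal digits plus carry; return (result word, carry out)."""
    total = (a & 0xF) + (b & 0xF) + carry
    carry_out = 1 if total > 9 else 0
    digit = total - 10 if carry_out else total
    return (b & ~0xF & MASK32) | digit, carry_out


def add_ascii_decimal(x, y):
    """Add two equally long ASCII decimal strings digit by digit."""
    if len(x) != len(y):
        raise ValueError("operands must have the same number of digits")
    carry = 0
    digits = []
    for cx, cy in zip(reversed(x), reversed(y)):   # least significant digit first
        word, carry = daddc(ord(cx), ord(cy), carry)
        digits.append(chr(word))
    return "".join(reversed(digits)), carry


print(add_ascii_decimal("0479", "0385"))   # ('0864', 0)
print(add_ascii_decimal("99", "01"))       # ('00', 1)
```

Because the carry is threaded from one digit to the next, numbers of any length can be added one digit per step, which is the purpose of the "with carry" variants.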

References

http://bitsavers.org/pdf/biin/BiiN_CPU_Architecture_Reference_Man_Jul88.pdf

Alpha

ARM