Computer Architecture Lab/Winter2006/HoeftPirkWeir/InstructionSetI

ARM

A 32-bit RISC processor distinguished above all by its good energy efficiency, which is why it is used in a wide range of embedded systems.

Instruction Set

In normal mode an instruction is 32 bits long, but there is also Thumb mode, which provides a reduced 16-bit instruction set. Only the 32-bit base instruction set is covered here.

Register

There are 15 general-purpose registers, two of which are used as the stack pointer and the return-address register. For more efficient interrupt handling, these two registers exist once per processor mode.

Conditional Execution

Every instruction contains 4 bits that make its execution conditional on the state of the "Program Status Register".
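As a minimal sketch of how such a 4-bit condition field can be evaluated, the following Python function checks a condition against the N, Z, C and V flags of the status register. The mapping of field values to conditions (EQ, NE, and so on) follows the standard ARM encoding and is not spelled out in the text above; the function name `condition_passed` is an illustrative assumption.

```python
def condition_passed(cond: int, n: bool, z: bool, c: bool, v: bool) -> bool:
    """Return True if an instruction with 4-bit condition `cond` would execute.

    n, z, c, v are the Negative, Zero, Carry and oVerflow flags.
    """
    checks = {
        0x0: z,                       # EQ: equal
        0x1: not z,                   # NE: not equal
        0x2: c,                       # CS: carry set / unsigned higher or same
        0x3: not c,                   # CC: carry clear / unsigned lower
        0x4: n,                       # MI: negative
        0x5: not n,                   # PL: positive or zero
        0x6: v,                       # VS: overflow
        0x7: not v,                   # VC: no overflow
        0x8: c and not z,             # HI: unsigned higher
        0x9: (not c) or z,            # LS: unsigned lower or same
        0xA: n == v,                  # GE: signed greater or equal
        0xB: n != v,                  # LT: signed less than
        0xC: (not z) and n == v,      # GT: signed greater than
        0xD: z or n != v,             # LE: signed less or equal
        0xE: True,                    # AL: always
    }
    cond &= 0xF
    if cond not in checks:
        # 0xF is reserved or special depending on the architecture version;
        # this sketch does not model it.
        raise ValueError("condition 0xF is not modelled in this sketch")
    return checks[cond]
```

For example, an instruction with condition `0x0` (EQ) only executes when the Z flag is set, so a compare followed by a conditional instruction replaces a short branch.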

Branches

Relative branches of up to 32 MB as well as absolute branches to an address held in a register are possible. Each also has a variant that saves the current PC in the return-address register.
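The 32 MB range follows from a signed 24-bit word offset in the instruction. The sketch below computes the target of a relative branch under two assumptions taken from the standard ARM encoding rather than from the text above: the offset field is shifted left by two bits, and it is added to the address of the branch instruction plus 8 (the PC value seen by the instruction). The name `branch_target` is illustrative.

```python
def branch_target(instr_addr: int, imm24: int) -> int:
    """Target address of a relative branch located at `instr_addr`.

    imm24 is the raw 24-bit offset field of the instruction.
    """
    imm24 &= 0xFFFFFF
    if imm24 & 0x800000:          # sign-extend the 24-bit field
        imm24 -= 1 << 24
    # Word offset -> byte offset, relative to instruction address + 8.
    return (instr_addr + 8 + (imm24 << 2)) & 0xFFFFFFFF
```

With an offset field of `0xFFFFFE` (that is, -2 words) the branch targets its own address, which is the usual encoding of an endless loop.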

Data Manipulation

Supported operations are logical AND, (exclusive) OR, subtraction (with carry), addition (with carry), and inversion.

One input value comes from the shifter, which provides shifted or rotated 8-bit immediate values or register values. The number of bits to shift is either a 5-bit immediate value or a register value. The second input value (if present) is always a register. The result is stored in a register; for AND, exclusive OR, subtraction and addition there is also a variant that only updates the PSR according to the result.
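To illustrate the rotated 8-bit immediate, here is a small Python sketch. It assumes the standard ARM data-processing encoding, in which a 12-bit operand field holds a 4-bit rotate count and an 8-bit value, and the value is rotated right by twice the rotate count; these details and the helper names are not taken from the text above.

```python
def ror32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    value &= 0xFFFFFFFF
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def decode_immediate(imm12: int) -> int:
    """Expand a 12-bit operand field (4-bit rotate, 8-bit value) to 32 bits."""
    rot = (imm12 >> 8) & 0xF
    imm8 = imm12 & 0xFF
    return ror32(imm8, 2 * rot)


def encode_immediate(value: int):
    """Return a 12-bit operand field for `value`, or None if none exists."""
    for rot in range(16):
        # Undo the rotate-right by rotating left by the same amount.
        imm8 = ror32(value, (32 - 2 * rot) % 32)
        if imm8 <= 0xFF:
            return (rot << 8) | imm8
    return None
```

Constants such as `0xFF000000` fit this scheme, while a value like `0x101` does not, because its set bits cannot be covered by one 8-bit window at an even rotation.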

There are also multiply and multiply-accumulate instructions, with variants that produce 32-bit, 64-bit unsigned, and 64-bit signed results.

Memory Access

Addressing uses a base register and an offset. The offset can be a 12-bit immediate value, a register, or a shifted register. In addition, the base register can be replaced after the access by the value of base register + offset, which makes it easy to iterate over blocks of memory. There are also instructions that load or store several words at once from or to an arbitrary set of registers.
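The base-register update described above can be sketched with a tiny simulation. The dictionaries `memory` and `regs`, the helper name `load_post_indexed`, and the sample addresses are all hypothetical; the helper models a load that uses the base register as the address and then replaces it with base + offset.

```python
def load_post_indexed(memory, regs, rd, rn, offset):
    """Load regs[rd] from the address in regs[rn], then add offset to regs[rn]."""
    regs[rd] = memory[regs[rn]]
    regs[rn] = (regs[rn] + offset) & 0xFFFFFFFF


# Hypothetical word-addressed memory block with three 32-bit values.
memory = {0x1000: 10, 0x1004: 20, 0x1008: 30}
regs = {"r0": 0, "r1": 0x1000, "r2": 0}

# Sum the block: each load also advances the base register by one word.
for _ in range(3):
    load_post_indexed(memory, regs, "r0", "r1", 4)
    regs["r2"] += regs["r0"]
```

After the loop the base register points just past the block, so no separate address-increment instruction is needed inside the loop.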

Other Instructions

The SWP instruction loads and stores a value in memory within a single instruction, which makes it easier to synchronize processes. The software interrupt instruction gives user-mode programs access to privileged code. A breakpoint instruction is also available.

PowerPC 405

PowerPC 405 is a 32-bit implementation of the PowerPC embedded-environment architecture providing up to 400 MHz and 608 DMIPS performance.

Key Features

32-bit architecture
Thirty-two 32-bit general purpose registers (GPRs), r0~r31
A number of 32-bit special purpose registers (SPRs)
- Most are accessed only by privileged software
- A few are accessed by all software
Flexible memory management
Enhanced debug capabilities
64-bit time base
3 timers: programmable interval timer, fixed interval timer, and watchdog timer
5-stage pipeline with single-cycle execution of most instructions, including loads and stores
Support for unaligned loads and unaligned stores to cache arrays, main memory and on-chip memory
Minimized interrupt latency
Integrated instruction-cache and data-cache:
- 16 KB each, 2-way set associative
- 8 words (32 bytes) per cacheline
Support for both big-endian and little-endian addressing
4 GB of flat (non-segmented) address space
Dual-level interrupt

Important Registers

General Purpose Registers

32 GPRs (r0~r31) are the source and destination of all integer operations and are the source for address operands for all load/store operations. They also provide access to SPRs.

Floating Point Registers

The floating-point registers comprise the FPRs and the Floating-Point Status and Control Register (FPSCR). The 32 FPRs (FPR0 - FPR31) are the source and destination operands of all floating-point operations and can contain 32-bit and 64-bit signed and unsigned integer values, as well as single- and double-precision floating-point values. They also provide access to the FPSCR.

The FPSCR captures status and exceptions resulting from floating-point operations, and the FPSCR also provides control bits for enabling specific exception types, as well as for selecting one of the four rounding modes. Access to the FPSCR is through the FPRs.

Embedded microprocessors are frequently implemented without direct hardware support for the PPC floating-point instruction set, or only provide an interface to attach floating-point hardware. Many applications have little or no need for floating-point arithmetic, and software emulation of PPC floating-point instruction execution is usually more than adequate. The chip area and power savings of not implementing floating-point in hardware can be critical.

Special Purpose Registers

SPRs give status and control of resources within the processor core.

Instruction Address Register (IAR): known to programmers as the program counter or instruction pointer.
Link Register (LR): holds the address to return to at the end of a function call.
Fixed-Point Exception Register (XER): contains carry and overflow information from integer arithmetic operations, as well as carry input to certain integer arithmetic operations and the number of bytes to transfer during load and store string instructions, lswx and stswx.
Count Register (CTR): holds a loop counter that is decremented on certain branch operations.
Condition Register (CR): contains 8 fields, where each field is 4 bits. When an instruction's Rc bit (bit 31) is 1:
- In integer operations, CR field 0 is set to reflect the result of the operation: Equal, Greater Than, Less Than, and Summary Overflow.
- In floating-point operations, the CR field 1 is set to reflect the state of the exception status bits in the FPSCR.

Any CR field can be the target of an integer or floating-point comparison instruction. CR field 0 is also set to reflect the result of a conditional store instruction (stwcx. or stdcx.). Certain instructions can manipulate the CR.

Processor Version Register (PVR): a 32-bit read-only register that identifies the version and revision level of the processor.
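The CR field 0 update for integer operations can be sketched as follows. The bit order within the field (Less Than, Greater Than, Equal, Summary Overflow, from most to least significant) is the standard PowerPC layout and is an addition to the description above; the function name `cr0_from_result` is illustrative, and the Summary Overflow bit is simply passed in as it would be copied from the XER.

```python
def cr0_from_result(result: int, xer_so: bool = False) -> int:
    """Return the 4-bit CR field 0 for a 32-bit integer result.

    Bit layout (MSB to LSB): LT, GT, EQ, SO.
    """
    result &= 0xFFFFFFFF
    # Interpret the 32-bit pattern as a signed value compared against zero.
    signed = result - (1 << 32) if result & 0x80000000 else result
    lt = signed < 0
    gt = signed > 0
    eq = signed == 0
    return (lt << 3) | (gt << 2) | (eq << 1) | int(xer_so)
```

A conditional branch following a recording instruction can then test a single bit of this field instead of requiring a separate compare.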

Instruction Set

The POWER architecture has over two hundred defined instructions. Most instructions execute in a single cycle and typically perform a single operation (such as loading a value from memory into a register, or storing a register to memory). As in most other 32-bit RISC ISAs, all PPC instructions are four bytes long and word aligned. Bits 0:5 contain the primary opcode. Some instruction forms define an extended opcode field for specifying additional instructions.

Four Primary Formats

Register-register: Op⁶ | Rd⁵ | Rs1⁵ | Rs2⁵ | Opx¹¹
Register-immediate: Op⁶ | Rd⁵ | Rs1⁵ | Const¹⁶
Branch: Op⁶ | Opx⁵ | Rs1⁵ | Const¹⁴ | Opx²
Jump/call: Op⁶ | Const²⁴ | Opx²
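As a rough illustration of the register-register and register-immediate formats, the sketch below splits a 32-bit instruction word into the fields listed above, reading the superscripts as field widths in bits and numbering bits from the most significant end (bit 0), as the text does for the primary opcode. The helper names are illustrative, and the two sample words are the encodings of `addi r3,r4,100` and `add r3,r4,r5`.

```python
def bits(word: int, start: int, length: int) -> int:
    """Extract `length` bits starting at big-endian bit `start` (bit 0 = MSB)."""
    shift = 32 - start - length
    return (word >> shift) & ((1 << length) - 1)


def decode_register_register(word: int) -> dict:
    """Split a word into Op(6), Rd(5), Rs1(5), Rs2(5), Opx(11)."""
    return {
        "op": bits(word, 0, 6),
        "rd": bits(word, 6, 5),
        "rs1": bits(word, 11, 5),
        "rs2": bits(word, 16, 5),
        "opx": bits(word, 21, 11),
    }


def decode_register_immediate(word: int) -> dict:
    """Split a word into Op(6), Rd(5), Rs1(5), Const(16)."""
    return {
        "op": bits(word, 0, 6),
        "rd": bits(word, 6, 5),
        "rs1": bits(word, 11, 5),
        "const": bits(word, 16, 16),
    }


addi_fields = decode_register_immediate(0x38640064)  # addi r3,r4,100
add_fields = decode_register_register(0x7C642A14)    # add r3,r4,r5
```

The extended opcode (Opx) of the register-register form is what distinguishes the many instructions that share primary opcode 31.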

Instruction Examples

Integer arithmetic instructions

add, add carrying - addc, add extended - adde, add immediate carrying - addic, add to minus one extended - addme
divide word - divw, divide word unsigned - divwu
multiply low immediate - mulli, multiply high word - mulhw, multiply cross halfword to word unsigned - mulchwu
subtract from - subf, subtract from extended - subfe, subtract from zero extended - subfze
negate - neg

Logical, rotate, and shift instructions

and, or, nor, xor, equivalent - eqv, and with complement - andc
count leading zeros doubleword - cntlzd, extend sign halfword - extsh
rotate left doubleword then clear - rldc, rotate left word then and with mask - rlwnm
shift right doubleword - srd, shift right word immediate - srwi

Floating-point and FPSCR manipulation instructions

FP move - fmr, FP negate - fneg, FP compare ordered - fcmpo, FP select - fsel
move to CR from FPSCR - mcrfs, move to FPSCR bit 1 - mtfsb1

Branch instructions

branch - b
branch conditional - bc
branch conditional to count register - bcctr
branch conditional to link register - bclr

Uniqueness

Link Register: PPC puts the return address into the link register instead of one of the GPRs. It makes the return jump faster since the hardware need not go through the register read pipeline stage for return jumps.
Count Register: It is used for loops with a fixed number of iterations, so the branch hardware can quickly determine whether a branch is likely to be taken.
Register 0: r0 is not hardwired to the value 0. It cannot be used as a base register, but in base+index addressing it can be used as the index.
Load multiple and store multiple save or restore up to 32 registers in a single instruction.
CNTLZ counts leading zeros.
Logical shifted immediate instructions shift the 16-bit immediate to the left 16 bits before performing AND, OR, or XOR.
Etc.
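One common use of the logical shifted immediate instructions is building a full 32-bit constant from two 16-bit halves. The sketch below models `ori` and `oris` as plain Python functions on 32-bit values; the helper `load_constant` and its starting value of zero are illustrative assumptions, not a sequence taken from the references.

```python
def ori(rs: int, uimm: int) -> int:
    """OR the register value with a 16-bit immediate."""
    return (rs | (uimm & 0xFFFF)) & 0xFFFFFFFF


def oris(rs: int, uimm: int) -> int:
    """OR the register value with a 16-bit immediate shifted left 16 bits."""
    return (rs | ((uimm & 0xFFFF) << 16)) & 0xFFFFFFFF


def load_constant(value: int) -> int:
    """Build a 32-bit constant in a cleared register using oris then ori."""
    reg = 0
    reg = oris(reg, value >> 16)     # upper halfword
    reg = ori(reg, value & 0xFFFF)   # lower halfword
    return reg
```

Because every instruction is 32 bits long, a 32-bit constant cannot fit in one instruction, so two instructions with 16-bit immediates are needed.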

References

A developer's guide to the POWER architecture, 03/2004 [1]
IBM Product Overview - PowerPC 405 CPU Core, 09/2006 [2]
PowerPC Processor Reference Guide, 09/2003 [3]
Computer Architecture: A Quantitative Approach, 2nd Edition, J. Hennessy & D. Patterson

PICmicro

PICmicro is a family of microcontrollers made by Microchip Technology with a reduced instruction set (RISC). They are also known simply as PIC microcontrollers.

Features

80 RAM locations, implemented as 8-bit registers: 12 special-purpose registers and 68 general-purpose registers.
11 bit instruction address
8 bit literals
8/16 bit Modified Harvard Architecture
Flash and ROM Memory options
I/O Ports
8/16 bit Timers
Several sleep modes
A/D converters
Serial peripherals
Voltage comparators
Internal EEPROM Memory
Motor control
USB, Ethernet, CAN, LIN, ... controller support

PICmicro MID-RANGE MCU FAMILY

Each midrange instruction is a 14-bit word divided into an OPCODE which specifies the instruction type and one or more operands which further specify the operation of the instruction. There are three basic categories:

Byte-oriented operations
Bit-oriented operations
Literal-oriented operations

All instructions execute in a single instruction cycle unless a conditional test is true or the program counter is changed as a result of the instruction. In these cases execution takes two instruction cycles, with the second cycle executed as a NOP. One instruction cycle consists of four oscillator periods, so for an oscillator frequency of 4 MHz the normal instruction execution time is 1 µs. If a conditional test is true or the program counter is changed as a result of an instruction, the instruction execution time is 2 µs.
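The timing rule can be written as a one-line calculation. This sketch assumes that one instruction cycle takes four oscillator periods, which is what the 4 MHz and 1 µs figures imply; the function name and the `changes_pc` parameter are illustrative.

```python
def instruction_time_us(fosc_hz: float, changes_pc: bool = False) -> float:
    """Execution time in microseconds for one mid-range instruction.

    changes_pc is True for a taken skip or any instruction that modifies
    the program counter, which costs a second instruction cycle.
    """
    cycle_us = 4 * 1_000_000 / fosc_hz   # four oscillator periods per cycle
    return cycle_us * (2 if changes_pc else 1)
```

For a 4 MHz oscillator this gives 1 µs for an ordinary instruction and 2 µs for a taken skip or a jump.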

General instruction format

The possible instruction formats are listed in the figure.

The opcode portion of the instruction word varies from 3 to 6 bits. This is what allows the midrange instruction set to have 35 instructions.
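The variable opcode width can be made concrete with a small decoder. The field positions used here (a 7-bit file register address, a 3-bit bit number, an 8-bit literal, an 11-bit jump address, and the use of the top two bits to select the format) follow the standard mid-range encoding and are assumptions relative to the text, which refers to a figure. The function name and the dictionary keys are illustrative, and control instructions without operands are not treated separately.

```python
def decode_midrange(word: int) -> dict:
    """Split a 14-bit mid-range instruction word into opcode and operands."""
    word &= 0x3FFF
    top = word >> 12                      # top two bits select the format
    if top == 0b00:                       # byte-oriented: 6-bit opcode, d, f
        return {"kind": "byte", "opcode": (word >> 8) & 0x3F,
                "d": (word >> 7) & 1, "f": word & 0x7F}
    if top == 0b01:                       # bit-oriented: 4-bit opcode, b, f
        return {"kind": "bit", "opcode": (word >> 10) & 0xF,
                "b": (word >> 7) & 0x7, "f": word & 0x7F}
    if top == 0b10:                       # call/goto: 3-bit opcode, 11-bit k
        return {"kind": "call/goto", "opcode": (word >> 11) & 0x7,
                "k": word & 0x7FF}
    # literal-oriented: 6-bit opcode, 8-bit k
    return {"kind": "literal", "opcode": (word >> 8) & 0x3F, "k": word & 0xFF}
```

For instance, the word `0x1683` decodes as a bit-oriented instruction on bit 5 of file register 3, and `0x2923` as a jump to address `0x123`.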

Register manipulation

The instruction set allows reading and writing of all file registers, including special function registers. If an instruction writes to the STATUS register, the Z, C, DC and OV bits may be set or cleared as a result of the instruction and overwrite the original data bits written. All bit manipulation instructions first read the entire register, operate on the selected bit, and then write the result back to the specified register (read-modify-write, R-M-W).

Supported instructions

There are several groups of instructions. They can be divided into:

Copy value from/to file register or literal to/from W

Example:
Mnemonic: movf fr, d
Description: Move file register
Function: fr => d

Logic / arithmetic instructions with a file register and W

Example:
Mnemonic: oprwf fr, d
Description: Logic / arithmetic operation with a file register and W
Function: fr opr W => d

Logic / arithmetic instructions with a literal and W

Example:
Mnemonic: oprlw k
Description: Logic / arithmetic operation with a literal and W
Function: k opr W => W

One operand logic / arithmetic instructions

Example:
Mnemonic: clrw
Description: Clear accumulator W
Function: 0 => W

Branch, Skip and Call instructions

Example:
Mnemonic: goto addr
Description: Branch to addr
Function: addr => PC(0:10)

Built-in macros for commonly used logic / arithmetic operations

Example:
Mnemonic: addcf fr, d
Description: Add carry to fr
Function: btfsc 3,0; incf fr,d
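The addcf macro expands to a skip followed by an increment. The sketch below simulates that two-instruction sequence, assuming that file register 3 is STATUS and that its bit 0 is the carry flag, as the `btfsc 3,0` operand suggests. The function name, the `file_regs` dictionary, and the return convention are illustrative, and the flag updates that the real increment performs are not modelled.

```python
def addcf(file_regs: dict, w: int, fr: int, d: int, status_addr: int = 3) -> int:
    """Simulate 'btfsc 3,0' followed by 'incf fr,d' and return the new W.

    d = 1 stores the result in the file register, d = 0 stores it in W.
    """
    carry = file_regs[status_addr] & 1
    if not carry:
        # btfsc skips the next instruction when the tested bit is clear.
        return w
    result = (file_regs[fr] + 1) & 0xFF   # incf wraps at 8 bits
    if d:
        file_regs[fr] = result
    else:
        w = result
    return w


# Hypothetical register file: STATUS at 3 with carry set, data at 0x20.
file_regs = {3: 0x01, 0x20: 0xFF}
w_after = addcf(file_regs, w=0x10, fr=0x20, d=1)
```

The macro is useful for multi-byte addition, where the carry out of the low byte has to be added to the next byte.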

References