## Assembly for Architecture

Lab Tonight! 7-8:15pm Edmunds 105



[Figure: assembly program for the Motorola MC6800 microprocessor (1974). Sample listing line: C035 7E C0 AF HEXERR JMP CTRL RETURN TO CONTROL LOOP]

Image Credit: Michael Holley, https://en.wikipedia.org/wiki/Assembly_language

#### Outline

- Interpreting and constructing an assembly instruction
- Assembly instruction formats
- Assembly instruction classes
- Assembly language trade-offs

I apologize in advance: the first part of today is a lot of me speaking at you. There will be reinforcing exercises, and please ask questions as they come up!

#### Memory

| 13     | 03     | c3     | fe     | 00     | 00     | 00     | ef     | be     | ad     | de     | ee     | ff     | c0     |
|--------|--------|--------|--------|--------|--------|--------|--------|--------|--------|--------|--------|--------|--------|
| 0xff00 | 0xff01 | 0xff02 | 0xff03 | 0xff04 | 0xff05 | 0xff06 | 0xff07 | 0xff08 | 0xff09 | 0xff0a | 0xff0b | 0xff0c | 0xff0d |

What is the 4-byte instruction at address 0xff00?

How should the processor interpret 0xfec30313?
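To see where 0xfec30313 comes from, here is a minimal sketch that reads four bytes out of the memory table and combines them into one word. It assumes little-endian byte order (the byte at the lowest address is the least significant), which is what the value on the slide implies. The names `memory`, `BASE`, and `fetch_word` are made up for this illustration.

```python
# Bytes copied from the memory table, starting at address 0xff00.
memory = bytes([0x13, 0x03, 0xc3, 0xfe, 0x00, 0x00, 0x00,
                0xef, 0xbe, 0xad, 0xde, 0xee, 0xff, 0xc0])
BASE = 0xff00  # address of memory[0]


def fetch_word(address: int) -> int:
    """Read 4 bytes at `address` and combine them in little-endian order."""
    offset = address - BASE
    return int.from_bytes(memory[offset:offset + 4], byteorder="little")


word = fetch_word(0xff00)
print(hex(word))  # 0xfec30313
```

The same helper applied at 0xff07 gives 0xdeadbeef, which is an easy way to check the byte order by eye.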

### Interpreting an Instruction

- Instructions are composed of an operation segment (sometimes referred to as the "opcode")
- Remaining fields in the instruction are defined by the operation type
- Depending on the instruction set, the remaining segments (fields) may or may not follow strict formats
- Register names in assembly (e.g., t0) must be translated to the register numbers encoded in the instruction's bits

#### Sample assembly for reference

add t0, t1, t2

addi x4, x4, 15

bne a1, a2, 0xffff
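As a rough illustration of "constructing" an instruction, the sketch below packs the first two sample instructions into 32-bit words. The field layouts (R-type and I-type) and opcode values are taken from the RV32I base specification rather than from these slides, and the helper names `encode_r_type` and `encode_i_type` are hypothetical.

```python
def encode_r_type(opcode, rd, funct3, rs1, rs2, funct7):
    """Pack R-type fields: funct7 | rs2 | rs1 | funct3 | rd | opcode."""
    return ((funct7 << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | (rd << 7) | opcode)


def encode_i_type(opcode, rd, funct3, rs1, imm):
    """Pack I-type fields: imm[11:0] | rs1 | funct3 | rd | opcode."""
    return (((imm & 0xfff) << 20) | (rs1 << 15)
            | (funct3 << 12) | (rd << 7) | opcode)


# add t0, t1, t2  ->  t0 = x5, t1 = x6, t2 = x7, major opcode OP
ADD_WORD = encode_r_type(opcode=0b0110011, rd=5, funct3=0b000,
                         rs1=6, rs2=7, funct7=0b0000000)

# addi x4, x4, 15  ->  major opcode OP-IMM
ADDI_WORD = encode_i_type(opcode=0b0010011, rd=4, funct3=0b000,
                          rs1=4, imm=15)

print(f"{ADD_WORD:#010x}")   # 0x007302b3
print(f"{ADDI_WORD:#010x}")  # 0x00f20213
```

Note that the register names never appear in the encoded word, only their numbers do.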

#### Interpreting a RISC-V Instruction

RISC-V assembly instructions follow strict formats

| inst[4:2] | 000    | 001      | 010   | 011      | 100    | 101    | 110       |
|-----------|--------|----------|-------|----------|--------|--------|-----------|
| inst[6:5] |        |          |       |          |        |        |           |
| 00        | LOAD   | LOAD-FP  |       |          | OP-IMM |        | OP-IMM-32 |
| 01        | STORE  | STORE-FP | AMO   | MISC-MEM | OP     | LUI    | OP-32     |
| 10        | MADD   | MSUB     | NMSUB | NMADD    | OP-FP  |        |           |
| 11        | BRANCH | J        | JALR  | JAL      |        | SYSTEM |           |



Table 2: RISC-V base opcode map, inst[1:0]=11

https://www2.eecs.berkeley.edu/Pubs/TechRpts/2011/EECS-2011-62.pdf

#### Opcodes in Other ISAs

x86: http://ref.x86asm.net/coder32.html

ARM: https://iitd-plos.github.io/col718/ref/arm-instructionset.pdf

Power10: https://files.openpower.foundation/s/9izgC5Rogi5Ywmm

#### Assembly Instruction Formats

- Depending on the instruction set, different instructions may have different lengths
- When a processor wants to "fetch" an instruction to execute, it needs to know how many bytes to fetch from memory
- The processor can *over-approximate* the size of the instruction as the largest possible instruction size, and then use only the relevant bits once the instruction has been decoded

#### Assembly Instruction Formats (RISC-V Case Study)

https://www2.eecs.berkeley.edu/Pubs/TechRpts/2011/EECS-2011-62.pdf

https://www2.eecs.berkeley.edu/Pubs/TechRpts/2011/EECS-2015-157.pdf

- Instructions in the standard format are 32 bits long
- Instructions in the compressed format are 16 bits long
- All standard instructions have 0b11 as their two least significant bits; these bits are the low end of the opcode field, and they are how the processor tells a standard instruction apart from a compressed one
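The fetch idea above can be sketched as follows: read the largest size (4 bytes), look at the two lowest bits, and keep only the bytes that belong to this instruction. This is a simplification that only handles the two lengths on this slide and ignores any longer encodings. The function names and the tiny `program` are made up for illustration, and little-endian byte order is assumed.

```python
def instruction_length(first_halfword: int) -> int:
    """Return the instruction length in bytes, from its lowest 16 bits.

    Only the 16-bit and 32-bit formats are handled in this sketch.
    """
    return 4 if (first_halfword & 0b11) == 0b11 else 2


def fetch(memory: bytes, pc: int) -> tuple[int, int]:
    """Over-approximate by reading 4 bytes, then keep only the relevant bits."""
    chunk = int.from_bytes(memory[pc:pc + 4], byteorder="little")
    length = instruction_length(chunk & 0xffff)
    if length == 2:
        chunk &= 0xffff  # drop the bytes that belong to the next instruction
    return chunk, length


# A compressed instruction (0x0001) followed by a standard one (0xfec30313).
program = bytes([0x01, 0x00, 0x13, 0x03, 0xc3, 0xfe])

pc = 0
while pc < len(program):
    inst, length = fetch(program, pc)
    print(f"pc={pc}: {inst:#x} ({length} bytes)")
    pc += length
```

Because the length bits sit in the lowest-addressed byte, the processor can decide how far to advance the pc before it decodes anything else.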

#### Chat with your neighbor(s)!

What is the operation for the RISC-V instruction "0x3e820293"?

| inst[4:2] | 000    | 001      | 010   | 011      | 100    | 101    | 110       |
|-----------|--------|----------|-------|----------|--------|--------|-----------|
| inst[6:5] |        |          |       |          |        |        |           |
| 00        | LOAD   | LOAD-FP  |       |          | OP-IMM |        | OP-IMM-32 |
| 01        | STORE  | STORE-FP | AMO   | MISC-MEM | OP     | LUI    | OP-32     |
| 10        | MADD   | MSUB     | NMSUB | NMADD    | OP-FP  |        |           |
| 11        | BRANCH | J        | JALR  | JAL      |        | SYSTEM |           |

Table 2: RISC-V base opcode map, inst[1:0]=11
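Once you have tried it by hand, here is one way to check your answer. The sketch transcribes the table above into a lookup keyed by inst[6:5] and inst[4:2]. The I-type field positions in `decode_i_type` come from the RV32I base specification, not from this slide, and both function names are hypothetical.

```python
# Rows are inst[6:5], columns are inst[4:2]; None marks an empty table cell.
OPCODE_MAP = {
    0b00: ["LOAD", "LOAD-FP", None, None, "OP-IMM", None, "OP-IMM-32"],
    0b01: ["STORE", "STORE-FP", "AMO", "MISC-MEM", "OP", "LUI", "OP-32"],
    0b10: ["MADD", "MSUB", "NMSUB", "NMADD", "OP-FP", None, None],
    0b11: ["BRANCH", "J", "JALR", "JAL", None, "SYSTEM", None],
}


def major_opcode(inst: int):
    """Look up the major opcode of a standard 32-bit instruction."""
    if inst & 0b11 != 0b11:
        raise ValueError("inst[1:0] != 11, so this table does not apply")
    row = (inst >> 5) & 0b11
    col = (inst >> 2) & 0b111
    if col == 0b111:
        return None  # this column is not shown in the table
    return OPCODE_MAP[row][col]


def decode_i_type(inst: int) -> dict:
    """Split an I-type instruction into fields (layout assumed from RV32I)."""
    imm = inst >> 20
    if imm & 0x800:  # sign-extend the 12-bit immediate
        imm -= 0x1000
    return {
        "rd": (inst >> 7) & 0x1f,
        "funct3": (inst >> 12) & 0b111,
        "rs1": (inst >> 15) & 0x1f,
        "imm": imm,
    }


inst = 0x3e820293
print(major_opcode(inst))   # OP-IMM
print(decode_i_type(inst))  # {'rd': 5, 'funct3': 0, 'rs1': 4, 'imm': 1000}
```

Within OP-IMM, funct3 = 000 selects addi, so this word corresponds to addi x5, x4, 1000.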

#### Chat with your neighbor(s)!

Why do you think the opcode field comprises the LSBs in RISC-V? How is this related to endianness?

#### Instruction Class Formats

- Data transfers: loads and stores between memory and registers
- Control logic: conditional and unconditional jumps (support for if-statements, loops, function calls)
- Computations: add two variables together, subtract a constant, perform bitwise operations
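To make the three classes concrete, here is a small sketch that sorts a few RV32I mnemonics into them. The table is deliberately incomplete, and the names `CLASS_BY_MNEMONIC` and `classify` are invented for this example.

```python
# A partial, hand-picked mapping from RV32I mnemonics to instruction class.
CLASS_BY_MNEMONIC = {
    "lw": "data transfer", "sw": "data transfer",
    "lb": "data transfer", "sb": "data transfer",
    "beq": "control", "bne": "control", "jal": "control", "jalr": "control",
    "add": "computation", "addi": "computation", "sub": "computation",
    "and": "computation", "or": "computation", "xor": "computation",
}


def classify(line: str) -> str:
    """Return the class of one assembly line, based on its mnemonic."""
    mnemonic = line.split()[0].lower()
    return CLASS_BY_MNEMONIC.get(mnemonic, "unknown")


for line in ["lw t0, 0(sp)", "add t0, t1, t2", "bne a1, a2, 0xffff"]:
    print(f"{line:22} -> {classify(line)}")
```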

#### Registers

- Small, fast memory (usually the size of a word, or default data size for a specific architecture)
- Used by the CPU
  - Usually addressed separately from main memory
  - Some have special purposes, some are general purpose (GPR)
- In RISC-V, all arithmetic and control computations are done on registers
  - To compute on memory, first load the data from memory into a register
  - Called a "register-register" architecture (other examples: ARM, MIPS)
  - Contrast with "register-memory" architecture (example: x86)
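The register-register idea can be sketched with a toy machine: to add two values that live in memory, we load both into registers, add the registers, and store the result back. This is a simplified model, not real RISC-V semantics (for example, a real lw computes its address from a base register plus an offset), and every name here is hypothetical.

```python
regs = [0] * 32                              # x0..x31
memory = {0x100: 7, 0x104: 35, 0x108: 0}     # toy word-addressed memory


def write_reg(rd: int, value: int) -> None:
    """Write a 32-bit value to a register; writes to x0 are ignored."""
    if rd != 0:
        regs[rd] = value & 0xffffffff


def lw(rd: int, address: int) -> None:
    write_reg(rd, memory[address])           # memory -> register


def sw(rs: int, address: int) -> None:
    memory[address] = regs[rs]               # register -> memory


def add(rd: int, rs1: int, rs2: int) -> None:
    write_reg(rd, regs[rs1] + regs[rs2])     # registers only, no memory


# memory[0x108] = memory[0x100] + memory[0x104]
lw(5, 0x100)
lw(6, 0x104)
add(7, 5, 6)
sw(7, 0x108)
print(memory[0x108])  # 42
```

On a register-memory architecture, the add itself could take one of its operands directly from memory, so the separate loads would not all be needed.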

### Registers (continued...)

- In RV32I, there are 31 GPRs (x1-x31), and each register has a conventional role
- x0 is hardwired to 0
- pc (program counter): holds the address of the current instruction

| Register | ABI Name | Description                      | Saver  |
|----------|----------|----------------------------------|--------|
| x0       | zero     | Hard-wired zero                  |        |
| x1       | ra       | Return address                   | Caller |
| x2       | sp       | Stack pointer                    | Callee |
| x3       | gp       | Global pointer                   | -      |
| x4       | tp       | Thread pointer                   | -      |
| x5-7     | t0-2     | Temporaries                      | Caller |
| x8       | s0/fp    | Saved register/frame pointer     | Callee |
| x9       | s1       | Saved register                   | Callee |
| x10-11   | a0-1     | Function arguments/return values | Caller |
| x12-17   | a2-7     | Function arguments               | Caller |
| x18-27   | s2-11    | Saved registers                  | Callee |
| x28-31   | t3-6     | Temporaries                      | Caller |
| f0-7     | ft0-7    | FP temporaries                   | Caller |
| f8-9     | fs0-1    | FP saved registers               | Callee |
| f10-11   | fa0-1    | FP arguments/return values       | Caller |
| f12-17   | fa2-7    | FP arguments                     | Caller |
| f18-27   | fs2-11   | FP saved registers               | Callee |
| f28-31   | ft8-11   | FP temporaries                   | Caller |

Table 18.2: RISC-V calling convention register usage.
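An assembler has to turn the ABI names in this table back into register numbers. A minimal sketch of that translation for the integer registers follows; `build_abi_map` and `register_number` are names invented for this example.

```python
def build_abi_map() -> dict[str, int]:
    """Map integer-register ABI names to x-register numbers, per the table."""
    names = {"zero": 0, "ra": 1, "sp": 2, "gp": 3, "tp": 4, "fp": 8}
    for i in range(3):
        names[f"t{i}"] = 5 + i       # t0-2  -> x5-7
    for i in range(3, 7):
        names[f"t{i}"] = 25 + i      # t3-6  -> x28-31
    names["s0"] = 8
    names["s1"] = 9
    for i in range(2, 12):
        names[f"s{i}"] = 16 + i      # s2-11 -> x18-27
    for i in range(8):
        names[f"a{i}"] = 10 + i      # a0-7  -> x10-17
    return names


ABI_MAP = build_abi_map()


def register_number(name: str) -> int:
    """Accept either an ABI name (t0) or a numeric name (x5)."""
    if name.startswith("x") and name[1:].isdigit():
        number = int(name[1:])
        if 0 <= number <= 31:
            return number
        raise ValueError(f"no such register: {name}")
    return ABI_MAP[name]


print(register_number("t0"), register_number("a2"), register_number("x4"))  # 5 12 4
```

Note that s0 and fp are two names for the same register (x8), so the map has 33 names but only 32 distinct numbers.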

#### Chat with your neighbor(s)!

How many usable opcodes are in RISC-V assembly? What is advantageous about this encoding methodology? Problematic?

#### Takeaways

- Instructions are encoded with an "opcode", which tells a processor how to decode the rest of the instruction
- Different instruction types follow similar formats, and register fields tend to appear in the same positions
- Across instruction sets, various opcode conventions exist, but decoding an instruction generally starts with decoding its opcode