3 December 2024

Chapter 4

by Arpon Sarker

The Elements of Computing Systems Chapter 4 - Machine Language

Introduction

A computer can be described abstractly from its machine language capabilities. A machine language is an agreed-upon formalism, designed to code low-level programs as a series of machine instructions (which manipulates memory using a processor and a set of registers). The goals of this language is direct execution and total control over the hardware platform as opposed to the generality and expressiveness of high-level languages.

General Machine Language

The term memory refers to the collection of hardware devices that store data and instructions in the computer. A contiguous array of cells of some fixed width called words/locations have unique addresses each. The processor refers to the CPU which is capable of arithmetic and logical operations, memory access operations, and control operations (branching). The outputs are either stored in registers of memory locations. Registers allow the processor to select and manipulate values from high-speed local memory, reducing the slow operations of memory access. For a 16-bit machine language, the machine language’s instruction set tells us what the 16 bits means and assembly languages allow us to read/generate these instructions using mnemonics. They are capable of arithmetic and logic operations, memory access using direct (loading the address), immediate (loading the literal value), and indirect (loading a memory location that holds the required address - useful for pointers). Branching use useful for repetition, conditional execution, and subroutine calling (jump to first command of code segment).

Hack Machine Language

The Hack computer is a 16-bit von Neumman platform, which consists of a CPU, two separate memory modules for instruction memory (pre-burned onto ROM chip) and data memory, and two memory-mapped I/O devices: screen and keyboard. For memory address spaces, both memories are 16-bit wide with a 15-bit address space, meaning there are 32K (2^15) registers that hold $2^16$ values (16-bit words).

There are two 16-bit registers D and A where D is for only data values and A is for data and address values. Since Hack instructions are 16 bits but the address is 15 bits, the address goes to the A register for the instruction to act on rather than having the address value as an operand inside the instruction. This is also the same for jump commands where the jump goes to the address specified in the A register. Note: each instruction is done in one clock cycle. The A-instruction sets the A register to a 15-bit value where the MSB is just set to 0 and in assembly this is just @val. The C-instruction is just 111 followed by an a bit for whether the D register interacts with the A or M register, c1..c6 which is the opcode or computation specification, d1..d3 which specifies where to store the output of the computation (D, A, or M[A] or any combination of these), and j1..j3 which is the specific jump operation used. To prevent conflicting uses of the A register where we don’t want to jump to a memory location holding a data value, we ensure that C-instructions that may cause a jump should not contain a reference to M.

$dest=comp;jump$ can be $D;jeq$ where jeq tells us whether D is 0 and D the comp section is what the ALU spits out so we can conditionally check something about it. $0;jmp$ is an unconditional jump as the ALU outputs 0 which is arbitrary and just jumps to whatever address the A register is storing.

Assembly’s symbols can include predefined symbols, virtual registers(R0..R15), predefined pointers(sp, lcl, etc.) which refer to ram address 0 to 4 which is quite redundant as they can be selected with R0 to R4 but is useful for implementing the virtual machine, I/O pointers (SCREEN and KBD), label symbols which serve as a destination for jump commands, and variable symbols.

Binary code files consist of 16 0s and 1s in ASCII for each line. Once loaded, the file’s nth line is stored in address n of instruction memory where the count of both program lines and instruction memory start at 0.

I/O Handling: The Hack Computer includes a black-and-white screen with 256 rows of 512 pixels per row. The screen’s contents are represented by an 8K memory map that starts at RAM address of 0x4000. Each row starting at the top left corner is represented in the RAM by 32 consecutive 16-bit words. Thus at row r and column c, the corresponding word is located at $RAM[16384 + r\cdot 32 + c/16]$.

Why a stack of 32 consecutive 16-bit words?

One row has 512 pixels so $32\cdot 16 = 512$. I found this to be quite hard to grasp first because I was thinking in terms of possibilities which also works since $2^{512} = 2^{16 \cdot 32}$.

Why an 8K Memory Map?

We want to be able to hold $256 \cdot 512 = 131,072$ pixels and the minimum amount of memory needed to do this is with an 8K memory map. For instance, $131,072 \text{bits } \div 8 \text{ bits/byte } = 16,384 \text{ bytes}$ and $16,384 \text{ bytes } \div 2 \text { bytes/word } = 8,192 \text { words}$ and hence, we need at minimum an 8K memory map.

tags: compsci - computer_architecture - programming

Hacker theme

Hacker is a theme for GitHub Pages.