4.1 Central Processing Unit (CPU) Architecture
Von Neumann Model and Stored Program Concept
Von Neumann Architecture
Definition: A computer architecture proposed by John von Neumann in 1945 where instructions and data are stored in the same memory and accessed via the same bus.
Key Characteristics:
- Single memory space: Both program instructions and data stored in the same memory
- Single bus system: One set of address/data/control buses for both instructions and data (von Neumann bottleneck)
- Sequential execution: Instructions fetched and executed one at a time in sequence (unless branching)
- Stored program concept: Program instructions are stored in memory like data and can be modified
Stored Program Concept
Definition: The idea that both program instructions and data are stored in memory and the computer can manipulate them similarly.
Key Principles:
- Instructions as data: Program instructions are represented in binary and stored in memory just like data
- Fetch-Execute cycle: CPU fetches instructions from memory, decodes them, and executes them
- Programmability: Programs can be loaded into memory, modified, and executed
- Self-modifying code: Programs can modify their own instructions (rarely done today due to security)
Importance:
- Enables general-purpose computing
- Computers can run different programs without hardware changes
- Operating systems can load and manage multiple programs
- Forms the basis of virtually all modern computers
Von Neumann Bottleneck
Definition: The limitation on throughput caused by the single bus between CPU and memory.
Problem:
- Both instructions and data use same bus
- CPU often waits for data while fetching instructions, or vice versa
- Memory access speed limits overall performance
Solutions:
- Cache memory (stores frequently used data closer to CPU)
- Harvard architecture (separate instruction and data buses – used in some embedded systems)
- Wider buses
- Faster memory technologies
Basic Von Neumann Block Diagram
┌─────────────────┐
│ │
│ CPU │
│ │
│ ┌───────────┐ │
│ │ CU │ │
│ └───────────┘ │
│ ┌───────────┐ │
│ │ ALU │ │
│ └───────────┘ │
│ ┌───────────┐ │
│ │ Registers │ │
│ └───────────┘ │
└────────┬────────┘
│
┌──────────────┼──────────────┐
│ │ │
Address Bus Data Bus Control Bus
│ │ │
↓ ↓ ↓
┌─────────────────┐
│ │
│ Memory │
│ (Instructions │
│ and Data) │
└─────────────────┘
CPU Registers
General Purpose vs Special Purpose Registers
General Purpose Registers:
- Can be used for any temporary storage by programmer/compiler
- Hold intermediate results, variables, addresses
- Number varies by architecture (typically 8-32)
- Examples: R0, R1, R2… in ARM; EAX, EBX, ECX in x86
Special Purpose Registers:
- Designed for specific functions
- Used by CPU automatically during fetch-execute cycle
- Programmer may access some indirectly
- Critical for CPU operation
Special Purpose Registers
Program Counter (PC)
Purpose: Holds the memory address of the next instruction to be executed.
Operation:
- During fetch, contents copied to MAR
- After instruction fetch, PC increments to point to next instruction
- For jumps/branches, PC loaded with new address
- Size determines maximum addressable memory (e.g., 32-bit PC can address up to 4GB)
Register Transfer Notation: PC ← address of next instruction
Memory Address Register (MAR)
Purpose: Holds the address of memory location to be read from or written to.
Operation:
- Connected directly to address bus
- Before memory access, address placed here
- Used for both instruction fetches and data accesses
Register Transfer Notation: MAR ← address
Memory Data Register (MDR) / Memory Buffer Register (MBR)
Purpose: Holds data being transferred to or from memory.
Operation:
- For read: Receives data from memory via data bus
- For write: Holds data to be sent to memory
- Acts as buffer between CPU and memory
Register Transfer Notation: MDR ← data from memory OR MDR ← data from CPU
Accumulator (ACC)
Purpose: Stores results of arithmetic and logic operations.
Operation:
- Default destination for ALU operations
- One operand often comes from ACC
- Results placed back in ACC
- In simple CPUs, the main working register
Register Transfer Notation: ACC ← result of ALU operation
Index Register (IX)
Purpose: Used for indexed addressing modes; holds offset for array access.
Operation:
- Contents added to base address to form effective address
- Useful for accessing arrays and data structures
- Can be auto-incremented/decremented for sequential access
Register Transfer Notation: Effective Address = base address + IX
Current Instruction Register (CIR)
Purpose: Holds the current instruction being executed.
Operation:
- During fetch, instruction copied from MDR to CIR
- Held here while being decoded and executed
- Contains opcode and operand fields
Structure:
CIR = [Opcode][Operand/Address]
↑ ↑
(to CU) (to MAR or ALU)
Status Register / Flags Register / Condition Code Register
Purpose: Stores status information about the result of the last operation.
Common Flags:
| Flag | Name | Description |
|---|---|---|
| Z | Zero Flag | Set if result was zero |
| N | Negative Flag | Set if result was negative |
| C | Carry Flag | Set if operation produced a carry out |
| V | Overflow Flag | Set if arithmetic overflow occurred |
| I | Interrupt Flag | Enables/disables interrupts |
Operation:
- Updated automatically by ALU after each operation
- Used by conditional branch instructions
- Example: “Branch if equal” checks Z flag
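The flag-setting behaviour above can be sketched in Python for a hypothetical 8-bit ALU (the function name, flag dictionary, and 8-bit word size are illustrative assumptions):

```python
def add_with_flags(a, b, bits=8):
    """Add two register values and return (result, flags) for an 8-bit ALU."""
    mask = (1 << bits) - 1
    raw = a + b
    result = raw & mask
    sign = 1 << (bits - 1)
    flags = {
        "Z": result == 0,            # Zero: result is all zeros
        "N": bool(result & sign),    # Negative: top bit of the result is set
        "C": raw > mask,             # Carry: unsigned overflow out of the top bit
        # Overflow: operands share a sign but the result's sign differs
        "V": bool(~(a ^ b) & (a ^ result) & sign),
    }
    return result, flags

result, flags = add_with_flags(0x7F, 0x01)  # 127 + 1 in 8-bit two's complement
# result = 0x80; N and V set (signed overflow), C and Z clear
```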
CPU Components
Arithmetic and Logic Unit (ALU)
Purpose: Performs all arithmetic and logical operations.
Operations performed:
| Category | Examples |
|---|---|
| Arithmetic | ADD, SUBTRACT, MULTIPLY, DIVIDE, INCREMENT, DECREMENT |
| Logical | AND, OR, NOT, XOR |
| Comparison | COMPARE, TEST |
| Shift/Rotate | Logical shift, Arithmetic shift, Rotate |
Operation:
- Receives operands from registers (usually ACC and another register/memory)
- Performs operation specified by Control Unit
- Stores result in ACC (or other destination)
- Updates Status Register flags
Control Unit (CU)
Purpose: Coordinates all activities of the CPU.
Functions:
- Instruction decoding: Interprets opcode in CIR
- Control signal generation: Creates timing and control signals for all components
- Sequencing: Manages fetch-execute cycle
- Interrupt handling: Responds to interrupts
Operation:
- Receives instruction from CIR
- Decodes to determine required actions
- Generates appropriate control signals
- Sends signals to ALU, registers, buses, memory
System Clock
Purpose: Provides timing signals to synchronise CPU operations.
Characteristics:
- Generates regular pulses
- Clock speed measured in Hz (typically GHz)
- One clock cycle = time between two successive pulses
- Most instructions take multiple clock cycles
Clock speed factors:
- Higher speed = potentially more instructions per second
- Limited by physical constraints (heat, power, signal propagation)
Relationship to performance:
Performance ∝ Clock Speed × Instructions per Cycle (IPC)
Immediate Access Store (IAS)
Purpose: Primary memory that CPU can access directly (RAM).
Characteristics:
- Holds currently executing programs and their data
- Directly addressable by CPU
- Much faster than secondary storage
- Volatile (lost when power off)
- Organised as addressable locations (each holding a word/byte)
Relationship with CPU:
- CPU reads instructions from IAS via MAR/MDR
- CPU reads/writes data via MAR/MDR
- Performance limited by IAS speed (von Neumann bottleneck)
Buses
System Buses
Definition: Communication pathways connecting CPU components and memory.
Address Bus
Purpose: Carries memory addresses from CPU to memory/I/O.
Characteristics:
- Unidirectional: Address flows only from CPU to memory
- Width determines maximum addressable memory
- n-bit address bus can address up to 2ⁿ memory locations
- Example: 32-bit address bus → 4GB maximum
Operation:
CPU (MAR) --address--> Address Bus --> Memory
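The 2ⁿ relationship can be checked directly (assuming one byte per addressable location, as in the 32-bit example; the function name is illustrative):

```python
def addressable_bytes(address_bus_width):
    """An n-bit address bus can address 2**n locations (one byte each here)."""
    return 2 ** address_bus_width

assert addressable_bytes(16) == 64 * 1024       # 64 KiB
assert addressable_bytes(32) == 4 * 1024 ** 3   # 4 GiB, matching the example above
```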
Data Bus
Purpose: Carries data between CPU, memory, and I/O.
Characteristics:
- Bidirectional: Data can flow both directions
- Width affects data transfer rate
- Wider bus = more bits transferred per cycle
- Example: 64-bit data bus transfers 8 bytes per cycle
Operation:
CPU <--data--> Data Bus <--data--> Memory
Control Bus
Purpose: Carries control signals between CPU and other components.
Characteristics:
- Bidirectional: Various signals in both directions
- Width varies by architecture (typically 8-16 lines)
- Each line has specific function
Common control signals:
| Signal | Direction | Purpose |
|---|---|---|
| Read | CPU → Memory | Request memory read |
| Write | CPU → Memory | Request memory write |
| Clock | CPU → All | Synchronisation |
| Reset | CPU → All | Reset all components |
| Interrupt Request | I/O → CPU | Device needs attention |
| Interrupt Acknowledge | CPU → I/O | Interrupt received |
| Bus Request | Device → CPU | Request bus access |
| Bus Grant | CPU → Device | Bus access granted |
Bus Interaction Example: Memory Read
1. CPU places address in MAR
2. CPU sets Read line on Control Bus
3. Address placed on Address Bus
4. Memory locates address
5. Memory places data on Data Bus
6. CPU reads data into MDR
7. CPU clears control signals
Performance Factors
Processor Type and Number of Cores
Single-core processor:
- Executes one instruction at a time
- Performance limited by clock speed and efficiency
- Simple to program for
Multi-core processor:
- Multiple processing units on one chip
- Can execute multiple instructions simultaneously
- Requires parallel programming for full benefit
- Different types:
- Dual-core: 2 cores
- Quad-core: 4 cores
- Octa-core: 8 cores
- Many-core: 16+ cores
Performance considerations:
- Not all tasks can be parallelised
- Overhead of coordinating cores
- Amdahl’s Law: Speedup limited by sequential portion
- Power/heat management
Amdahl’s Law:
Speedup = 1 / ((1-P) + P/N)
where P = parallel portion, N = number of cores
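Amdahl's Law as given can be expressed directly (a minimal sketch; the function name is mine):

```python
def amdahl_speedup(p, n):
    """Speedup = 1 / ((1 - P) + P / N): P = parallel fraction, N = cores."""
    return 1.0 / ((1.0 - p) + p / n)

# A program that is 90% parallelisable gains under 7x even on 16 cores:
print(round(amdahl_speedup(0.9, 16), 2))        # 6.4
print(round(amdahl_speedup(0.9, 1_000_000), 2)) # ≈ 10: the sequential 10% caps it
```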
Bus Width
Address bus width:
- Determines maximum addressable memory
- Wider = more RAM supported
- Example: 32-bit → 4GB, 64-bit → 16EB
Data bus width:
- Determines data transfer per cycle
- Wider = faster data transfer
- Common widths: 8, 16, 32, 64 bits
- Must match word size for optimal performance
Impact on performance:
- Narrow bus = more cycles to transfer data
- Bottleneck if bus slower than CPU
- Modern CPUs use 64-bit data buses
Clock Speed
Definition: Number of clock cycles per second (Hz).
Common speeds:
- Older CPUs: MHz (millions of cycles/second)
- Modern CPUs: GHz (billions of cycles/second)
- Typical: 2-5 GHz
Relationship to performance:
- Higher clock = potentially more instructions/second
- But not all instructions complete in one cycle
- Different instructions take different cycles (CPI)
CPI (Cycles Per Instruction):
CPU Time = Instructions × CPI × Clock Cycle Time
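The formula can be sketched with illustrative numbers:

```python
def cpu_time(instructions, cpi, clock_hz):
    """CPU Time = Instructions × CPI × Clock Cycle Time, in seconds."""
    cycle_time = 1.0 / clock_hz
    return instructions * cpi * cycle_time

# 1 billion instructions, average CPI of 2, on a 2 GHz clock:
t = cpu_time(1_000_000_000, 2.0, 2_000_000_000)  # ≈ 1.0 second
```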
Limitations:
- Heat generation increases with clock speed
- Power consumption increases (often exponentially)
- Physical limits to how fast signals can travel
- Diminishing returns (architectural techniques such as pipelining often gain more than raw clock speed)
Cache Memory
Definition: Small, fast memory between CPU and main memory.
Levels of cache:
| Level | Location | Size | Speed | Purpose |
|---|---|---|---|---|
| L1 Cache | Inside CPU core | 16-64KB | Fastest (1-2 cycles) | Instructions & data |
| L2 Cache | Between CPU and RAM | 256KB-1MB | Fast (3-10 cycles) | Data from RAM |
| L3 Cache | Shared between cores | 2-32MB | Moderate (10-30 cycles) | Shared data |
How cache works:
- CPU checks cache first for required data
- If present (cache hit) → fast access
- If not present (cache miss) → fetch from slower RAM
- Frequently used data kept in cache
Cache principles:
- Temporal locality: Recently accessed data likely to be accessed again
- Spatial locality: Data near recently accessed data likely to be accessed
Cache performance:
Average Access Time = Hit Time + (Miss Rate × Miss Penalty)
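The same formula with illustrative cache figures (hit time, miss rate, and penalty are assumptions, not figures from a real CPU):

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Average Access Time = Hit Time + (Miss Rate × Miss Penalty)."""
    return hit_time + miss_rate * miss_penalty

# 2-cycle hits, 5% miss rate, 100-cycle penalty to main memory:
print(average_access_time(2, 0.05, 100))  # 7.0 cycles
```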
Impact:
- Good cache design significantly improves performance
- Larger cache = higher hit rate but slower access
- Multi-level cache balances speed and size
Ports for Peripheral Devices
Universal Serial Bus (USB)
Purpose: Universal connection standard for peripherals.
Characteristics:
- Hot-swappable (connect/disconnect without power off)
- Plug and play (automatic driver installation)
- Power delivery (up to 100W with USB-C)
- Data transfer + power in one cable
USB versions:
| Version | Speed | Year | Notes |
|---|---|---|---|
| USB 1.1 | 12 Mbps | 1998 | Low Speed/Full Speed |
| USB 2.0 | 480 Mbps | 2000 | Hi-Speed |
| USB 3.0 | 5 Gbps | 2008 | SuperSpeed (blue ports) |
| USB 3.1 | 10 Gbps | 2013 | SuperSpeed+ |
| USB 3.2 | 20 Gbps | 2017 | Dual-lane |
| USB4 | 40 Gbps | 2019 | Based on Thunderbolt 3 |
Connector types:
- USB-A (traditional rectangular)
- USB-B (printer/square)
- USB-C (reversible, modern)
- Micro-USB (older phones/devices)
- Mini-USB (older devices)
High Definition Multimedia Interface (HDMI)
Purpose: Digital video and audio interface.
Characteristics:
- Transmits uncompressed video and compressed/uncompressed audio
- Single cable for both video and audio
- HDCP copy protection support
- Consumer electronics standard
HDMI versions:
| Version | Max Resolution | Features |
|---|---|---|
| HDMI 1.4 | 4K @ 30Hz | Ethernet channel, 3D |
| HDMI 2.0 | 4K @ 60Hz | 32 audio channels |
| HDMI 2.1 | 8K @ 60Hz, 4K @ 120Hz | Dynamic HDR, eARC |
Connector types:
- Type A (Standard) – TVs, monitors
- Type C (Mini) – Cameras, tablets
- Type D (Micro) – Smartphones
Video Graphics Array (VGA)
Purpose: Analogue video connector (legacy standard).
Characteristics:
- Analogue signal (unlike digital HDMI/DVI)
- 15-pin DE-15 connector
- Susceptible to interference and signal degradation
- Being phased out in favour of digital connections
- No audio support
Limitations:
- Lower maximum resolution than digital
- Analogue signal conversion quality issues
- Bulky connector
- Not designed for hot-plugging (though it often works in practice)
Common uses:
- Legacy monitors and projectors
- Older computer systems
- Backup connector on some equipment
Port Comparison
| Feature | USB | HDMI | VGA |
|---|---|---|---|
| Signal type | Digital | Digital | Analogue |
| Audio support | No (separate) | Yes | No |
| Video quality | N/A | Excellent | Good (limited) |
| Hot-pluggable | Yes | Yes | No |
| Max resolution | N/A | 8K+ | 2048×1536 |
| Power delivery | Yes (USB-C) | No | No |
| Primary use | Peripherals | Video/audio | Legacy video |
Fetch-Execute Cycle
Basic Stages
The fetch-execute cycle is the fundamental operation of a CPU, repeated for each instruction.
Stage 1: Fetch
Purpose: Get the next instruction from memory.
Steps:
- MAR ← PC (Address of next instruction sent to memory)
- Control bus signals memory read
- Memory places instruction on data bus
- MDR ← instruction from data bus
- CIR ← MDR (Instruction transferred to Current Instruction Register)
- PC ← PC + 1 (Increment to point to next instruction)
Register Transfer Notation:
MAR ← PC
[Control bus: Read]
MDR ← [[MAR]]
CIR ← MDR
PC ← PC + 1
Stage 2: Decode
Purpose: Interpret the instruction to determine what needs to be done.
Steps:
- Control Unit examines opcode in CIR
- Identifies instruction type (ADD, LOAD, etc.)
- Identifies addressing mode
- Determines what operands are needed
- Generates control signals for execute stage
Register Transfer Notation:
[Control Unit decodes CIR]
[Control signals generated]
Stage 3: Execute
Purpose: Perform the actual operation.
Steps (vary by instruction type):
For ADD instruction (direct addressing):
- MAR ← address from CIR operand
- MDR ← [[MAR]] (fetch operand from memory)
- ACC ← ACC + MDR (ALU performs addition)
For LOAD instruction (immediate):
- ACC ← operand from CIR
For STORE instruction:
- MAR ← address from CIR
- MDR ← ACC
- Write to memory
Register Transfer Notation (example ADD):
MAR ← [address part of CIR]
MDR ← [[MAR]]
ACC ← ACC + MDR
Complete Cycle Example
Instruction: ADD 100 (Add contents of memory address 100 to ACC)
Initial state:
- PC = 200 (instruction at address 200)
- ACC = 5
- Memory[100] = 3
- Memory[200] = ADD 100
Fetch:
MAR ← PC (200)
MDR ← [200] (ADD 100)
CIR ← MDR (ADD 100)
PC ← PC + 1 (201)
Decode:
CU decodes opcode = ADD
Operand = 100 (direct addressing)
Control signals for ADD
Execute:
MAR ← 100 (from CIR)
MDR ← [100] (3)
ACC ← ACC + MDR (5 + 3 = 8)
Result: ACC = 8
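The complete cycle above can be replayed as a minimal Python sketch (the tuple encoding of the instruction is an illustration, not a real machine format):

```python
# Memory holds data at 100 and the instruction at 200, as in the example.
memory = {100: 3, 200: ("ADD", 100)}
pc, acc = 200, 5

# Fetch
mar = pc
mdr = memory[mar]
cir = mdr
pc = pc + 1

# Decode
opcode, operand = cir

# Execute (direct addressing: the operand is a memory address)
if opcode == "ADD":
    mar = operand
    mdr = memory[mar]
    acc = acc + mdr

print(pc, acc)  # 201 8
```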
Interrupts
Purpose of Interrupts
Definition: A signal to the CPU that an event needs immediate attention.
Why interrupts are needed:
- Allow CPU to respond to urgent events
- Enable multitasking
- Handle I/O efficiently (without polling)
- Manage errors and exceptions
- Support real-time processing
Possible Causes of Interrupts
| Type | Examples | Description |
|---|---|---|
| Hardware I/O | Keyboard press, mouse move, disk ready | External device needs attention |
| Timer | Time slice expired, real-time clock | Scheduled event |
| Program error | Division by zero, invalid address | Software error |
| Hardware failure | Power failure, memory error | System problem |
| Software interrupt | System call, breakpoint | Program requests OS service |
Applications of Interrupts
Multitasking:
- Timer interrupt switches between processes
- Each process gets time slice (timeslicing)
I/O handling:
- CPU starts I/O operation
- Device works independently
- Device interrupts when done
- CPU handles result without wasting cycles polling
Real-time systems:
- Guaranteed response to critical events
- Priority-based interrupt handling
Error handling:
- Trap errors immediately
- Prevent system crash
- Log diagnostic information
Interrupt Detection During Fetch-Execute Cycle
When CPU checks for interrupts:
- At the end of each fetch-execute cycle
- Between instruction execution
- Before fetching next instruction
Process:
- Complete current instruction
- Check interrupt line on control bus
- If interrupt pending, handle it
- Otherwise, fetch next instruction
Interrupt Handling Process
Step 1: Interrupt occurs
- Device sends signal on interrupt request line
- CPU detects interrupt at cycle end
Step 2: CPU saves current state
- Completes current instruction (if possible)
- Saves PC and status register to stack (or known location)
- This saved state allows resumption later
Step 3: Identify interrupt source
- CPU checks interrupt vector table
- Each device has unique interrupt number
- Vector table contains addresses of ISRs
Step 4: Load Interrupt Service Routine (ISR) address
- PC ← address of appropriate ISR
- This is a jump to interrupt handler
Step 5: Execute ISR
- CPU runs special program to handle the interrupt
- May disable other interrupts (or allow prioritised ones)
- Communicates with interrupting device
- Performs necessary processing
Step 6: Return from interrupt
- ISR executes special return instruction (IRET)
- CPU restores saved state (PC, status register)
- Execution resumes where it left off
Interrupt Service Routine (ISR)
Definition: Special program that handles a specific interrupt type.
ISR characteristics:
- Short and fast (should not block other interrupts)
- Saves any registers it uses
- Re-enables interrupts when safe
- Ends with return-from-interrupt instruction
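The check-at-end-of-cycle behaviour can be sketched as follows (the vector table contents, device numbers, and string "instructions" are all illustrative):

```python
vector_table = {0: "isr_divide_by_zero", 1: "isr_timer", 2: "isr_keyboard"}
pending_interrupts = []   # devices raise an interrupt by appending their number
log = []

def run(program):
    pc = 0
    while pc < len(program):
        log.append(program[pc])   # fetch and "execute" one instruction
        pc += 1
        while pending_interrupts:             # check at the end of each cycle
            device = pending_interrupts.pop(0)
            saved_pc = pc                     # save state before the ISR
            log.append("enter " + vector_table[device])   # jump to the ISR
            pc = saved_pc                     # return from interrupt: restore PC

pending_interrupts.append(1)      # a timer interrupt arrives before we start
run(["LDM #5", "STO 100", "OUT"])
# log is now ['LDM #5', 'enter isr_timer', 'STO 100', 'OUT']
```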
Multiple Interrupts
Priority handling:
- Some interrupts more important than others
- Higher priority can interrupt lower priority ISRs
- Example priorities: Power failure > hardware error > I/O > timer
Nested interrupts:
- Interrupt A (low priority) starts
- Interrupt B (high priority) occurs
- CPU saves A’s state
- CPU handles B completely
- CPU returns to A
Interrupt Vectors and Vector Table
Interrupt Vector Table:
- Array of ISR addresses
- Located in low memory
- Indexed by interrupt number
| Address | Contents |
|---|---|
| 0000 | ISR0 address (division by zero) |
| 0004 | ISR1 address (timer) |
| 0008 | ISR2 address (keyboard) |
| ... | ... |
Interrupts vs Polling
| Aspect | Interrupts | Polling |
|---|---|---|
| CPU usage | Efficient (only act when needed) | Wasted (constant checking) |
| Response time | Fast (immediate attention) | Variable (depends on poll frequency) |
| Complexity | More complex (hardware + software) | Simple |
| Multiple devices | Easy (priorities, vectors) | Must poll each in turn |
| Real-time | Good (predictable response) | Poor (worst-case long) |
4.2 Assembly Language
Assembly Language vs Machine Code
Machine Code
- Binary representation of instructions
- Directly executable by CPU
- Difficult for humans to read/write
- Architecture-specific
Example: 10110000 01100001 (may mean “load 97 into AL”)
Assembly Language
- Human-readable mnemonics for machine instructions
- Typically a one-to-one mapping between mnemonics and machine instructions
- Requires assembler to convert to machine code
- Still architecture-specific
Example: MOV AL, 61h (same instruction)
Relationship
Assembly → [Assembler] → Machine Code → [CPU] → Execution
MOV AL, 61h 10110000 01100001 Loads 97 into AL
Advantages of Assembly over Machine Code:
- Easier to remember mnemonics (MOV vs 10110000)
- Labels for memory addresses
- Comments can be added
- Less error-prone
Advantages of Assembly over High-Level Languages:
- Direct hardware control
- More efficient (if written well)
- Access to special instructions
- Required for some system programming
Two-Pass Assembler
Why Two Passes?
Problem: Forward references (using a label before it’s defined)
Example:
JMP DONE ; Jump to DONE (not defined yet)
...
DONE: MOV AL, 0 ; DONE defined here
The assembler doesn’t know the address of DONE when processing the JMP instruction.
Pass 1: Build Symbol Table
Purpose: Assign addresses to all labels.
Process:
- Initialize location counter to 0 (or start address)
- Read each line of source code
- If line has a label, add to symbol table with current location counter value
- Calculate instruction size and add to location counter
- Ignore operands (only need to know label definitions)
Example:
| Line | Label | Opcode | Operand | Location Counter | Symbol Table |
|---|---|---|---|---|---|
| 1 | START | LDM | #5 | 0 | START=0 |
| 2 | | ADD | COUNT | 1 | |
| 3 | LOOP | DEC | ACC | 3 | LOOP=3 |
| 4 | | JPN | DONE | 4 | |
| 5 | | JMP | LOOP | 5 | |
| 6 | DONE | OUT | | 6 | DONE=6 |
| 7 | | END | | 7 | |
Pass 2: Generate Machine Code
Purpose: Generate actual machine code using symbol table.
Process:
- Reset location counter to 0
- Read each line again
- Look up any label operands in symbol table
- Convert mnemonics to opcodes
- Generate machine code
- Output object code
Example (continuing from above):
| Label | Opcode | Operand | Machine Code | Notes |
|---|---|---|---|---|
| START | LDM | #5 | 00000101 | Immediate value 5 |
| | ADD | COUNT | 00100010 | COUNT address known |
| LOOP | DEC | ACC | 10000001 | |
| | JPN | DONE | 01100110 | DONE=6 resolved |
| | JMP | LOOP | 01100011 | LOOP=3 resolved |
| DONE | OUT | | 11110000 | |
| | END | | | No code generated |
Symbol Table Example
| Label | Address |
|---|---|
| START | 0 |
| LOOP | 3 |
| DONE | 6 |
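The two passes can be sketched in Python (the tuple source format and the one-location-per-instruction assumption are illustrative):

```python
def pass_one(program):
    """Pass 1: walk the source, assigning each label the current location counter."""
    symbol_table, location = {}, 0
    for label, opcode, operand in program:
        if label:
            symbol_table[label] = location
        location += 1        # assume every instruction occupies one location
    return symbol_table

def pass_two(program, symbol_table):
    """Pass 2: re-read the source, resolving label operands via the symbol table."""
    resolved = []
    for label, opcode, operand in program:
        if operand in symbol_table:
            operand = symbol_table[operand]   # forward references now resolvable
        resolved.append((opcode, operand))
    return resolved

source = [
    (None,   "JMP", "DONE"),   # forward reference: DONE not yet defined
    (None,   "LDM", "#5"),
    ("DONE", "MOV", "#0"),
]
table = pass_one(source)        # {'DONE': 2}
code = pass_two(source, table)  # [('JMP', 2), ('LDM', '#5'), ('MOV', '#0')]
```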
Assembly Language Instruction Groups
Data Movement Instructions
Purpose: Move data between registers, memory, and immediate values.
| Instruction | Example | Description |
|---|---|---|
| LDM | LDM #5 | Load immediate 5 to ACC |
| LDD | LDD 100 | Load from memory address 100 |
| LDI | LDI 200 | Indirect: Use address at 200 to find data |
| LDX | LDX 300 | Indexed: 300 + IX = effective address |
| LDR | LDR #10 | Load immediate 10 to Index Register |
| MOV | MOV IX | Move ACC to IX |
| STO | STO 400 | Store ACC to address 400 |
Input/Output Instructions
Purpose: Communicate with peripheral devices.
| Instruction | Example | Description |
|---|---|---|
| IN | IN | Read character from keyboard to ACC |
| OUT | OUT | Output character in ACC to screen |
Arithmetic Operations
Purpose: Perform mathematical calculations.
| Instruction | Example | Description |
|---|---|---|
| ADD | ADD 100 | Add memory[100] to ACC |
| ADD #n | ADD #5 | Add immediate 5 to ACC |
| SUB | SUB 100 | Subtract memory[100] from ACC |
| SUB #n | SUB #5 | Subtract immediate 5 from ACC |
| INC | INC ACC | Increment ACC by 1 |
| DEC | DEC IX | Decrement IX by 1 |
Compare Instructions
Purpose: Compare values and set flags for conditional jumps.
| Instruction | Example | Description |
|---|---|---|
| CMP | CMP 100 | Compare ACC with memory[100] |
| CMP #n | CMP #5 | Compare ACC with immediate 5 |
| CMI | CMI 200 | Indirect: Use address at 200 to find compare value |
Unconditional Jump
Purpose: Always transfer control to another address.
| Instruction | Example | Description |
|---|---|---|
| JMP | JMP 500 | Jump to address 500 |
Conditional Jump Instructions
Purpose: Transfer control only if previous compare was true/false.
| Instruction | Example | Description |
|---|---|---|
| JPE | JPE 500 | Jump if compare was equal/true |
| JPN | JPN 500 | Jump if compare was not equal/false |
Other Instructions
| Instruction | Example | Description |
|---|---|---|
| END | END | Return control to OS |
Addressing Modes
Immediate Addressing
Definition: Operand is the actual value to be used.
Syntax: LDM #5
Operation: ACC ← 5
Advantages:
- No memory access (fast)
- Simple
Disadvantages:
- Constant only (cannot use variables)
- Limited by instruction size
Use case: Loading constants, initialising registers
Direct Addressing
Definition: Operand is the memory address where the data is located.
Syntax: LDD 100
Operation: ACC ← [100] (contents of address 100)
Advantages:
- Can access any memory location
- Simple addressing
Disadvantages:
- Address fixed at compile time
- Limited address range (if address field small)
Use case: Accessing global variables, fixed data structures
Indirect Addressing
Definition: Operand is the address of a memory location that contains the actual address of the data.
Syntax: LDI 200
Operation:
temp ← [200] (get address from 200)
ACC ← [temp] (get data from that address)
Advantages:
- Can implement pointers
- Dynamic addressing
Disadvantages:
- Two memory accesses (slower)
- More complex
Use case: Pointers, dynamic data structures, arrays
Indexed Addressing
Definition: Effective address = base address + contents of index register.
Syntax: LDX 300 (assume IX = 10)
Operation: Effective address = 300 + 10 = 310
ACC ← [310]
Advantages:
- Efficient for arrays
- Can access sequential memory locations by incrementing IX
Disadvantages:
- Requires index register management
- One extra calculation
Use case: Arrays, tables, buffers
Relative Addressing
Definition: Effective address = current PC + offset.
Syntax: JMP 5 (offset relative to PC; commonly used for branches)
Operation: PC ← PC + 5
Advantages:
- Position-independent code
- Short offsets save space
Disadvantages:
- Limited range
- Must know current PC
Use case: Branches, jumps within same code section
Addressing Mode Comparison
| Mode | Effective Address | Memory Accesses | Flexibility |
|---|---|---|---|
| Immediate | N/A (value in instruction) | 0 | Low |
| Direct | Address in instruction | 1 | Medium |
| Indirect | Address from memory | 2 | High |
| Indexed | Base + IX | 1 | High |
| Relative | PC + offset | 0-1 | Medium |
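The effective-address rules in the table can be expressed as a small sketch (the memory contents and IX value are illustrative):

```python
memory = {100: 42, 200: 100, 300: 0, 310: 7}
IX = 10

def fetch_operand(mode, operand):
    if mode == "immediate":
        return operand                   # value is in the instruction itself
    if mode == "direct":
        return memory[operand]           # one memory access
    if mode == "indirect":
        return memory[memory[operand]]   # two memory accesses
    if mode == "indexed":
        return memory[operand + IX]      # effective address = base + IX
    raise ValueError(mode)

assert fetch_operand("immediate", 5) == 5
assert fetch_operand("direct", 100) == 42
assert fetch_operand("indirect", 200) == 42   # 200 → 100 → 42
assert fetch_operand("indexed", 300) == 7     # 300 + 10 = 310 → 7
```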
Tracing Assembly Programs
Example Program 1: Simple Addition
LDM #5 ; Load 5 into ACC
STO 100 ; Store ACC at address 100
LDM #3 ; Load 3 into ACC
ADD 100 ; Add contents of address 100 to ACC
OUT ; Output result (should be 8)
END
Trace:
| Step | PC | ACC | IX | Memory[100] | Action |
|---|---|---|---|---|---|
| Start | 0 | ? | 0 | ? | Initial state |
| LDM #5 | 1 | 5 | 0 | ? | Load 5 to ACC |
| STO 100 | 2 | 5 | 0 | 5 | Store ACC to 100 |
| LDM #3 | 3 | 3 | 0 | 5 | Load 3 to ACC |
| ADD 100 | 4 | 8 | 0 | 5 | ACC ← 3 + 5 |
| OUT | 5 | 8 | 0 | 5 | Output ‘8’ |
| END | – | 8 | 0 | 5 | Program ends |
Example Program 2: Loop
LDM #0 ; Initialise counter to 0
STO 200 ; Store at address 200 (counter)
LOOP: LDD 200 ; Load counter
ADD #1 ; Increment
STO 200 ; Store back
CMP #5 ; Compare with 5
JPN LOOP ; If not 5, loop again
OUT ; Output final count (should be 5)
END
Partial Trace:
| Step | PC | ACC | Memory[200] | Action |
|---|---|---|---|---|
| Start | 0 | ? | ? | Initial state |
| LDM #0 | 1 | 0 | ? | |
| STO 200 | 2 | 0 | 0 | |
| LOOP: LDD 200 | 3 | 0 | 0 | |
| ADD #1 | 4 | 1 | 0 | |
| STO 200 | 5 | 1 | 1 | |
| CMP #5 | 6 | 1 | 1 | Compare (not equal) |
| JPN LOOP | 2 | 1 | 1 | Jump to LOOP |
| … repeats … | | | | |
| (when ACC=5) | 6 | 5 | 5 | Compare (equal) |
| JPN LOOP | 7 | 5 | 5 | Not taken (equal) |
| OUT | 8 | 5 | 5 | Output ‘5’ |
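The loop program can be run through a minimal interpreter to confirm the trace (the tuple encoding and the ADDI/CMPI names for the immediate forms are my own):

```python
program = [
    ("LDM", 0),      # 0: ACC ← 0
    ("STO", 200),    # 1: memory[200] ← ACC
    ("LDD", 200),    # 2: LOOP: ACC ← memory[200]
    ("ADDI", 1),     # 3: ACC ← ACC + 1
    ("STO", 200),    # 4
    ("CMPI", 5),     # 5: equal flag ← (ACC == 5)
    ("JPN", 2),      # 6: jump to LOOP if not equal
    ("OUT", None),   # 7
    ("END", None),   # 8
]

memory, acc, pc, equal, output = {}, 0, 0, False, []
while True:
    op, arg = program[pc]
    pc += 1
    if op == "LDM":    acc = arg
    elif op == "LDD":  acc = memory[arg]
    elif op == "STO":  memory[arg] = acc
    elif op == "ADDI": acc += arg
    elif op == "CMPI": equal = (acc == arg)
    elif op == "JPN":
        if not equal:  # jump not taken when the compare was equal
            pc = arg
    elif op == "OUT":  output.append(acc)
    elif op == "END":  break

print(output, memory[200])  # [5] 5
```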
4.3 Bit Manipulation
Binary Shifts
Logical Shifts
Logical Shift Left (LSL)
Operation: Shift bits left, zeros introduced on right.
Example: LSL #2 (shift left 2 places)
```
Before: ACC = 00010111 (23)
After:  ACC = 01011100 (92)
```
Each place shifted left multiplies the value by two, so LSL #2 multiplies by four (23 × 4 = 92).