NES CPU: Ricoh 2A03 (6502 Core)

Document Version: 1.0.0
Last Updated: 2025-12-18



Overview

The NES CPU is a Ricoh RP2A03 (NTSC) or RP2A07 (PAL), a custom chip built around a MOS Technology 6502 core. Key differences from the stock 6502:

  • Decimal mode disabled: The D flag can still be set and cleared, but ADC and SBC ignore it
  • Audio Processing Unit (APU) integrated on the same die
  • DMA controller for OAM sprite data transfers
  • Clock frequency: 1.789773 MHz (NTSC) or 1.662607 MHz (PAL)

Clock Derivation

NTSC Master Clock: 21.477272 MHz
CPU Clock = Master ÷ 12 = 1.789773 MHz (~559 ns per cycle)

PAL Master Clock: 26.601712 MHz
CPU Clock = Master ÷ 16 = 1.662607 MHz (~601 ns per cycle)
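
The divisions above can be checked with a few lines of Rust. This is a minimal sketch; the constant and function names are illustrative and not part of any existing codebase.

```rust
/// Master clock frequencies in Hz, taken from the figures above.
const NTSC_MASTER_HZ: f64 = 21_477_272.0;
const PAL_MASTER_HZ: f64 = 26_601_712.0;

/// CPU clock in Hz for a given master clock and divider (12 for NTSC, 16 for PAL).
fn cpu_clock_hz(master_hz: f64, divider: f64) -> f64 {
    master_hz / divider
}

/// Duration of one CPU cycle in nanoseconds.
fn cycle_ns(cpu_hz: f64) -> f64 {
    1.0e9 / cpu_hz
}
```

Calling cycle_ns(cpu_clock_hz(NTSC_MASTER_HZ, 12.0)) gives roughly 559 ns, matching the figure quoted above.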

Technical Specifications

Specification        Value
Architecture         8-bit, little-endian
Data Bus             8-bit
Address Bus          16-bit (64KB address space)
Clock Speed (NTSC)   1.789773 MHz
Clock Speed (PAL)    1.662607 MHz
Instruction Set      56 official instructions (151 opcodes)
Addressing Modes     13 modes
Stack                256 bytes ($0100-$01FF), descending
Interrupts           NMI, IRQ, BRK, RESET

Cycles per Instruction

Instructions take 2-7 cycles:

  • 2 cycles: Register operations (INX, DEY, TAX, etc.)
  • 3-4 cycles: Memory read/write with zero page or absolute addressing (LDA, STA)
  • 5-7 cycles: Read-Modify-Write on memory (INC, DEC, ASL, ROL)
  • 7 cycles: Interrupts (NMI, IRQ, BRK)

Page Crossing Penalty: +1 cycle for some indexed addressing modes when crossing a 256-byte page boundary.


Registers

The 6502 has 6 registers, all 8-bit except the Program Counter (16-bit):

Accumulator (A)

pub a: u8
  • Primary register for arithmetic and logic operations
  • Holds one operand and receives results
  • Used with all ALU operations (ADC, SBC, AND, ORA, EOR, CMP)

Index Registers (X, Y)

pub x: u8
pub y: u8
  • Used for indexed addressing modes
  • Can be used as loop counters
  • X is the only register that can be copied to or from the stack pointer (TSX, TXS)
  • Y typically used for indirect indexed addressing

Stack Pointer (SP)

pub sp: u8
  • Points to next free location on stack
  • Stack located at $0100-$01FF (256 bytes)
  • Descending (decrements on push, increments on pull)
  • Full address = $0100 + SP
  • Initialized to $FD on power-up
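
The push and pull rules above can be sketched as follows. StackDemo is a hypothetical stand-in type that models only pages 0 and 1 of RAM; it is not the Cpu struct recommended later in this document.

```rust
/// Hypothetical minimal state for demonstrating stack access.
struct StackDemo {
    sp: u8,
    ram: [u8; 0x0200], // pages 0 and 1 only; the stack lives in page 1
}

impl StackDemo {
    fn new() -> Self {
        StackDemo { sp: 0xFD, ram: [0; 0x0200] }
    }

    /// Push: write at $0100 + SP, then decrement SP (wrapping within the page).
    fn push(&mut self, value: u8) {
        self.ram[0x0100 + self.sp as usize] = value;
        self.sp = self.sp.wrapping_sub(1);
    }

    /// Pull: increment SP first, then read from $0100 + SP.
    fn pull(&mut self) -> u8 {
        self.sp = self.sp.wrapping_add(1);
        self.ram[0x0100 + self.sp as usize]
    }
}
```

Using wrapping_sub and wrapping_add keeps SP inside page 1 even when a program overflows or underflows the stack.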

Program Counter (PC)

pub pc: u16
  • Points to next instruction to execute
  • Automatically incremented after instruction fetch
  • Modified by jumps, branches, subroutines, and interrupts
  • Initialized to value at $FFFC-$FFFD on RESET

Processor Status (P)

use bitflags::bitflags;

bitflags! {
    pub struct Status: u8 {
        const CARRY     = 0b0000_0001;  // C - Carry flag
        const ZERO      = 0b0000_0010;  // Z - Zero flag
        const INTERRUPT = 0b0000_0100;  // I - Interrupt disable
        const DECIMAL   = 0b0000_1000;  // D - Decimal mode (no effect on NES)
        const BREAK     = 0b0001_0000;  // B - Break command
        const UNUSED    = 0b0010_0000;  // - - Always 1
        const OVERFLOW  = 0b0100_0000;  // V - Overflow flag
        const NEGATIVE  = 0b1000_0000;  // N - Negative flag
    }
}

Flag Details

Carry (C):

  • Set when unsigned overflow occurs (addition) or no borrow needed (subtraction)
  • Consumed as input by ADC, SBC, ROL, ROR
  • Updated by ADC, SBC, CMP, CPX, CPY, ASL, LSR, ROL, ROR
  • Set by SEC, cleared by CLC

Zero (Z):

  • Set when result equals zero
  • Affected by most ALU, load, and transfer operations (stores do not change flags)

Interrupt Disable (I):

  • When set, IRQ interrupts are masked (NMI still occurs)
  • Set by SEI, cleared by CLI
  • Automatically set during interrupt handling

Decimal (D):

  • On stock 6502, enables Binary-Coded Decimal (BCD) mode
  • Has no effect on NES - hardware is removed, but flag still exists
  • Set by SED, cleared by CLD

Break (B):

  • Distinguishes BRK from IRQ/NMI in the status byte pushed to the stack
  • Pushed as 1 by BRK and PHP, as 0 by IRQ and NMI
  • Not a physical bit in the status register; it exists only in the pushed copy

Unused:

  • Always reads as 1
  • Pushed to stack as 1
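
One way to model the B and unused bits is to compute them only at push time. The sketch below works on a raw u8 rather than the Status type above, and the function names are hypothetical. Masking B on pull is a common emulator convention, assumed here rather than taken from this document.

```rust
const BREAK: u8 = 0b0001_0000;
const UNUSED: u8 = 0b0010_0000;

/// Byte pushed to the stack for a given status value.
/// `from_instruction` is true for BRK and PHP, false for IRQ and NMI.
fn status_for_push(p: u8, from_instruction: bool) -> u8 {
    let base = (p & !BREAK) | UNUSED;
    if from_instruction { base | BREAK } else { base }
}

/// Value kept in P after PLP or RTI under the assumed convention:
/// B is discarded and the unused bit stays 1.
fn status_from_pull(pulled: u8) -> u8 {
    (pulled & !BREAK) | UNUSED
}
```

With only the I flag set (P = $04), BRK or PHP pushes $34, while IRQ or NMI pushes $24.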

Overflow (V):

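  • Set when signed overflow occurs in addition/subtraction
  • Formula (ADC): (A^result) & (M^result) & 0x80; for SBC, apply the same formula with M inverted (M ^ 0xFF)
  • Used to detect when signed arithmetic result is incorrect
  • Also loaded from bit 6 of the operand by BIT
  • Cleared by CLV
  • Tested by BVC/BVS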
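
The carry and overflow rules can be sketched as free functions that return (result, carry, overflow). The function names are illustrative. Treating SBC as ADC with the operand's bits inverted is a standard identity for binary-mode 6502 arithmetic.

```rust
/// 8-bit add with carry. Returns (result, carry_out, overflow).
fn adc(a: u8, m: u8, carry_in: bool) -> (u8, bool, bool) {
    let sum = a as u16 + m as u16 + carry_in as u16;
    let result = sum as u8;
    let carry = sum > 0xFF;
    // Overflow: both inputs have the same sign and the result's sign differs.
    let overflow = ((a ^ result) & (m ^ result) & 0x80) != 0;
    (result, carry, overflow)
}

/// SBC computed as ADC with the operand inverted: A - M - (1 - C).
fn sbc(a: u8, m: u8, carry_in: bool) -> (u8, bool, bool) {
    adc(a, !m, carry_in)
}
```

For example, adc(0x50, 0x50, false) yields $A0 with V set: two positive values produced a negative result.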

Negative (N):

  • Copy of bit 7 of result
  • Used for signed comparisons
  • Affected by most ALU operations

Addressing Modes

The 6502 supports 13 addressing modes:

1. Implied

Operates on a register or flag, no operand needed.

Syntax: INX, DEY, CLC, SEI
Bytes: 1
Cycles: 2
Example: INX          ; X = X + 1

2. Accumulator

Operand is the accumulator.

Syntax: ASL A, ROL A, LSR A, ROR A
Bytes: 1
Cycles: 2
Example: ASL A        ; A = A << 1

3. Immediate

Operand is the next byte after opcode.

Syntax: LDA #$10
Bytes: 2
Cycles: 2
Effective Address: PC
Example: LDA #$42     ; A = 0x42

4. Zero Page

Operand at address $00XX (first 256 bytes).

Syntax: LDA $10
Bytes: 2
Cycles: 3
Effective Address: $00nn
Example: LDA $80      ; A = memory[$0080]

5. Zero Page,X

Zero page address + X register.

Syntax: LDA $10,X
Bytes: 2
Cycles: 4 (includes dummy read)
Effective Address: ($00nn + X) & $00FF
Example: LDA $80,X    ; A = memory[($80 + X) & $FF]
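
The zero page wrap falls out naturally from 8-bit wrapping addition. A minimal sketch with a hypothetical helper name:

```rust
/// Effective address for Zero Page,X or Zero Page,Y.
/// The sum wraps at 8 bits, so the result never leaves page 0.
fn zero_page_indexed(base: u8, index: u8) -> u16 {
    base.wrapping_add(index) as u16
}
```

With base $F0 and index $20 the effective address is $0010, not $0110.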

6. Zero Page,Y

Zero page address + Y register (used with LDX, STX).

Syntax: LDX $10,Y
Bytes: 2
Cycles: 4
Effective Address: ($00nn + Y) & $00FF

7. Absolute

Full 16-bit address.

Syntax: LDA $1234
Bytes: 3
Cycles: 4
Effective Address: $nnnn
Example: LDA $8000    ; A = memory[$8000]

8. Absolute,X

Absolute address + X register.

Syntax: LDA $1234,X
Bytes: 3
Cycles: 4 (+1 if page crossed)
Effective Address: $nnnn + X
Example: LDA $8000,X  ; A = memory[$8000 + X]

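Page Crossing: For read instructions, if ($nnnn & $FF00) != (($nnnn + X) & $FF00), add 1 cycle (dummy read from the wrong page). Store and read-modify-write instructions always take the extra cycle, whether or not a page is crossed.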
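
The page-crossing test can be sketched as a helper that returns both the effective address and the crossing flag. The name absolute_indexed is illustrative; the same helper serves Absolute,X and Absolute,Y.

```rust
/// Returns the effective address and whether indexing crossed a page boundary.
fn absolute_indexed(base: u16, index: u8) -> (u16, bool) {
    let addr = base.wrapping_add(index as u16);
    let crossed = (base & 0xFF00) != (addr & 0xFF00);
    (addr, crossed)
}
```

A read such as LDA $80F0,X with X = $20 resolves to $8110 and reports a crossing, so the instruction takes 5 cycles instead of 4.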

9. Absolute,Y

Absolute address + Y register.

Syntax: LDA $1234,Y
Bytes: 3
Cycles: 4 (+1 if page crossed)
Effective Address: $nnnn + Y

10. Indirect

Used only with JMP. Address stored at given location.

Syntax: JMP ($1234)
Bytes: 3
Cycles: 5
Effective Address: memory[$nnnn] | (memory[$nnnn+1] << 8)
Example: JMP ($FFFC)  ; PC = memory[$FFFC] | (memory[$FFFD] << 8)

Hardware Bug: If low byte of address is $FF, high byte wraps within same page:

JMP ($10FF) reads:
  Low:  memory[$10FF]
  High: memory[$1000]  <- should be $1100!
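
A sketch of the buggy pointer fetch, assuming memory is exposed as a flat 64 KB byte slice (a simplification; a real emulator would go through its bus). The function name is hypothetical.

```rust
/// Reads the JMP (indirect) target, reproducing the page-wrap bug.
/// `mem` is assumed to be a flat 64 KB address space.
fn jmp_indirect_target(mem: &[u8], ptr: u16) -> u16 {
    let lo = mem[ptr as usize] as u16;
    // Only the low byte of the pointer is incremented; the page stays fixed.
    let hi_addr = (ptr & 0xFF00) | (ptr.wrapping_add(1) & 0x00FF);
    let hi = mem[hi_addr as usize] as u16;
    (hi << 8) | lo
}
```

For pointers whose low byte is not $FF this behaves like an ordinary little-endian 16-bit read.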

11. Indexed Indirect (Indirect,X)

Address from zero page table indexed by X.

Syntax: LDA ($10,X)
Bytes: 2
Cycles: 6
Effective Address: memory[($nn + X) & $FF] | (memory[($nn + X + 1) & $FF] << 8)
Example: LDA ($80,X)  ; addr = ZP[($80 + X) & $FF], A = memory[addr]
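
Both pointer bytes are fetched from zero page, and both fetch addresses wrap at 8 bits. A minimal sketch, assuming `mem` covers at least page 0; the function name is illustrative.

```rust
/// Effective address for (Indirect,X). Both pointer bytes come from page 0.
fn indexed_indirect(mem: &[u8], zp: u8, x: u8) -> u16 {
    let ptr = zp.wrapping_add(x);
    let lo = mem[ptr as usize] as u16;
    // The high byte address also wraps within zero page.
    let hi = mem[ptr.wrapping_add(1) as usize] as u16;
    (hi << 8) | lo
}
```

When the pointer lands on $FF, the high byte comes from $00 rather than $0100.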

12. Indirect Indexed (Indirect),Y

Address from zero page, then indexed by Y.

Syntax: LDA ($10),Y
Bytes: 2
Cycles: 5 (+1 if page crossed)
Effective Address: (memory[$00nn] | (memory[($nn + 1) & $FF] << 8)) + Y
Example: LDA ($80),Y  ; addr = ZP[$80] + Y, A = memory[addr]
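
A sketch of the (Indirect),Y calculation that also reports the page crossing needed for the +1 cycle. It assumes `mem` covers at least page 0, and the function name is hypothetical.

```rust
/// Effective address for (Indirect),Y plus a page-crossing flag.
fn indirect_indexed(mem: &[u8], zp: u8, y: u8) -> (u16, bool) {
    let lo = mem[zp as usize] as u16;
    // The pointer's high byte is read from zero page, wrapping at $FF.
    let hi = mem[zp.wrapping_add(1) as usize] as u16;
    let base = (hi << 8) | lo;
    let addr = base.wrapping_add(y as u16);
    (addr, (base & 0xFF00) != (addr & 0xFF00))
}
```

Unlike (Indirect,X), the index is applied after the pointer is fetched, which is why this mode can cross a page.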

13. Relative

Used by branch instructions. Signed 8-bit offset from PC.

Syntax: BEQ label
Bytes: 2
Cycles: 2 (+1 if branch taken, +1 more if the target is on a different page)
Effective Address: PC + signed_offset (PC = address of the next instruction)
Example: BEQ $02      ; if Z=1, PC = PC + 2
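
Branch target and timing can be sketched together. Here `pc` is assumed to be the address of the instruction following the 2-byte branch, and the function name is illustrative.

```rust
/// Returns (new_pc, cycles) for a relative branch.
/// `pc` is the address of the instruction after the branch.
fn branch(pc: u16, offset: u8, taken: bool) -> (u16, u8) {
    if !taken {
        return (pc, 2);
    }
    // Casting through i8 sign-extends the offset to 16 bits.
    let target = pc.wrapping_add(offset as i8 as u16);
    let cycles = if (pc & 0xFF00) != (target & 0xFF00) { 4 } else { 3 };
    (target, cycles)
}
```

An offset of $FC from $8002 branches back to $7FFE, which crosses a page and costs 4 cycles.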

Instruction Set

Official Instructions (56 instructions, 151 opcodes)

Load/Store Operations

Mnemonic  Description        Flags  Example
LDA       Load Accumulator   N, Z   LDA #$42
LDX       Load X Register    N, Z   LDX $80
LDY       Load Y Register    N, Z   LDY $80,X
STA       Store Accumulator  -      STA $8000
STX       Store X Register   -      STX $80
STY       Store Y Register   -      STY $80

Transfer Operations

Mnemonic  Description       Flags  Cycles
TAX       Transfer A to X   N, Z   2
TAY       Transfer A to Y   N, Z   2
TXA       Transfer X to A   N, Z   2
TYA       Transfer Y to A   N, Z   2
TSX       Transfer SP to X  N, Z   2
TXS       Transfer X to SP  -      2

Stack Operations

Mnemonic  Description            Flags  Cycles
PHA       Push Accumulator       -      3
PHP       Push Processor Status  -      3
PLA       Pull Accumulator       N, Z   4
PLP       Pull Processor Status  All    4

Arithmetic Operations

Mnemonic  Description          Flags       Notes
ADC       Add with Carry       N, V, Z, C  A = A + M + C
SBC       Subtract with Carry  N, V, Z, C  A = A - M - (1-C)
INC       Increment Memory     N, Z        M = M + 1
INX       Increment X          N, Z        X = X + 1
INY       Increment Y          N, Z        Y = Y + 1
DEC       Decrement Memory     N, Z        M = M - 1
DEX       Decrement X          N, Z        X = X - 1
DEY       Decrement Y          N, Z        Y = Y - 1

Logical Operations

Mnemonic  Description   Flags    Formula
AND       Logical AND   N, Z     A = A & M
ORA       Logical OR    N, Z     A = A | M
EOR       Exclusive OR  N, Z     A = A ^ M
BIT       Bit Test      N, V, Z  N=M7, V=M6, Z=(A&M==0)

Shift/Rotate Operations

Mnemonic  Description            Flags    Operation
ASL       Arithmetic Shift Left  N, Z, C  C <- [76543210] <- 0
LSR       Logical Shift Right    N, Z, C  0 -> [76543210] -> C
ROL       Rotate Left            N, Z, C  C <- [76543210] <- C
ROR       Rotate Right           N, Z, C  C -> [76543210] -> C
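
The four bit diagrams can be expressed as pure functions that return (result, carry_out). The names mirror the mnemonics, but these are illustrative helpers. N and Z would be derived from the result as for any other ALU operation.

```rust
/// ASL: bit 7 goes to carry, 0 enters bit 0.
fn asl(v: u8) -> (u8, bool) {
    (v << 1, (v & 0x80) != 0)
}

/// LSR: bit 0 goes to carry, 0 enters bit 7.
fn lsr(v: u8) -> (u8, bool) {
    (v >> 1, (v & 0x01) != 0)
}

/// ROL: bit 7 goes to carry, old carry enters bit 0.
fn rol(v: u8, carry_in: bool) -> (u8, bool) {
    ((v << 1) | carry_in as u8, (v & 0x80) != 0)
}

/// ROR: bit 0 goes to carry, old carry enters bit 7.
fn ror(v: u8, carry_in: bool) -> (u8, bool) {
    ((v >> 1) | ((carry_in as u8) << 7), (v & 0x01) != 0)
}
```

The rotates are 9-bit rotations through the carry flag, which is why they take the old carry as an input.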

Comparison Operations

Mnemonic  Description          Flags    Operation
CMP       Compare Accumulator  N, Z, C  A - M
CPX       Compare X Register   N, Z, C  X - M
CPY       Compare Y Register   N, Z, C  Y - M
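
The comparisons perform the subtraction only to set flags; the register is left unchanged. A minimal sketch returning (N, Z, C), with an illustrative function name:

```rust
/// Flags produced by CMP/CPX/CPY: (negative, zero, carry).
fn compare(reg: u8, m: u8) -> (bool, bool, bool) {
    let diff = reg.wrapping_sub(m);
    // Carry is set when no borrow is needed, i.e. reg >= m (unsigned).
    ((diff & 0x80) != 0, diff == 0, reg >= m)
}
```

After a compare, BEQ tests equality, BCS tests unsigned "greater than or equal", and BCC tests unsigned "less than".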

Branch Instructions

All branches take 2 cycles if not taken, 3 if taken (same page), 4 if taken (different page).

Mnemonic  Description               Condition
BCC       Branch if Carry Clear     C = 0
BCS       Branch if Carry Set       C = 1
BEQ       Branch if Equal (Zero)    Z = 1
BNE       Branch if Not Equal       Z = 0
BMI       Branch if Minus           N = 1
BPL       Branch if Plus            N = 0
BVC       Branch if Overflow Clear  V = 0
BVS       Branch if Overflow Set    V = 1

Jump/Subroutine

Mnemonic  Description             Cycles  Stack Effect
JMP       Jump (Absolute)         3       None
JMP       Jump (Indirect)         5       None
JSR       Jump to Subroutine      6       Push PC-1 (2 bytes)
RTS       Return from Subroutine  6       Pull PC, PC = PC + 1
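
The JSR/RTS pairing can be sketched with a small stand-in type. MiniCpu is hypothetical and models only the stack page. The sketch assumes `pc` already points past the 3-byte JSR when jsr is called, so "PC-1" is the address of the instruction's last byte.

```rust
/// Hypothetical minimal CPU state, for illustration only.
struct MiniCpu {
    pc: u16,
    sp: u8,
    ram: [u8; 0x0200],
}

impl MiniCpu {
    fn push(&mut self, value: u8) {
        self.ram[0x0100 + self.sp as usize] = value;
        self.sp = self.sp.wrapping_sub(1);
    }

    fn pull(&mut self) -> u8 {
        self.sp = self.sp.wrapping_add(1);
        self.ram[0x0100 + self.sp as usize]
    }

    /// JSR: push (PC - 1) high byte first, then low byte, then jump.
    fn jsr(&mut self, target: u16) {
        let ret = self.pc.wrapping_sub(1);
        self.push((ret >> 8) as u8);
        self.push(ret as u8);
        self.pc = target;
    }

    /// RTS: pull low byte, then high byte, and resume at that address + 1.
    fn rts(&mut self) {
        let lo = self.pull() as u16;
        let hi = self.pull() as u16;
        self.pc = ((hi << 8) | lo).wrapping_add(1);
    }
}
```

The off-by-one is what lets RTS return to the instruction after the JSR; RTI, by contrast, uses the pulled address as is.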

Interrupt/Break

Mnemonic  Description            Cycles  Stack Effect
BRK       Break                  7       Push PC+2, Push P|0x10
RTI       Return from Interrupt  6       Pull P, Pull PC

Flag Operations

Mnemonic  Description                Flag  Value
CLC       Clear Carry                C     0
SEC       Set Carry                  C     1
CLI       Clear Interrupt Disable    I     0
SEI       Set Interrupt Disable      I     1
CLV       Clear Overflow             V     0
CLD       Clear Decimal (no effect)  D     0
SED       Set Decimal (no effect)    D     1

No Operation

Mnemonic  Description   Cycles
NOP       No Operation  2

Interrupt Handling

The 6502 has three hardware interrupt sources (BRK, the software interrupt, is described separately below):

RESET

Highest priority, non-maskable.

Trigger: Power-on or reset button
Duration: 7 cycles
Vector: $FFFC-$FFFD
Stack: No push (SP decremented by 3 but nothing written)
I Flag: Set
Effect: PC = memory[$FFFC] | (memory[$FFFD] << 8)

NMI (Non-Maskable Interrupt)

Second priority. Cannot be masked by the I flag, although the PPU can suppress it at the source via PPUCTRL ($2000) bit 7.

Trigger: PPU VBlank (start of scanline 241)
Duration: 7 cycles
Vector: $FFFA-$FFFB
Stack: Push PCH, Push PCL, Push P (B=0)
I Flag: Set
Effect: PC = memory[$FFFA] | (memory[$FFFB] << 8)

Cycle Breakdown:

Cycle 1-2: Read next instruction (dummy)
Cycle 3:   Push PCH
Cycle 4:   Push PCL
Cycle 5:   Push P (with B=0, U=1)
Cycle 6:   Fetch vector low byte
Cycle 7:   Fetch vector high byte
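
The push and vector-fetch steps can be sketched as one routine shared by NMI and IRQ. InterruptDemo and its flat 64 KB memory are assumptions for illustration; a real emulator would route these accesses through its bus and model the two dummy cycles separately.

```rust
const FLAG_INTERRUPT: u8 = 0b0000_0100;
const FLAG_BREAK: u8 = 0b0001_0000;
const FLAG_UNUSED: u8 = 0b0010_0000;

/// Hypothetical minimal CPU with flat memory, for illustration only.
struct InterruptDemo {
    pc: u16,
    sp: u8,
    p: u8,
    mem: Vec<u8>,
}

impl InterruptDemo {
    fn push(&mut self, value: u8) {
        self.mem[0x0100 + self.sp as usize] = value;
        self.sp = self.sp.wrapping_sub(1);
    }

    /// Hardware interrupt entry (NMI: $FFFA, IRQ: $FFFE). Returns cycles used.
    fn enter_interrupt(&mut self, vector: u16) -> u8 {
        let (pc, p) = (self.pc, self.p);
        self.push((pc >> 8) as u8); // PCH
        self.push(pc as u8); // PCL
        self.push((p & !FLAG_BREAK) | FLAG_UNUSED); // P with B=0, U=1
        self.p |= FLAG_INTERRUPT;
        let lo = self.mem[vector as usize] as u16;
        let hi = self.mem[vector.wrapping_add(1) as usize] as u16;
        self.pc = (hi << 8) | lo;
        7
    }
}
```

Passing $FFFA or $FFFE selects NMI or IRQ; BRK would differ only in pushing the status byte with B set.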

IRQ (Interrupt Request)

Lowest priority, maskable via I flag.

Trigger: Mapper IRQ, APU frame counter IRQ
Duration: 7 cycles
Vector: $FFFE-$FFFF
Stack: Push PCH, Push PCL, Push P (B=0)
I Flag: Set
Masked: When I flag is set
Effect: PC = memory[$FFFE] | (memory[$FFFF] << 8)

Interrupt Priority

RESET > NMI > IRQ

If multiple interrupts occur simultaneously, RESET takes precedence, then NMI, then IRQ.
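
The priority order can be sketched as a single selection function. The enum and parameter names are hypothetical; `i_flag` is the current state of the Interrupt Disable flag.

```rust
#[derive(Debug, PartialEq)]
enum Interrupt {
    Reset,
    Nmi,
    Irq,
}

/// Picks the interrupt to service, if any, in priority order.
fn pending_interrupt(reset: bool, nmi: bool, irq_line: bool, i_flag: bool) -> Option<Interrupt> {
    if reset {
        Some(Interrupt::Reset)
    } else if nmi {
        Some(Interrupt::Nmi)
    } else if irq_line && !i_flag {
        // IRQ is the only source masked by the I flag.
        Some(Interrupt::Irq)
    } else {
        None
    }
}
```

This mirrors the order of the checks in the execution loop shown later, where NMI is tested before IRQ.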

BRK Instruction

Software interrupt, similar to IRQ but pushes P with the B bit set.

Opcode: $00
Duration: 7 cycles
Vector: $FFFE-$FFFF (same as IRQ)
Stack: Push PCH, Push PCL, Push P (B=1)
I Flag: Set
PC: Incremented by 2 before push (skips signature byte)

DMA Operation

OAM DMA ($4014)

Writing to $4014 triggers a Direct Memory Access of 256 bytes to PPU OAM.

Write Value: Page number ($00-$FF)
Source: $XX00-$XXFF (where XX is written value)
Destination: PPU OAM ($2004)
Duration: 513 or 514 CPU cycles
Effect: CPU halted during transfer

Cycle Breakdown:

Cycle 1:     Dummy read (or write) - alignment
Cycle 2-513: 256 reads + 256 writes (alternating)
             Read from $XX00 -> Write to $2004
             Read from $XX01 -> Write to $2004
             ... (256 times)

Alignment:

  • If the $4014 write lands on an even CPU cycle: 513 total cycles (1 dummy + 512)
  • If the $4014 write lands on an odd CPU cycle: 514 total cycles (2 dummy + 512)
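
A simplified sketch of the transfer and its cycle cost. It assumes a flat 64 KB memory slice and that OAMADDR is 0, so the copy starts at OAM index 0. Which cycle parity needs the second dummy cycle depends on how an emulator counts cycles, so the sketch takes that decision as a flag. All names are hypothetical.

```rust
/// Copies one 256-byte page into OAM and returns the CPU cycles consumed.
/// `extra_alignment_cycle` is true when the second dummy cycle is required.
fn oam_dma(mem: &[u8], page: u8, oam: &mut [u8; 256], extra_alignment_cycle: bool) -> u16 {
    let base = (page as usize) << 8;
    // 256 read/write pairs, collapsed here into a single slice copy.
    oam.copy_from_slice(&mem[base..base + 256]);
    513 + extra_alignment_cycle as u16
}
```

A cycle-accurate emulator would perform the 512 bus accesses individually; collapsing them is only appropriate when nothing else observes the bus during the transfer.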

DMC DMA (Sample Playback)

The APU's Delta Modulation Channel can also steal CPU cycles:

Every sample byte fetch: up to 4 CPU cycles stolen (the exact count depends on what the CPU is doing at the time)
Frequency: Based on DMC rate (varies)
Effect: Can interfere with controller reads and exact timing

Implementation Guide

Recommended Structure

pub struct Cpu {
    // Registers
    pub a: u8,
    pub x: u8,
    pub y: u8,
    pub sp: u8,
    pub pc: u16,
    pub p: Status,

    // Internal state
    cycles: u64,
    nmi_pending: bool,
    irq_pending: bool,
    irq_line: bool,

    // Lookup tables (for performance)
    instruction_table: [InstructionFn; 256],
    addressing_mode_table: [AddressingMode; 256],
    cycle_table: [u8; 256],
}

Execution Loop

pub fn step(&mut self, bus: &mut Bus) -> u8 {
    // Check for interrupts
    if self.nmi_pending {
        return self.handle_nmi(bus);
    }
    if self.irq_pending && !self.p.contains(Status::INTERRUPT) {
        return self.handle_irq(bus);
    }

    // Fetch opcode
    let opcode = self.read(bus, self.pc);
    self.pc = self.pc.wrapping_add(1);

    // Dispatch
    let addr_mode = self.addressing_mode_table[opcode as usize];
    let instruction = self.instruction_table[opcode as usize];
    let base_cycles = self.cycle_table[opcode as usize];

    // Execute (returns extra cycles for page crossing)
    let extra_cycles = instruction(self, bus, addr_mode);

    base_cycles + extra_cycles
}

Testing Strategy

Essential Test ROMs:

  • nestest.nes - Golden log comparison (all instructions)
  • instr_test-v5 - Comprehensive instruction tests
  • cpu_interrupts_v2 - NMI/IRQ timing
  • cpu_dummy_reads - Bus behavior verification
