How a CPU Works

I’ve been building quite a few projects, and one of them is this CPU simulation. I’ve worked in the web world for 4 years, but I still didn’t understand what a CPU actually does. So over the past week I started building a small project, CPU-Simulation, that mimics how a CPU behaves.

Why the sudden interest in CPUs? Curiosity. The trigger was an article on how a CPU works. The way it works turned out to be quite simple, and that got me thinking, “hey, I can build that.”

Note: I read this article, CPU, and this one, CPU.

A CPU has two important components: the Control Unit and the ALU (Arithmetic Logic Unit). At a high level, they follow a simple idea often described as “fetch and execute.” In more detail, the cycle looks like this:

  1. Control Unit
    The Control Unit is responsible for fetching an instruction from memory and decoding it into something the ALU can read. Decoding means translating the instruction into control signals that the ALU can act on.

  2. Arithmetic Logic Unit (ALU)
    The ALU is responsible for executing the instruction, producing a result that is then stored in a register or in memory.

Put simply, it is fetch -> decode -> execute -> store -> next. That’s the whole cycle, and it repeats until the program is finished.
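
To make the cycle concrete, here is a minimal sketch of that loop. It is not the project's code: the `MiniCycle` class, its two made-up opcodes (`INC`, `HALT`), and the single accumulator are assumptions chosen only to keep the example short.

```java
import java.util.List;

// Hypothetical toy machine: one accumulator, two opcodes.
class MiniCycle {
    enum Op { INC, HALT }

    static int run(List<String> program) {
        int pc = 0;              // program counter: index of the next instruction
        int acc = 0;             // the only register in this toy model
        boolean halted = false;

        while (!halted && pc < program.size()) {
            String raw = program.get(pc);       // fetch
            Op op = Op.valueOf(raw.trim());     // decode: text -> opcode
            switch (op) {                       // execute
                case INC:
                    acc += 1;                   // store the result in the register
                    break;
                case HALT:
                    halted = true;
                    break;
            }
            pc++;                               // next
        }
        return acc;
    }

    public static void main(String[] args) {
        // Three increments, then stop: prints 3
        System.out.println(run(List.of("INC", "INC", "INC", "HALT")));
    }
}
```

Each comment maps one line to one stage of the cycle. The real simulation splits these stages across separate classes.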

What this tells me is that a computer works in a simple and predictable manner. The CPU does not care what program it is given; it only cares whether an instruction is valid and whether it can be executed.

I did manage to build the simulation in Java, and it can be improved later. The code is on GitHub.

I use a web browser to interact with the CPU. The CPU has 4 registers and 100 bytes of memory. Right now I have only implemented a small portion of how a CPU works.
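
As a rough picture of that state, here is one possible shape for the two containers. The field names `r` and `ram` follow the snippets later in this post. Everything else is an assumption, including the `inBounds` helper and the choice of one `int` per memory cell rather than a true byte.

```java
// Hypothetical sketch of the CPU state; the project's classes may differ.
class Register {
    final int[] r = new int[4];       // four registers, addressed by index 0..3
}

class Memory {
    final int[] ram = new int[100];   // 100 memory cells (assumed: one int per cell)

    // Hypothetical helper: true when addr is a valid index into ram.
    boolean inBounds(int addr) {
        return addr >= 0 && addr < ram.length;
    }
}
```

Keeping both as plain arrays means an operand, once resolved to a number, can be used directly as an index.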

Here is the gist with the code: CU & ALU

As you can see from the GitHub code, the CU (Control Unit) has its own class. The class contains a few important parts: it fetches the program, converts it to a stream, and decodes it into something the simulated ALU can read.

  private int resolveInstruction(Operand op) {

    // Label: look up the address recorded for it
    if (op instanceof LabelRef label) {
      Integer addr = labels.get(label.name());

      if (addr == null) {
        throw new IllegalStateException("Unknown label: " + label.name());
      }

      return addr;
    }

    // Hex literal: decode the string to an int
    if (op instanceof HexCode hx) {
      return Integer.decode(hx.value());
    }

    // Immediate: already a number
    if (op instanceof Immediate imm) {
      return imm.value();
    }

    // Register name: convert to its register index
    if (op instanceof RegisterCode rc) {
      return parseRegister(rc.value());
    }

    // Any other operand falls back to 0
    return 0;
  }

  private Instruction resolver(RawInstruction ins) {
    var instruction = new Instruction();
    instruction.opcode = ins.opcode;
    switch (ins.opcode) {
      case HALT:
        break;
      case JMP:
        instruction.dest = resolveInstruction(ins.dest);
        break;
      case STOREM:
      case LOADM:
        instruction.dest = resolveInstruction(ins.dest);
        instruction.src = resolveInstruction(ins.src);
        break;
      case LOAD:
        instruction.dest = resolveInstruction(ins.dest);
        instruction.src = resolveInstruction(ins.src);
        break;
      case CMP:
      case ADD:
        instruction.dest = resolveInstruction(ins.dest);
        instruction.src = resolveInstruction(ins.src);
        break;
      case JZ:
        instruction.dest = resolveInstruction(ins.dest);
        break;
      case null:
      default:
        break;
    }

    return instruction;
  }

As you can see, the function called resolver converts every operand into a number, because a CPU deals in nothing but numbers.
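
The operand types themselves are not shown above, so here is one way they could look, as a minimal sketch. The record definitions, the `OperandDemo` class, and the rule that register names `A` to `D` map to indices 0 to 3 are all assumptions for illustration. Unlike the excerpt above, this version throws on an unsupported operand instead of returning 0.

```java
import java.util.Map;

// Hypothetical operand types; the real project may define them differently.
interface Operand {}
record LabelRef(String name) implements Operand {}
record HexCode(String value) implements Operand {}
record Immediate(int value) implements Operand {}
record RegisterCode(String value) implements Operand {}

class OperandDemo {
    // Assumed mapping: register names A..D become indices 0..3.
    static int parseRegister(String name) {
        if (name.length() != 1 || name.charAt(0) < 'A' || name.charAt(0) > 'D') {
            throw new IllegalArgumentException("Unknown register: " + name);
        }
        return name.charAt(0) - 'A';
    }

    // Every operand kind collapses to a plain int.
    static int resolve(Operand op, Map<String, Integer> labels) {
        if (op instanceof LabelRef label) {
            Integer addr = labels.get(label.name());
            if (addr == null) {
                throw new IllegalStateException("Unknown label: " + label.name());
            }
            return addr;
        }
        if (op instanceof HexCode hx) return Integer.decode(hx.value());
        if (op instanceof Immediate imm) return imm.value();
        if (op instanceof RegisterCode rc) return parseRegister(rc.value());
        throw new IllegalArgumentException("Unsupported operand: " + op);
    }

    public static void main(String[] args) {
        Map<String, Integer> labels = Map.of("loop", 2);
        System.out.println(resolve(new HexCode("0x1F"), labels));   // prints 31
        System.out.println(resolve(new RegisterCode("B"), labels)); // prints 1
        System.out.println(resolve(new LabelRef("loop"), labels));  // prints 2
    }
}
```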

  void execute(Register reg, Memory mem, ArrayList<Instruction> program) {
    if (programCounter < 0 || programCounter >= program.size()) {
      halted = true;
      fault = CpuFault.INVALID_PC;
      return;
    }

    Instruction inst = program.get(programCounter);

    switch (inst.opcode) {
      case LOAD:
      case STOREM:
      case LOADM:
      case ADD: // register to register
        this.store(inst, reg, mem);
        break;
      case CMP:
        int temp = reg.r[inst.dest] - reg.r[inst.src];
        flag.zero = (temp == 0);
        flag.negative = (temp < 0);
        programCounter++;
        break;
      case JMP:
        programCounter = inst.dest - 1;
        break;
      case JZ:
        if (flag.zero) {
          programCounter = inst.dest - 1;
        } else {
          programCounter++;
        }
        break;
      case HALT:
        halted = true;
        break;
      case null:
      default:
        fault = CpuFault.ILLEGAL_INSTRUCTION;
        halted = true;
    }
  }

  private void store(Instruction inst, Register reg, Memory mem) {
    switch (inst.opcode) {
      case LOADM:
        if (inst.src < 0 || inst.src >= mem.ram.length) {
          halted = true;
          fault = CpuFault.INVALID_MEM;
          break;
        }
        reg.r[inst.dest] = mem.ram[inst.src];
        programCounter++;
        break;
      case STOREM:
        if (inst.src < 0 || inst.src >= mem.ram.length) {
          halted = true;
          fault = CpuFault.INVALID_MEM;
          break;
        }
        mem.ram[inst.src] = reg.r[inst.dest];
        programCounter++;
        break;
      case ADD:
        int res = reg.r[inst.dest] + reg.r[inst.src];
        reg.r[inst.dest] = res;
        flag.zero = (res == 0);
        flag.negative = (res < 0);
        programCounter++;
        break;
      case LOAD:
        reg.r[inst.dest] = inst.src;
        programCounter++;
        break;
      default:
        break;
    }
  }

The code block above is the important part of the ALU class: the execute function takes the decoded instruction from the Control Unit class and executes it. And of course I added error handling. Error handling in a CPU is quite similar to most programs: if there is an illegal or incorrect instruction, it halts the program. If you have ever used a tool like sudo or another Linux command, it’s the same idea: on an error, the program stops.
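
The "halt on the first fault" idea can be sketched on its own. Only the three fault names come from the code above. The `FaultDemo` class, the `NONE` value, and the toy program format (each entry is just a memory address to write 1 into) are assumptions for illustration.

```java
// Hypothetical sketch: run until halted, then report why the CPU stopped.
class FaultDemo {
    enum CpuFault { NONE, INVALID_PC, INVALID_MEM, ILLEGAL_INSTRUCTION }

    int programCounter = 0;
    boolean halted = false;
    CpuFault fault = CpuFault.NONE;
    final int[] ram = new int[100];

    // One step: validate before touching memory, halt on the first problem.
    void step(int[] program) {
        if (programCounter < 0 || programCounter >= program.length) {
            halt(CpuFault.INVALID_PC);
            return;
        }
        int addr = program[programCounter];
        if (addr < 0 || addr >= ram.length) {
            halt(CpuFault.INVALID_MEM);
            return;
        }
        ram[addr] = 1;
        programCounter++;
    }

    void halt(CpuFault reason) {
        halted = true;
        fault = reason;
    }

    static CpuFault run(int[] program) {
        FaultDemo cpu = new FaultDemo();
        while (!cpu.halted) {
            cpu.step(program);
        }
        return cpu.fault;
    }

    public static void main(String[] args) {
        // Address 250 is outside the 100 cells, so execution stops there.
        System.out.println(run(new int[] {0, 5, 250, 7})); // prints INVALID_MEM
    }
}
```

Because the fault is recorded before halting, whatever drives the CPU (a browser UI, for example) can show the reason instead of just a stopped machine.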

Another interesting part is the labels, like loop: or exit:. A label is just a way for the parser to remember which line it refers to. For example:

LOAD A, 1
LOAD B, 1

loop:
ADD A, B
CMP A, 7
JZ exit
JMP loop

exit:
HALT

The instructions above form a simple loop: when the comparison result is zero (A equals 7), the loop stops. It is equivalent to:


int A = 1;
int B = 1;
do {
    A += B;        // ADD A, B
} while (A != 7);  // CMP A, 7 then JZ exit, otherwise JMP loop
return;            // HALT

If I swap JZ exit with JMP loop, the result is an infinite loop: the unconditional JMP loop runs first, so JZ exit is never reached.
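
The label bookkeeping described earlier can be sketched as a first pass over the source lines. This is an illustration, not the project's parser: the `LabelPass` class and the zero-based instruction addresses are assumptions, and the real simulation may number addresses differently.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical first pass: map each label to the instruction that follows it.
class LabelPass {
    static Map<String, Integer> collect(List<String> lines) {
        Map<String, Integer> labels = new HashMap<>();
        int address = 0;                      // index of the next real instruction
        for (String line : lines) {
            String text = line.trim();
            if (text.isEmpty()) {
                continue;                     // blank lines take no address
            }
            if (text.endsWith(":")) {
                // A label takes no address; it names the next instruction.
                labels.put(text.substring(0, text.length() - 1), address);
            } else {
                address++;
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        List<String> program = List.of(
            "LOAD A, 1", "LOAD B, 1", "",
            "loop:", "ADD A, B", "CMP A, 7", "JZ exit", "JMP loop", "",
            "exit:", "HALT");
        Map<String, Integer> labels = collect(program);
        System.out.println(labels.get("loop")); // prints 2
        System.out.println(labels.get("exit")); // prints 6
    }
}
```

With the table built up front, a jump to a label defined later in the file, like JZ exit, can be resolved to a number the same way as a jump backwards.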

There are many more modern concepts in CPU design, but for now, I think a traditional CPU model is enough to understand the fundamentals.
