Jeremie Amano - Writing a MIPS Emulator with Typescript


Learning Typescript by making a MIPS Emulator

At the time this article was written, I was enrolled in a Computer Architecture & Design course. We started learning about the MIPS architecture. I thought, "What better way to learn MIPS than to write an emulator?"

I did just that, writing a basic emulator in C++. However, I couldn't showcase it online, so I rewrote it in Typescript. Typescript also made it easier to write an assembler, as string manipulation is much easier. A live demo can be found here and the source here.

Writing the emulator

To begin writing the emulator, I created a class that represents a MIPS CPU:

export class mips {
	private programSize: number;
	private pc: number;
	private memory: number[];
	private registers: number[];
	private exitStatus: boolean;
	private stdOut: (out: any) => void;
	private stdIn: () => any;
	private stdErr: (err: any) => void;

programSize was translated directly from my C++ emulator. Because Javascript arrays aren't of fixed size, it doesn't really serve a purpose here other than to stop the program from running endlessly if it doesn't exit. pc is the program counter, representing (in this case) the index of the next instruction in memory to run. memory is an array of 32-bit numbers, matching the word size for MIPS. Each number could be a MIPS instruction or data. registers holds the thirty-two 32-bit registers. exitStatus is true if the program has terminated, and the stdX callbacks let the user of the class decide how to handle the IO streams from the MIPS program.
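As a rough illustration of how this state and the IO callbacks might be wired together, here is a minimal sketch. The names CpuState, IoCallbacks and createCpuState are hypothetical and are not part of the emulator's actual source; the field names simply mirror the class above.

```typescript
// Hypothetical sketch: these types and the factory are illustrative assumptions,
// not the emulator's real constructor.
interface IoCallbacks {
	stdOut: (out: unknown) => void;
	stdIn: () => unknown;
	stdErr: (err: unknown) => void;
}

interface CpuState {
	programSize: number;
	pc: number;
	memory: number[];
	registers: number[];
	exitStatus: boolean;
}

function createCpuState(program: number[]): CpuState {
	return {
		programSize: program.length,
		pc: 0,
		// Copy the program and coerce every word to a signed 32-bit integer.
		memory: program.map((word) => word | 0),
		// Thirty-two general purpose registers, all starting at zero.
		registers: new Array<number>(32).fill(0),
		exitStatus: false,
	};
}

// The caller decides where output goes; here it is collected in an array.
const lines: string[] = [];
const io: IoCallbacks = {
	stdOut: (out) => { lines.push(String(out)); },
	stdIn: () => "",
	stdErr: (err) => { lines.push(`error: ${String(err)}`); },
};

const cpu = createCpuState([0x012A4020]);
io.stdOut(cpu.registers.length);
```

Passing the streams in as callbacks means the same CPU code could write to a web page, a terminal, or a test buffer without changes.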

Interpreting an instruction

Now that we've set up our CPU properties, let's write code to interpret one instruction. We'll start with a step function to step to the next instruction.

public step() {
	if(!this.exitStatus && this.pc < this.programSize) {
		this.registers[0] = 0;
		this.execute(this.memory[this.pc]);
	}
}

We check if the program hasn't already signaled to exit, and that the program counter is less than the program's size. We reset the zero register to 0 as it's the defined behavior in MIPS. Finally, we execute the instruction found at the program counter.

I won't show how to interpret every instruction, but I will show each instruction format. Let's start with some constants that we will be using.

const vReg = 2;
const aReg = 4;
const insShift = 26;
const rsShift = 21;
const rtShift = 16;
const rdShift = 11;
const shmShift = 6;
const immMagnitude = 1 << 16;
const immMask = ((1 << 16) - 1);
const immSignMask = (1 << 15);
const signBitMask = (1 << 31);
const rsMask = ((1 << 5) - 1) << rsShift;
const rtMask = ((1 << 5) - 1) << rtShift;
const rdMask = ((1 << 5) - 1) << rdShift;
const shmMask = ((1 << 5) - 1) << shmShift;
const funMask = ((1 << 6) - 1);
const adrMask = ((1 << 26) - 1);

These constants will help us isolate specific fields of the instruction. I used this Wikipedia article to figure these constants out. We also have the vReg and aReg constants, which are the indices of the value register ($v0) and the argument register ($a0), respectively. We can now begin writing out our execute function.
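To make the field layout concrete, here is a small worked example that decodes one R-Type word. The helper decodeR and the RFields interface are hypothetical names used only for this sketch; the shift amounts are the same as in the constants above. The sample word 0x012A4020 is the standard encoding of add $t0, $t1, $t2.

```typescript
// Hypothetical helper for illustration; it shifts first and masks second,
// which is equivalent to the mask-then-shift constants used in the article.
const FIELD_5_BITS = (1 << 5) - 1;
const FIELD_6_BITS = (1 << 6) - 1;

interface RFields {
	opcode: number;
	rs: number;
	rt: number;
	rd: number;
	shamt: number;
	funct: number;
}

function decodeR(word: number): RFields {
	return {
		opcode: word >>> 26,                 // bits 31..26
		rs: (word >>> 21) & FIELD_5_BITS,    // bits 25..21
		rt: (word >>> 16) & FIELD_5_BITS,    // bits 20..16
		rd: (word >>> 11) & FIELD_5_BITS,    // bits 15..11
		shamt: (word >>> 6) & FIELD_5_BITS,  // bits 10..6
		funct: word & FIELD_6_BITS,          // bits 5..0
	};
}

// add $t0, $t1, $t2  ->  rd = 8 ($t0), rs = 9 ($t1), rt = 10 ($t2)
const fields = decodeR(0x012A4020);
console.log(fields);
```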

private execute(instruction: number) {
	instruction &= 0xFFFFFFFF;
	let opcode = instruction >>> insShift;

We make sure that the instruction is 32 bits wide, then we isolate the opcode by shifting the instruction to the right, discarding the fields we don't need yet. Now we can interpret the opcode and act accordingly. We'll start with R-Type instructions, whose opcode is 0x00.
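The choice of the unsigned shift (>>>) matters here. Bitwise operators in Javascript work on signed 32-bit integers, so any instruction whose top bit is set becomes negative after the mask, and a signed shift (>>) would drag the sign bit into the opcode. The short sketch below shows the difference, using the standard encoding of lw $at, 0($zero) (opcode 0x23) as a sample word; the variable names are only for illustration.

```typescript
// lw $at, 0($zero): opcode 0x23 = 0b100011, so bit 31 of the word is set.
const word = 0x8C010000 & 0xFFFFFFFF;

// After the mask the value is a negative signed 32-bit integer.
const maskedValue = word;            // -1946091520

// A signed shift copies the sign bit into the vacated high bits.
const signedOpcode = word >> 26;     // -29

// An unsigned shift fills with zeros and yields the real opcode.
const unsignedOpcode = word >>> 26;  // 35 (0x23)

console.log(maskedValue, signedOpcode, unsignedOpcode);
```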

switch(opcode) {
case 0x00: {
	let rs = (instruction & rsMask) >> rsShift;
	let rt = (instruction & rtMask) >> rtShift;
	let rd = (instruction & rdMask) >> rdShift;
	let shm = (instruction & shmMask) >> shmShift;
	let fun = instruction & funMask;
	this.r(opcode, rs, rt, rd, shm, fun);
	break;
}
...

We isolate the other parts of the instruction using the masks and shifts we defined earlier. Then we call a separate function to handle R-Type instructions. The fun number tells us what kind of R-Type instruction it is.

private r(opcode: number, rs: number, rt: number, rd: number, shm: number, fun: number) {
	switch(fun) {
		...
		case 0x20: { //ADD
			// In the MIPS R-Type format, rd is the destination and rs, rt are the sources
			this.registers[rd] = (this.registers[rs]
				+ this.registers[rt]) & 0xFFFFFFFF;
			this.pc++;
			break;
		}
		...
	}
}

Here we interpret an "Add" instruction, whose fun number is 0x20. The destination register rd is assigned the sum of the registers rs and rt, masked to 32 bits. We then increment the program counter to the next instruction.
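Because every Javascript number is a double, the 32-bit mask is what keeps register values in integer range. The sketch below shows how a sum wraps around; add32 and toUnsigned32 are hypothetical helper names, and `| 0` is used as an equivalent of `& 0xFFFFFFFF`. Note that this only models wrapping: on real MIPS hardware, add raises an overflow exception while addu wraps silently.

```typescript
// Hypothetical helpers for illustration only.

// Add two register values and wrap the result to a signed 32-bit integer.
function add32(a: number, b: number): number {
	return (a + b) | 0;
}

// Reinterpret a signed 32-bit value as unsigned, e.g. for printing.
function toUnsigned32(value: number): number {
	return value >>> 0;
}

const wrapped = add32(0x7FFFFFFF, 1);      // -2147483648 (wrapped around)
const asUnsigned = toUnsigned32(wrapped);  // 2147483648
const minusOnePlusOne = add32(-1, 1);      // 0

console.log(wrapped, asUnsigned, minusOnePlusOne);
```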

System calls are a special kind of R-Type instruction that interact with a computer's OS kernel. We're not really running a complete OS with this, so we'll just create some basic calls to write to the stdOut "stream".

switch(fun) {
	...
	case 0x0C: { //Syscall
		switch(this.registers[vReg]) {
			case 0x00000001: {
				this.stdOut(this.registers[aReg]);
				break;
			}
			...
		}
		// Advance past the syscall once the inner switch has handled it
		this.pc++;
		break;
	}
	...
}

We take the value located in the vReg register, and use the correct system call. For a vReg of 1, we can output the value located in aReg to the output "stream".
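The dispatch described above can also be sketched as a standalone function. This is only an illustration: handleSyscall is a hypothetical name, and the service numbers follow the convention used by the SPIM and MARS simulators (1 prints an integer, 10 exits). The article only shows service 1, so the exit case is an assumption about how the emulator might handle it.

```typescript
type Output = (out: unknown) => void;

const V0 = 2; // $v0 holds the service number
const A0 = 4; // $a0 holds the first argument

// Hypothetical sketch: returns true when the program asked to exit.
function handleSyscall(registers: number[], stdOut: Output): boolean {
	switch (registers[V0]) {
		case 1: // print integer (SPIM/MARS convention)
			stdOut(registers[A0]);
			return false;
		case 10: // exit (SPIM/MARS convention, assumed here)
			return true;
		default:
			throw new Error(`unsupported syscall ${registers[V0]}`);
	}
}

// Usage: print the value 42, then request an exit.
const printed: unknown[] = [];
const regs = new Array<number>(32).fill(0);
regs[V0] = 1;
regs[A0] = 42;
const exitedAfterPrint = handleSyscall(regs, (out) => { printed.push(out); });
regs[V0] = 10;
const exitedAfterExit = handleSyscall(regs, (out) => { printed.push(out); });
```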

That's the gist of interpreting R-Type instructions. Now we can work on I-Type instructions. Each of these has its own opcode, so we can build on our opcode switch-case.

switch(opcode) {
	...
	case 0x08: { //ADDI
		let rs = (instruction & rsMask) >> rsShift;
		let rt = (instruction & rtMask) >> rtShift;
		let imm = instruction & immMask;
		// Sign-extend: a set bit 15 means the 16-bit value is negative
		imm = (imm & immSignMask) ? imm - immMagnitude : imm;
		// In the MIPS I-Type format, rt is the destination and rs is the source
		this.registers[rt] = (this.registers[rs] + imm) & 0xFFFFFFFF;
		this.pc++;
		break;
	}
	...
}

The "Add Immediate" instruction differs from the add instruction in that we add a constant encoded in the instruction instead of the value of a second source register. Register rt is assigned the sum of rs and the immediate. The immediate is a 16-bit two's complement number, so we check its top bit (bit 15) to determine its sign and, if that bit is set, subtract 2^16 to recover the negative value. All I-Type instructions carry a 16-bit immediate value in place of a third register.
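Sign extension of the 16-bit immediate is easy to get subtly wrong, so here is a small standalone sketch of two common ways to do it. The function names are hypothetical; the first subtracts 2^16 when bit 15 is set, and the second relies on the arithmetic right shift to copy the sign bit.

```typescript
// Hypothetical helpers for illustration only.

// Variant 1: subtract 2^16 when the sign bit (bit 15) is set.
function signExtend16(imm: number): number {
	const low16 = imm & 0xFFFF;
	return (low16 & 0x8000) !== 0 ? low16 - 0x10000 : low16;
}

// Variant 2: move bit 15 into bit 31, then shift back arithmetically.
function signExtend16Shift(imm: number): number {
	return (imm << 16) >> 16;
}

console.log(signExtend16(0xFFFF)); // -1
console.log(signExtend16(0x8000)); // -32768
console.log(signExtend16(0x7FFF)); // 32767
console.log(signExtend16Shift(0xFFFE)); // -2
```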

The final type is the J-Type. These instructions only have the opcode and a 26-bit target to jump the program counter to:

switch(opcode) {
	...
	case 0x02: { //J
		let adr = (instruction & adrMask) >>> 0;
		this.pc = (this.pc & 0xF0000000) | adr;
		break;
	}
	...
}
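As a worked example of the jump computation, the sketch below evaluates the same expression outside the class. jumpTarget is a hypothetical helper name. It assumes, as the emulator does, that pc is an index into the word array; on real MIPS hardware the 26-bit target is additionally shifted left by two bits because addresses there count bytes.

```typescript
// Hypothetical helper for illustration; pc is a word index, not a byte address.
function jumpTarget(pc: number, instruction: number): number {
	const target = instruction & 0x03FFFFFF;       // low 26 bits of the instruction
	return ((pc & 0xF0000000) | target) >>> 0;     // keep the top 4 bits of pc
}

// Opcode 0x02 in the top six bits (0x02 << 26 = 0x08000000), target word 0x10.
const jumpWord = 0x08000010;
const nextPc = jumpTarget(3, jumpWord);
console.log(nextPc); // 16
```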

Now that we have written the interpreter, we can load the instructions into our memory array and step through the whole program. It's extremely tedious to write out the instructions by hand in hex, so we should write an assembler next.
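Stepping through a whole program could look roughly like the loop below. This is a sketch under assumptions: the Steppable interface and its halted method are hypothetical (the class above keeps exitStatus private), and the step cap is an extra safety net so a program that never exits cannot loop forever.

```typescript
// Hypothetical interface; the real class would need to expose its exit status.
interface Steppable {
	step(): void;
	halted(): boolean;
}

// Run until the CPU halts or the step budget is used up.
// Returns the number of steps actually executed.
function run(cpu: Steppable, maxSteps: number): number {
	let executed = 0;
	while (!cpu.halted() && executed < maxSteps) {
		cpu.step();
		executed++;
	}
	return executed;
}

// Stand-in CPU that halts after three steps, used only to exercise the loop.
class CountdownCpu implements Steppable {
	private remaining = 3;
	step(): void {
		if (this.remaining > 0) this.remaining--;
	}
	halted(): boolean {
		return this.remaining === 0;
	}
}

const executedSteps = run(new CountdownCpu(), 1000);
console.log(executedSteps); // 3
```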

What I've learned

I've learned a lot through this project, mostly that integer arithmetic in JS is awkward, as every number is a floating-point number. There are many improvements I could make to the emulator. I could make the class immutable, so that each step produces a new state of the CPU. There are many instructions that I've yet to implement. Overall, it was a great learning experience, and I believe I've gotten a better understanding of MIPS and computer architecture in general.