Building a MIPS 5-stage Pipeline processor in Verilog (Part 2)

Lena
7 min read · Jan 27, 2023

In this blog post, I’ll be talking about the steps I took to extend the MIPS single-cycle processor into a 5-stage pipeline.

Part 1: Building a MIPS single-cycle processor in Verilog

Part 2: Building a MIPS 5-stage Pipeline processor in Verilog

Part 3: Running the MIPS 5-stage Pipeline processor on a DE10-Nano FPGA

Table of contents

  1. Adding the pipeline registers
  2. Adding the forwarding functionality
  3. Adding the Load Word data hazard handler
  4. Adding the Branch data dependency handler

Adding the pipeline registers

In my previous blog post, I went through the steps I took to build a MIPS single-cycle processor in Verilog, test it on ModelSim, and implement a BNE instruction.

A MIPS pipeline consists of 5 stages: Instruction Fetch (IF), Instruction Decode (ID), Execute (EX), Memory (MEM), and Writeback (WB). The stages are separated by 4 pipeline registers (IF/ID, ID/EX, EX/MEM, and MEM/WB), as shown below,

Image taken from Digital Design and Computer Architecture by David Harris, Sarah L. Harris

The complete MIPS 5-stage pipeline processor design, including the control signals, looks like the following,

Image taken from Digital Design and Computer Architecture by David Harris, Sarah L. Harris

To extend the MIPS single-cycle processor into a pipeline, I first added the pipeline registers and removed the PC MUX for the time being. Note that at this point the 5-stage pipeline cannot execute branch instructions and does not resolve hazards.

// IF/ID register: fetched instruction and PC+4
IF_ID IF_ID(clk, instr, pcplus4, instrD, pcplus4D);

// ID/EX register: control signals, register operands and sign-extended immediate
ID_EX ID_EX(clk, instrD, regwrite, memtoreg, memwrite, branch, alucontrol, alusrc,
    regdst, srca, writedata, instrD[20:16], instrD[15:11], signimmD, pcplus4D,
    regwriteE, memtoregE, memwriteE, branchE, alucontrolE, alusrcE,
    regdstE, srcaE, writedataE, rtE, rdE, signimmE, pcplus4E, instrE);

// EX/MEM register: ALU result, store data and destination register
EX_MEM EX_MEM(clk, instrE, regwriteE, memtoregE, memwriteE, branchE,
    zero, aluout, writedataE, writeregE, pcbranch,
    regwriteM, memtoregM, memwriteM, branchM,
    zeroM, aluoutM, writedataM, writeregM, pcbranchM, instrM);

// MEM/WB register: ALU result and data read from memory
MEM_WB MEM_WB(clk, instrM, regwriteM, memtoregM, aluoutM, readdata, writeregM,
    regwriteW, memtoregW, aluoutW, readdataW, writeregW, instrW);
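
The instantiations above only show the ports. As a minimal sketch of what one of these registers could contain, assuming it has no reset, stall, or flush inputs at this stage: the port order follows the IF_ID instantiation, while the input names instrF and pcplus4F and the body itself are assumptions made for illustration.

```systemverilog
// Sketch of an IF/ID pipeline register (assumed body, not the original source).
module IF_ID(input  logic        clk,
             input  logic [31:0] instrF, pcplus4F,   // values from the IF stage (assumed names)
             output logic [31:0] instrD, pcplus4D);  // values seen by the ID stage

    // On every rising clock edge, hand the fetched instruction and PC+4
    // over to the decode stage.
    always_ff @(posedge clk)
    begin
        instrD   <= instrF;
        pcplus4D <= pcplus4F;
    end
endmodule
```

The other three registers would follow the same pattern with more signals, which is why each control signal appears once per stage with a D, E, M, or W suffix.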

I tested the implementation with the following instructions, chosen so that no forwarding is required (each one reads only $zero and writes a different register),

addi $s1 $zero 1
addi $s2 $zero 2
addi $s3 $zero 3

In hex,

20110001
20120002
20130003
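
As a quick check on the hex values above, the following sketch packs the I-type fields (opcode, rs, rt, immediate) into a 32-bit word. The module and function names are hypothetical; the addi opcode (001000) and the register numbers ($zero = 0, $s1 = 17, $s2 = 18, $s3 = 19) come from the standard MIPS encoding.

```systemverilog
// Hypothetical helper for checking I-type machine code by hand.
module encode_itype_sketch;

    // I-type layout: opcode[31:26] rs[25:21] rt[20:16] imm[15:0]
    function automatic logic [31:0] encode_itype(input logic [5:0]  opcode,
                                                 input logic [4:0]  rs, rt,
                                                 input logic [15:0] imm);
        return {opcode, rs, rt, imm};
    endfunction

    localparam logic [5:0] ADDI = 6'b001000;

    initial
    begin
        $display("%h", encode_itype(ADDI, 5'd0, 5'd17, 16'd1)); // addi $s1 $zero 1 -> 20110001
        $display("%h", encode_itype(ADDI, 5'd0, 5'd18, 16'd2)); // addi $s2 $zero 2 -> 20120002
        $display("%h", encode_itype(ADDI, 5'd0, 5'd19, 16'd3)); // addi $s3 $zero 3 -> 20130003
    end
endmodule
```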

I used the following test bench,

module testbench2();

logic clk;
logic reset;

logic [31:0] writedata, dataadr;
logic memwrite;

// instantiate device to be tested
top dut(clk, reset, writedata, dataadr, memwrite);

// initialize test
initial
begin
    reset <= 1; # 1; reset <= 0;
end

// generate clock to sequence tests
always
begin
    clk <= 1; # 5; clk <= 0; # 5;
end
endmodule

I used the run command to advance the simulation 10 ps at a time (the clock period is 10 ps). I used $display() to print the program counter and to show in the console which pipeline stage each instruction is in.

The SystemVerilog code for the 5-stage pipeline MIPS processor with only the pipeline registers can be found here:

Adding the forwarding functionality

The next step was to allow data hazards to be resolved, so I added forwarding functionality,

I added an additional multiplexor that is controlled by the jump signal (red multiplexor on the diagram above).

mux_dontcare pcmux(pcnextbr, {pcplus4[31:28], instrD[25:0], 2'b00}, jump, pcnext);

There are also 2 new multiplexors, controlled by the Hazard unit signals forwardAE and forwardBE, that select the operands used in the EX stage,

mux_dontcare3 muxsrca(srcaMUX, result, aluoutM, forwardAE, srcaE);
mux_dontcare3 muxwritedata(writedataMUX, result, aluoutM, forwardBE, writedataE);
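
The body of mux_dontcare3 is not shown here, so the following is only a sketch of a 3-input multiplexor that would fit the instantiations above. It assumes the port order (d0, d1, d2, select, output) and that the unused select value 2'b11 drives a don't-care output; the module name mux3_sketch and the WIDTH parameter are hypothetical.

```systemverilog
// Sketch of a 3:1 multiplexor for the forwarding paths (assumed implementation).
module mux3_sketch #(parameter WIDTH = 32)
                   (input  logic [WIDTH-1:0] d0, d1, d2,
                    input  logic [1:0]       s,
                    output logic [WIDTH-1:0] y);

    always_comb
        case (s)
            2'b00:   y = d0;  // value read from the register file
            2'b01:   y = d1;  // value forwarded from the WB stage (result)
            2'b10:   y = d2;  // value forwarded from the MEM stage (aluoutM)
            default: y = 'x;  // 2'b11 is never generated, so don't care
        endcase
endmodule
```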

The hazard unit is instantiated as follows, together with debug prints that report each forward,

hazardunit hazardunit(regwriteM, regwriteW, rsE, rtE, writeregM, writeregW,
    forwardAE, forwardBE);

// Debug only: report each forward in the simulator console.
// A plain always block is used because always_ff is meant for clocked logic.
always @(forwardAE)
begin
    case(forwardAE)
        2'b01: $display("Forwarded %h to srcaE from MEM/WB stage", result);
        2'b10: $display("Forwarded %h to srcaE from EX/MEM stage", aluoutM);
        default: ; // 2'b00: no forwarding
    endcase
end

always @(forwardBE)
begin
    case(forwardBE)
        2'b01: $display("Forwarded %h to writedataE from MEM/WB stage", result);
        2'b10: $display("Forwarded %h to writedataE from EX/MEM stage", aluoutM);
        default: ; // 2'b00: no forwarding
    endcase
end

endmodule
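
The logic inside the hazard unit is not listed above. A minimal sketch of the usual forwarding rule is given below, using the same ports as the instantiation and the same encoding as the debug prints (2'b10 for EX/MEM, 2'b01 for MEM/WB). The module name hazardunit_fwd_sketch is hypothetical and the body is an assumption, not the original code.

```systemverilog
// Sketch of forwarding logic for the EX stage (assumed implementation).
module hazardunit_fwd_sketch(input  logic       regwriteM, regwriteW,
                             input  logic [4:0] rsE, rtE, writeregM, writeregW,
                             output logic [1:0] forwardAE, forwardBE);

    always_comb
    begin
        // Default: use the values read from the register file.
        forwardAE = 2'b00;
        forwardBE = 2'b00;

        // $zero is never forwarded. The MEM stage is checked first because
        // it holds the most recent value of the register.
        if (rsE != 5'b0)
        begin
            if (regwriteM && (rsE == writeregM))      forwardAE = 2'b10;
            else if (regwriteW && (rsE == writeregW)) forwardAE = 2'b01;
        end

        if (rtE != 5'b0)
        begin
            if (regwriteM && (rtE == writeregM))      forwardBE = 2'b10;
            else if (regwriteW && (rtE == writeregW)) forwardBE = 2'b01;
        end
    end
endmodule
```

Checking the MEM stage before the WB stage matters when both stages are about to write the same register: the younger instruction (in MEM) holds the value that the instruction in EX should see.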

I prepared the following instructions to test the forwarding functionality,

  • The EX result of the addi $s1 $zero 1 instruction will be forwarded to the EX stage of the addi $s2 $s1 2 instruction as it uses $s1
  • The MEM result of the addi $s1 $zero 1 instruction, and the EX result of the addi $s2 $s1 2 instruction will be forwarded to the EX stage of the add $s3 $s1 $s2 instruction as it uses both $s1 and $s2

ADDI $s1 $zero 0x1
ADDI $s2 $s1 0x2
ADD $s3 $s1 $s2 //$s3 should have 4 in the end

In hex,

20110001
22320002
02329820

Each run command advances the simulation by one clock cycle and prints the forwarding information. In the end, $s3 is 4, which confirms that forwarding works as expected.

The SystemVerilog code for the 5-stage pipeline MIPS processor with the forwarding can be found here:

Adding the Load Word data hazard handler

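The lw (load word) data is only available after the MEM stage, which is too late to forward to an instruction that immediately follows the lw and needs the loaded register in its EX stage. Therefore, we must add a stall functionality to the hazard unit.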
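
A minimal sketch of the stall condition is shown below. It assumes the hazard unit sees the source registers of the instruction in ID (rsD, rtD), the destination of the instruction in EX (rtE), and memtoregE, which is 1 only for lw. The module name and the outputs stallF, stallD, and flushE are assumptions made for illustration; only lwstall, rtE, and memtoregE appear in the code and output of this post.

```systemverilog
// Sketch of load-word stall detection (assumed implementation).
module lwstall_sketch(input  logic [4:0] rsD, rtD,  // sources of the instruction in ID
                      input  logic [4:0] rtE,       // destination of the lw in EX
                      input  logic       memtoregE, // 1 when the instruction in EX is lw
                      output logic       lwstall,
                      output logic       stallF, stallD, flushE);

    // Stall when the instruction in ID reads the register that the lw
    // in EX is about to load.
    assign lwstall = memtoregE && ((rsD == rtE) || (rtD == rtE));

    // Hold the PC and the IF/ID register, and clear the ID/EX register so
    // that a bubble moves down the pipeline instead of the stalled instruction.
    assign stallF = lwstall;
    assign stallD = lwstall;
    assign flushE = lwstall;
endmodule
```

After the one-cycle stall, the loaded value has reached the WB stage and can be forwarded to the EX stage through the existing forwarding multiplexors.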

I prepared instructions that contain lw to test the load word stall functionality as well as forwarding,

  • The instruction add $t0, $s0, $s1 comes right after lw $s0, 0x4($zero) and uses $s0, so a stall cycle is inserted between its ID and EX stages. The value that the lw read in the MEM stage is then forwarded to the EX stage of the add.
  • A stall cycle is also inserted between the IF and ID stages of add $t1, $s2, $s0, which waits behind the stalled add.

ADDI $s1 $zero 0x1
SW $s1 0x4($zero)
ADDI $s2 $s1 0x2
LW $s0 0x4($zero)
ADD $t0 $s0 $s1
ADD $t1 $s2 $s0
ADD $t2 $s0 $s2

In hex,

20110001
AC110004
22320002
8C100004
02114020
02504820
02125020

The output on ModelSim shows that the load word stall occurred in cycle 6 (lwstall=1), and in the end, $t2 is 4, which confirms that the load word stall functionality works as expected.

The SystemVerilog code for the 5-stage pipeline MIPS processor with the load word stall functionality can be found here:

Adding the Branch data dependency handler

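To evaluate branches in the ID stage, we need to add a branch data dependency handler, because a branch may compare a register that an earlier instruction still in the pipeline has not written back yet. An Equality unit is added before the ID/EX register to compare the two register operands.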

The equality module in SystemVerilog looks like the following,

module equal(input logic [31:0] srca, writedata,
    output logic equalD);

// Compare the full 32-bit operands. A 1-bit difference signal would only
// catch a mismatch in the least significant bit.
assign equalD = (srca == writedata);
endmodule
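
The dependency handling itself is not listed in this post. A minimal sketch of how it is commonly done is shown below: forward a result from the MEM stage to the comparison in ID, and stall the branch when the value it needs is still being computed in EX or loaded in MEM. The module name and all signal names other than regwriteM, writeregM, and writeregE are assumptions made for illustration.

```systemverilog
// Sketch of branch data dependency handling (assumed implementation).
module branchhazard_sketch(input  logic       branchD,    // instruction in ID is a branch
                           input  logic       regwriteE,  // instruction in EX writes a register
                           input  logic       memtoregM,  // instruction in MEM is a lw
                           input  logic       regwriteM,  // instruction in MEM writes a register
                           input  logic [4:0] rsD, rtD, writeregE, writeregM,
                           output logic       forwardAD, forwardBD, branchstall);

    // Forward the ALU result held in the MEM stage to the equality
    // comparison in ID ($zero is never forwarded).
    assign forwardAD = (rsD != 5'b0) && (rsD == writeregM) && regwriteM;
    assign forwardBD = (rtD != 5'b0) && (rtD == writeregM) && regwriteM;

    // Stall the branch while an operand is still in the ALU (EX stage)
    // or is being read from data memory (lw in the MEM stage).
    assign branchstall = branchD &&
        ((regwriteE && ((writeregE == rsD) || (writeregE == rtD))) ||
         (memtoregM && ((writeregM == rsD) || (writeregM == rtD))));
endmodule
```

In a full hazard unit, branchstall would be ORed with the load word stall to drive the same stall and flush signals.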

I prepared instructions that contain beq to test the branch dependency handler,

  • The branch beq $t0 $s2 0x4 is not taken, since $t0 is 1 and $s2 is 2
  • The branch beq $t0 $t1 0x5 is taken, since $t0 and $t1 are both 1, so the instruction addi $s0 $zero 0x1 will be flushed. It will jump to the instruction addi $s1 $zero 0x5

ADDI $t0 $zero 0x1
ADDI $s2 $zero 0x2
SUB $t1 $s2 $t0 //$t1 should have 1
BEQ $t0 $s2 0x4 //not taken
BEQ $t0 $t1 0x5 //taken
ADDI $s0 $zero 0x1
ADDI $s0 $s0 0x1
ADDI $s0 $s0 0x1
ADDI $s0 $s0 0x1
ADDI $s0 $s0 0x1
ADDI $s1 $zero 0x5 //BEQ will go to here

In hex,

20080001
20120002
02484822
11120004
11090005
20100001
22100001
22100001
22100001
22100001
20110005

The output on ModelSim shows that "Branch not taken" occurred in cycle 5, and "Branch is taken" occurred in cycle 6. In the end, $s1 is 5 and $s0 is not updated, which confirms that it runs as expected.
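
The flush of the instruction fetched right behind a taken branch can be sketched as an IF/ID register with an enable and a synchronous clear. This is an illustration under assumptions, not the original code: the module name, the input names stallD and pcsrcD, and the choice to ignore the clear while the register is stalled are all hypothetical.

```systemverilog
// Sketch of an IF/ID register that can stall and flush (assumed implementation).
module IF_ID_stall_flush_sketch(input  logic        clk,
                                input  logic        stallD,  // 1: hold the current contents
                                input  logic        pcsrcD,  // 1: the branch in ID is taken
                                input  logic [31:0] instrF, pcplus4F,
                                output logic [31:0] instrD, pcplus4D);

    always_ff @(posedge clk)
        if (!stallD)
        begin
            if (pcsrcD)
            begin
                // Replace the wrongly fetched instruction with a nop
                // (all zeros decodes as sll $zero, $zero, 0).
                instrD   <= 32'b0;
                pcplus4D <= 32'b0;
            end
            else
            begin
                instrD   <= instrF;
                pcplus4D <= pcplus4F;
            end
        end
endmodule
```

The clear is ignored during a stall because a stalled branch may still be comparing stale operands, so its taken or not taken decision is not final yet.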

The SystemVerilog code for the 5-stage pipeline MIPS processor with the branch dependency handler can be found here:

In the next part, I will talk about running the MIPS pipeline processor on a DE10 Nano FPGA.

Written by Lena

I'm a Cybersecurity Analyst! My passions include hacking, investigations, writing, and drawing! Contact: lambdamamba@proton.me, Website: LambdaMamba.com
