Running the MIPS 5-stage Pipeline processor on a DE10-Nano FPGA (Part 3)

Lena
8 min read · Jan 27, 2023

In this post, I will be talking about the steps I took to run the MIPS 5-stage pipeline processor on a DE10-Nano FPGA, build a breadboard circuit for the external I/O, and implement a user-input handler that allows user input into a program run by the processor.

Part 1: Building a MIPS single-cycle processor in Verilog

Part 2: Building a MIPS 5-stage Pipeline processor in Verilog

Part 3: Running the MIPS 5-stage Pipeline processor on a DE10-Nano FPGA

Table of contents

  1. Running the MIPS 5-stage pipeline processor on the FPGA, preparing the external circuitry
  2. Adding a user input functionality: Modifying the instruction memory module
  3. Adding a user input functionality: Using an address decoder

Running the MIPS 5-stage pipeline processor on the FPGA, preparing the external circuitry

In my previous blog post, I finished implementing the MIPS 5-stage pipeline processor using SystemVerilog, and ran tests on ModelSim.

The next step was to test it on the FPGA, but the DE10-Nano has very limited onboard user I/O (8 LEDs, 2 push buttons, and 4 slide switches).

So I made a breadboard circuit by using the FPGA’s GPIO. For more information on how to use an FPGA’s GPIO to allow external I/O, please check out the following blog post:

This is the breadboard circuit I used to test the MIPS 5-stage pipeline processor.

The DE10-Nano has two GPIO headers, and I used both of them.

I added the 7-segment decoder to the SystemVerilog code:

module seg7decoder(input  logic [3:0] n,
                   output logic [7:0] seg);
  // Purely combinational lookup, so always_comb is used
  // (always_ff requires a clock edge in its sensitivity list)
  always_comb
    begin
      case(n)
        4'b0000: seg = 8'b11100111; // 0
        4'b0001: seg = 8'b01100000; // 1
        4'b0010: seg = 8'b11001011; // 2
        4'b0011: seg = 8'b11101001; // 3
        4'b0100: seg = 8'b01101100; // 4
        4'b0101: seg = 8'b10101101; // 5
        4'b0110: seg = 8'b00101111; // 6 (drawn without the top segment, same pattern as b)
        4'b0111: seg = 8'b11100000; // 7
        4'b1000: seg = 8'b11101111; // 8
        4'b1001: seg = 8'b11101100; // 9
        4'b1010: seg = 8'b11101110; // A
        4'b1011: seg = 8'b00101111; // b
        4'b1100: seg = 8'b10000111; // C
        4'b1101: seg = 8'b01101011; // d
        4'b1110: seg = 8'b10001111; // E
        4'b1111: seg = 8'b10001110; // F
        default: seg = 8'b00000000; // all segments off
      endcase
    end
endmodule
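As a quick cross-check of the lookup table, it can be mirrored in a few lines of Python. This is only a software model and not part of the FPGA design; the helper name byte_to_segments and the assumption that the left digit shows the high nibble are mine.

```python
# Software model of the seg7decoder lookup (patterns copied from the module above).
SEG = [
    0b11100111, 0b01100000, 0b11001011, 0b11101001,  # 0-3
    0b01101100, 0b10101101, 0b00101111, 0b11100000,  # 4-7
    0b11101111, 0b11101100, 0b11101110, 0b00101111,  # 8-b
    0b10000111, 0b01101011, 0b10001111, 0b10001110,  # C-F
]


def byte_to_segments(value):
    """Return the (left, right) segment patterns for an 8-bit value.

    Assumption: the left digit shows the high nibble and the right digit
    shows the low nibble.
    """
    value &= 0xFF
    return SEG[value >> 4], SEG[value & 0xF]


if __name__ == "__main__":
    left, right = byte_to_segments(0x28)
    print(f"{left:08b} {right:08b}")
```

Running the table through a model like this also makes it easy to spot that 6 and b share one pattern.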

I also initialized the register file:

initial
  $readmemh("/home/ln2/Desktop/IoT/MIPSproject/Pipeline/initialreg.dat", rf);

The contents of initialreg.dat :

00000000
00000000
00000000
...
00000000
00000000
00000000

I added the input and output logic to the top module.

module top(input  logic        clkbutton, resetsw, 
output logic [7:0] Rseg1=8'b0, Rseg2=8'b0,
output logic [2:0] Rcontent=3'b0,
output logic [7:0] Mseg2=8'b0,
output logic [2:0] Mcontent=3'b0,
output logic lwstall, BranchTaken, BranchNotTaken,
output logic [7:0] countsegL = 8'b11100111, countsegR = 8'b11100111,
output logic [7:0] pcsegL = 8'b11100111, pcsegR = 8'b11100111);

I then prepared the following instructions for testing. The branch is evaluated during the ID stage.

ADDI $t0 $zero 0x1
ADDI $s2 $zero 0x2
SUB $t1 $s2 $t0 //$t1 should have 1
BEQ $t0 $s2 0x4 //not taken
BEQ $t0 $t1 0x5 //taken
ADDI $s0 $zero 0x1
ADDI $s0 $s0 0x1
ADDI $s0 $s0 0x1
ADDI $s0 $s0 0x1
ADDI $s0 $s0 0x1
SW $s2 0x8($zero) //BEQ will go to here
SLT $s1 $t0 $s2
LW $s3 0x8($zero)
ADD $s4 $s3 $s1

In hex,

20080001
20120002
02484822
11120004
11090005
20100001
22100001
22100001
22100001
22100001
AC120008
0112882A
8C130008
0271A020
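The hex words can be reproduced from the assembly with a small encoder. The sketch below assumes the standard MIPS R-type and I-type field layouts and register numbers; the helper names r_type and i_type are hypothetical and only cover the instructions used in this test.

```python
# Minimal MIPS encoder for the instruction formats used in the test program.
REG = {"$zero": 0, "$t0": 8, "$t1": 9, "$s0": 16, "$s1": 17,
       "$s2": 18, "$s3": 19, "$s4": 20}


def r_type(funct, rd, rs, rt):
    """Encode an R-type instruction (opcode 0, shamt 0)."""
    return (REG[rs] << 21) | (REG[rt] << 16) | (REG[rd] << 11) | funct


def i_type(opcode, rs, rt, imm):
    """Encode an I-type instruction with a 16-bit immediate."""
    return (opcode << 26) | (REG[rs] << 21) | (REG[rt] << 16) | (imm & 0xFFFF)


PROGRAM = [
    i_type(0x08, "$zero", "$t0", 0x1),        # ADDI $t0 $zero 0x1
    i_type(0x08, "$zero", "$s2", 0x2),        # ADDI $s2 $zero 0x2
    r_type(0x22, "$t1", "$s2", "$t0"),        # SUB  $t1 $s2 $t0
    i_type(0x04, "$t0", "$s2", 0x4),          # BEQ  $t0 $s2 0x4
    i_type(0x04, "$t0", "$t1", 0x5),          # BEQ  $t0 $t1 0x5
    i_type(0x08, "$zero", "$s0", 0x1),        # ADDI $s0 $zero 0x1
    *([i_type(0x08, "$s0", "$s0", 0x1)] * 4), # ADDI $s0 $s0 0x1 (four times)
    i_type(0x2B, "$zero", "$s2", 0x8),        # SW   $s2 0x8($zero)
    r_type(0x2A, "$s1", "$t0", "$s2"),        # SLT  $s1 $t0 $s2
    i_type(0x23, "$zero", "$s3", 0x8),        # LW   $s3 0x8($zero)
    r_type(0x20, "$s4", "$s3", "$s1"),        # ADD  $s4 $s3 $s1
]

if __name__ == "__main__":
    for word in PROGRAM:
        print(f"{word:08X}")
```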

I then uploaded it onto the DE10 Nano FPGA, and ran the test. Every time I press the DE10 Nano’s button switch, it runs one clock cycle.

The video demo can be found here,

The demo shows the following outputs, which confirm that the MIPS 5-stage pipeline works as expected on the FPGA.

0:  PC = 00
1:  PC = 04
2:  PC = 08
3:  PC = 0C
4:  PC = 10, $t0 = 1
5:  PC = 14, $s2 = 2, BranchNotTaken = 1
6:  PC = 28, $t1 = 1, BranchTaken = 1
7:  PC = 2C
8:  PC = 30
9:  PC = 34
A:  PC = 38, 0x8 has 2, lwstall = 1
B:  PC = 38, $s1 = 1
C:  PC = 3C, $s3 = 2
D:  PC = 40
E:  PC = 44, $s4 = 3
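The jump to PC = 28 in the trace is consistent with the usual MIPS branch target calculation, PC + 4 + (offset × 4). A minimal sketch of that arithmetic follows; the function name branch_target is hypothetical, and this only models the calculation, not the pipeline timing.

```python
def branch_target(pc, offset):
    """Byte address a taken BEQ at address pc jumps to.

    offset is the raw 16-bit immediate, counted in words relative to pc + 4.
    """
    offset &= 0xFFFF
    if offset & 0x8000:      # sign-extend the 16-bit immediate
        offset -= 0x10000
    return (pc + 4 + offset * 4) & 0xFFFFFFFF


if __name__ == "__main__":
    # The taken BEQ sits at 0x10 (fifth instruction) with offset 0x5.
    print(f"{branch_target(0x10, 0x5):02X}")
```

The BEQ at 0x10 with offset 0x5 therefore lands on the SW instruction, which is the eleventh word of the program.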

Adding a user input functionality: Modifying the instruction memory module

Next, I decided to add user input to the MIPS 5-stage pipeline processor. For example, when calculating a Fibonacci sequence, the user can input a number n so that the processor calculates up to the nth Fibonacci number.

The high-level Fibonacci code looks like the following, where n is the USERINPUT.

int n = USERINPUT;
int i = 2;
int fibo = 1;
int fiboprev = 1;
while (true) {
    if (i == n) break;
    int temp = fibo;
    fibo = fibo + fiboprev;
    fiboprev = temp;
    i++; // advance the counter (addi $t1 $t1 0x1 in the assembly)
}

In assembly language, it looks like the following:

//$t0 = n, $t1 = i, $s0 = fibo, $s1 = fiboprev, $t2 = temp

addi $t0 $zero 0xA
addi $t1 $zero 0x2
addi $s0 $zero 0x1
addi $s1 $zero 0x1
beq $t1 $t0 0x6
add $t2 $s0 $zero
add $s0 $s0 $s1
add $s1 $t2 $zero
addi $t1 $t1 0x1
j 0x4
addi $t3 $zero 0x8

One solution I came up with was to take the user input in the first instruction, addi $t0 $zero 0xA: the user input replaces the immediate field, so $t0 holds the value of n. To do this, I added the following logic to the instruction memory module:

The userin signal is 1 only for the first instruction, so only that instruction's immediate field is replaced by the 4-bit switch input. The 4-bit input is zero-padded to 16 bits, and the upper 16 bits of the fetched instruction are concatenated with it to form the 32-bit instruction.

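module imem(input  logic [3:0]  swinput,
            input  logic        clk,
            input  logic [5:0]  a,
            output logic [31:0] instr);
  logic        userin; // 1 only while the first instruction (a == 0) is fetched
  logic [15:0] low;    // lower 16 bits of the instruction that is output
  logic [31:0] rd;     // raw word read from the instruction memory

  logic [31:0] RAM[63:0];

  initial
    $readmemh("/home/ln2/Desktop/IoT/MIPSproject/Pipeline/fibo.dat", RAM);

  assign rd = RAM[a]; // word aligned

  // Combinational select (replaces the always_ff block without a clock edge)
  assign userin = (a == 6'b0);

  // userin = 1: the zero-extended switch input replaces the immediate field
  mux2 #(16) userinput(rd[15:0], {12'b0, swinput[3:0]}, userin, low);

  assign instr = {rd[31:16], low[15:0]};

  // Simulation-only debug output
  // synthesis translate_off
  always_ff @(posedge clk)
    begin
      $display("a=%b", a);
      if (!$isunknown(rd)) // a != comparison against x never evaluates to true
        $display("Fetched instruction %h", rd);
    end
  // synthesis translate_on
endmodule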
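The effect of this logic on the fetched word can be modeled in software. The sketch below is a behavioral model only; the function name patch_instruction is hypothetical, and 0x2008000A is my encoding of addi $t0 $zero 0xA under the standard MIPS I-type layout.

```python
def patch_instruction(rd, swinput, a):
    """Model of the imem output for the word rd read at word address a.

    Only at address 0 is the 16-bit immediate replaced by the zero-extended
    4-bit switch input; every other word passes through unchanged.
    """
    userin = (a == 0)
    low = (swinput & 0xF) if userin else (rd & 0xFFFF)
    return (rd & 0xFFFF0000) | low


if __name__ == "__main__":
    # addi $t0 $zero 0xA with the switches set to 1101
    print(f"{patch_instruction(0x2008000A, 0b1101, 0):08X}")
```

With the switches set to 1101, the first instruction effectively becomes addi $t0 $zero 0xD, while the same word at any other address is left untouched.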

For the test, I set the switches to 1101 (13), so the processor calculates up to the 13th Fibonacci number. It produced the following values in hex:

1, 1, 2, 3, 5, 8, 0D, 15, 22, 37, 59, 90, E9

This shows that the implementation works as expected.
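The expected values can be checked against a software model of the same loop. This sketch follows the high-level code (i starts at 2, fibo and fiboprev start at 1) and assumes n is at least 2; the function name fibo_sequence is hypothetical.

```python
def fibo_sequence(n):
    """Values taken by fiboprev and fibo while the loop runs until i == n."""
    if n < 2:
        raise ValueError("this model assumes n >= 2")
    i, fibo, fiboprev = 2, 1, 1
    seen = [fiboprev, fibo]
    while i != n:
        fibo, fiboprev = fibo + fiboprev, fibo
        i += 1
        seen.append(fibo)
    return seen


if __name__ == "__main__":
    # Switch input 1101 (13), printed as two hex digits per value
    print(", ".join(f"{v:02X}" for v in fibo_sequence(0b1101)))
```

For n = 13 the model ends on E9 (233 in decimal), which matches the last value shown on the FPGA.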

Although adding a small amount of circuitry to the instruction memory module works, it is not flexible, because the addi instruction that takes the user input must be the very first instruction.

This is where memory-mapped I/O comes in, and I will cover it in the next section.

Adding a user input functionality: Using an address decoder

Adding an address decoder for user input is more flexible than modifying the instruction memory module. The address decoder handles external I/O as memory-mapped I/O, so the program can access the 4-bit slide switch input using ordinary sw and lw instructions.

I assigned the address FFFFFFF4 to the 4-bit slide switch input, and added the IO and addressdecoder modules.

module IO(input  logic        WESW, clk,
          input  logic [3:0]  swinput,
          output logic [31:0] IOout);

  logic [31:0] IOmem; // latched switch value, zero-extended to 32 bits

  assign IOout = IOmem;

  // Capture the slide switches when the address decoder asserts WESW
  always_ff @(posedge clk)
    begin
      if (WESW)
        IOmem <= {28'b0, swinput};
    end
endmodule

module addressdecoder(input  logic        memwriteM,
                      input  logic [31:0] aluoutM,
                      output logic        WEM, WESW, RDselect);

  // Memory-mapped address of the 4-bit slide switch input
  localparam logic [31:0] SW_ADDR = 32'hFFFFFFF4;

  logic isSW;
  assign isSW = (aluoutM == SW_ADDR);

  // Combinational decode: always_comb assigns every output on every path,
  // so no latch is inferred for RDselect
  always_comb
    begin
      WEM      = memwriteM & ~isSW;  // store to the data memory
      WESW     = memwriteM &  isSW;  // store to the switch address
      RDselect = ~memwriteM & isSW;  // load from the switch address
    end
endmodule
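The decoder's behavior can be summarized as a small truth table in software. The sketch below is a behavioral model only; the function name address_decoder is hypothetical, and it reports RDselect as 0 for stores, which is an assumption on my part since the read data is not used in that case.

```python
SW_ADDR = 0xFFFFFFF4  # memory-mapped address of the 4-bit slide switch input


def address_decoder(memwrite, aluout):
    """Return (WEM, WESW, RDselect) for one memory access."""
    is_sw = (aluout & 0xFFFFFFFF) == SW_ADDR
    wem = memwrite and not is_sw         # store to the data memory
    wesw = memwrite and is_sw            # store to the switch address
    rdselect = (not memwrite) and is_sw  # load from the switch address
    return int(wem), int(wesw), int(rdselect)


if __name__ == "__main__":
    for memwrite in (True, False):
        for addr in (0x00000008, SW_ADDR):
            print(memwrite, f"{addr:08X}", address_decoder(memwrite, addr))
```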

To test the Fibonacci program, I prepared the following assembly instructions, where I load and store the 4-bit switch input contents.

sw $zero 0xFFF4($zero) //store the 4-bit switch number into 0xFFF4
addi $t1 $zero 0x2
lw $t0 0xFFF4($zero) //load from 0xFFF4, which contains the 4-bit switch number
addi $s0 $zero 0x1
addi $s1 $zero 0x1
beq $t1 $t0 0x6
add $t2 $s0 $zero
add $s0 $s0 $s1
add $s1 $t2 $zero
addi $t1 $t1 0x1
j 0x5
addi $t3 $zero 0x8
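The 16-bit offset 0xFFF4 reaches the switch address FFFFFFF4 because MIPS sign-extends the immediate of lw and sw before adding it to the base register. A minimal sketch of that arithmetic, assuming the processor follows the standard behavior (the helper names are hypothetical):

```python
def sign_extend16(imm):
    """Sign-extend a 16-bit immediate to 32 bits."""
    imm &= 0xFFFF
    return (imm | 0xFFFF0000) if (imm & 0x8000) else imm


def effective_address(base, imm):
    """Address used by lw/sw: base register value plus sign-extended offset."""
    return (base + sign_extend16(imm)) & 0xFFFFFFFF


if __name__ == "__main__":
    # sw/lw with offset 0xFFF4 and base $zero
    print(f"{effective_address(0, 0xFFF4):08X}")
```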

I modified the circuit, and this time the processor runs on a 10 Hz clock (the DE10-Nano provides a 50 MHz clock, which I put through a clock divider to produce the 10 Hz clock).
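The divider itself is not shown here, so the following is only a sketch of the arithmetic, assuming a common design in which a counter toggles the output clock every half period. The function name divider_counts is hypothetical.

```python
F_IN_HZ = 50_000_000  # DE10-Nano input clock
F_OUT_HZ = 10         # desired processor clock


def divider_counts(f_in, f_out):
    """Return (input cycles per output period, input cycles per half period)."""
    period = f_in // f_out
    return period, period // 2


if __name__ == "__main__":
    period, half = divider_counts(F_IN_HZ, F_OUT_HZ)
    # Bits needed for a counter that counts from 0 to half - 1
    width = (half - 1).bit_length()
    print(period, half, width)
```

Under that assumption the output toggles every 2,500,000 input cycles, which needs a 22-bit counter.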

I ran the tests using the following test cases:

  • First, 0110 is set on the 4-bit switch input, so the processor calculates up to the 6th Fibonacci number
  • Next, after a reset, 1101 is set on the 4-bit switch input, so it calculates up to the 13th Fibonacci number

The full demo can be found in the following video. The demo shows that this implementation works as expected.

The full code can be found here:

This concludes the three-part series of building the MIPS single-cycle processor, extending it into a 5-stage pipeline, and running it on a DE10 Nano FPGA.

Part 1: Building a MIPS single-cycle processor in Verilog

Part 2: Building a MIPS 5-stage Pipeline processor in Verilog

Part 3: Running the MIPS 5-stage Pipeline processor on a DE10-Nano FPGA

Thank you for reading!

Written by Lena

I'm a Cybersecurity Analyst! My passions include hacking, investigations, writing, and drawing! Contact: lambdamamba@proton.me, Website: LambdaMamba.com
