RISC-V is an open-source RISC-based ISA that was started in 2010 at the University of California, Berkeley.
The motivation for developing an open-source ISA was to provide a royalty-free ISA that encourages contributions (mainly from academic institutions), as well as a simpler ISA that does not require overengineering the microarchitecture. Over time, there have been many contributions to the RISC-V project in terms of ISA development, toolchain development, and the software ecosystem. RISC-V is now managed by the non-profit organization RISC-V International.
Given that RISC-V is a RISC-based ISA, and that one of the collaborators was David Patterson (co-author of the popular book ‘Computer Architecture: A Quantitative Approach’), you’ll find lots of similarities between MIPS and RISC-V. Like MIPS, RISC-V is a load/store architecture (arithmetic and logical instructions operate on registers, and only loads and stores access memory) and has the same broad types of instructions.
Registers and Calling Convention of RISC-V
Like every other ISA, RISC-V has a set of architectural registers. Architectural registers are the registers that are visible to the programmer and define the architectural state of the CPU. This is in contrast to physical registers, which are hidden from the programmer and are purely a microarchitectural feature. More on that in an upcoming blog.
RISC-V has evolved a lot over the years, and so have the types of registers used. The following are the register types in RISC-V:
- GPRs (General-Purpose Registers): These registers (more commonly known as integer registers) store data in signed/unsigned integer format. Much of the data processed by computers is in integer format, hence the use of integer registers.
- FPRs (Floating-Point Registers): These registers store data in floating-point form, and conform to the IEEE-754 standard. A lot of scientific, financial and graphical calculations use floating-point arithmetic. IEEE-754 in itself is a huge and very interesting topic, and it deserves an entire series to itself, something I plan to do in the future 🙂
- CSRs (Control and Status Registers): These registers are used for various system-level operations such as mode-switching, privileged access, performance counters, dealing with exceptions and so on.
Integer Registers
RISC-V has 32 integer registers, named x0 through x31. x0 is the only integer register that is read-only, and it is hardwired to the value 0. This is done to simplify a lot of instructions as well as the hardware implementation.
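To see what a hardwired x0 means in practice, here is a minimal Python sketch of an integer register file. The RegisterFile class is a made-up helper for this post, and it assumes 32-bit registers (RV32); it is not how any particular core implements it.

```python
class RegisterFile:
    """Toy model of the 32 RISC-V integer registers (hypothetical helper)."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, n):
        return self.regs[n]

    def write(self, n, value):
        # Writes to x0 are discarded, so x0 always reads as 0.
        if n != 0:
            self.regs[n] = value & 0xFFFFFFFF  # assumption: 32-bit registers


rf = RegisterFile()
rf.write(0, 123)  # ignored, x0 stays 0
rf.write(5, 42)
print(rf.read(0), rf.read(5))  # prints "0 42"
```

This is one reason x0 simplifies the instruction set: the assembler can express mv x5, x6 as addi x5, x6, 0 and nop as addi x0, x0, 0, without dedicated instructions.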
While the hardware does not force particular registers to be used for particular operations, there is a convention (the calling convention) that RISC-V programs use and that compilers follow when generating RISC-V executables.
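As a quick reference, the standard ABI names that the calling convention assigns to x0 through x31 can be captured in a small lookup table. This is only a sketch; ABI_NAMES and abi_name are names I made up for illustration.

```python
# Standard ABI names for x0..x31 from the RISC-V calling convention.
ABI_NAMES = (
    ["zero", "ra", "sp", "gp", "tp"]     # x0-x4: zero, return address, stack/global/thread pointers
    + ["t0", "t1", "t2"]                 # x5-x7: temporaries
    + ["s0", "s1"]                       # x8-x9: saved registers (s0 doubles as frame pointer)
    + [f"a{i}" for i in range(8)]        # x10-x17: function arguments / return values
    + [f"s{i}" for i in range(2, 12)]    # x18-x27: saved registers
    + [f"t{i}" for i in range(3, 7)]     # x28-x31: temporaries
)


def abi_name(n):
    """Return the ABI name of integer register x{n} (hypothetical helper)."""
    return ABI_NAMES[n]


print(abi_name(2), abi_name(10), abi_name(11))  # prints "sp a0 a1"
```

So when a disassembly shows a1 and a2, it is referring to x11 and x12.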
The encoding of integer registers is straightforward: register x{n} has encoding n in binary.
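Since there are 32 registers, the register number fits in a 5-bit field. A tiny Python sketch of the mapping (reg_field is a hypothetical helper name, not part of any toolchain):

```python
def reg_field(name):
    """Return the 5-bit binary encoding of a register written as 'x0'..'x31'."""
    if not name.startswith("x"):
        raise ValueError(f"not an integer register: {name}")
    n = int(name[1:])
    if not 0 <= n <= 31:
        raise ValueError(f"register number out of range: {name}")
    return format(n, "05b")


print(reg_field("x10"))  # prints "01010"
```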
Floating-Point Registers
There are 32 FPRs (f0 through f31), where each register is 32 bits wide (with the F extension) or 64 bits wide (with the D extension). Each register stores values that conform to the IEEE-754 standard.
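To get a feel for what an FPR actually holds, this small Python sketch prints the IEEE-754 bit pattern of a value in single precision (32 bits, as with F) and double precision (64 bits, as with D). It uses the standard struct module; f32_bits and f64_bits are helper names made up for this example.

```python
import struct


def f32_bits(x):
    """Bit pattern of x as an IEEE-754 single-precision value."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def f64_bits(x):
    """Bit pattern of x as an IEEE-754 double-precision value."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


print(hex(f32_bits(1.0)))  # prints "0x3f800000"
print(hex(f64_bits(1.0)))  # prints "0x3ff0000000000000"
```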
Control and Status Registers
RISC-V has a lot of CSRs, categorized by privilege level. Privilege level is a slightly advanced topic that requires a fair understanding of the basics of Operating Systems, and I plan to write a post on this sometime in the future. Each CSR in RISC-V has an encoding associated with it, in addition to a privilege. Here’s an example of how some CSRs are defined:
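As a rough sketch of the idea: each CSR has a 12-bit address, and by convention the top bits of that address say who may access it. Bits 9:8 give the lowest privilege level allowed, and bits 11:10 equal to 0b11 mark the CSR as read-only. The addresses below are standard ones from the privileged specification; the describe_csr helper and the table layout are my own illustration.

```python
# A few standard CSR addresses (12-bit).
CSRS = {
    "sstatus": 0x100,
    "mstatus": 0x300,
    "mepc": 0x341,
    "cycle": 0xC00,
    "mhartid": 0xF14,
}

# Meaning of address bits 9:8 (lowest privilege level that may access the CSR).
PRIV = {0b00: "user", 0b01: "supervisor", 0b10: "hypervisor", 0b11: "machine"}


def describe_csr(addr):
    """Return (privilege level, is_read_only) decoded from a CSR address."""
    priv = PRIV[(addr >> 8) & 0b11]
    read_only = ((addr >> 10) & 0b11) == 0b11
    return priv, read_only


for name, addr in CSRS.items():
    print(name, hex(addr), describe_csr(addr))
```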
Types of RISC-V Instructions
- Register/Register instructions: These types of instructions have only registers as source and destination operands. Every instruction of this type has at most two source registers and one destination register.
Many arithmetic and logical instructions are register/register type, since most arithmetic and logical operations (add, subtract, multiply, divide, and, or, xor, etc.) are binary operations.
- Immediate-type instructions: These types of instructions have one of the operands encoded directly in the instruction itself (known as an ‘immediate value’, since it is available “immediately” without having to read the register file).
Many instructions in real-world applications have at least one constant operand. Take this for-loop in C, for example:
for (int i = 0; i < 10; i++) { sum += i; }
In this case, we increment i by 1 in each iteration. Assuming i is held in x10, the natural way to do this in RISC-V assembly is addi x10, x10, 1.
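To make "encoded directly in the instruction" concrete, here is a minimal Python sketch that packs a register/register add and an immediate-type addi into 32-bit instruction words, using the standard R-type and I-type field layouts of the base ISA. The function names encode_r and encode_i are made up for this post.

```python
def encode_r(funct7, rs2, rs1, funct3, rd, opcode):
    """R-type layout: funct7 | rs2 | rs1 | funct3 | rd | opcode."""
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


def encode_i(imm, rs1, funct3, rd, opcode):
    """I-type layout: imm[11:0] | rs1 | funct3 | rd | opcode."""
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


OP = 0x33      # opcode shared by register/register ALU instructions
OP_IMM = 0x13  # opcode shared by immediate ALU instructions

add_word = encode_r(0, 2, 1, 0, 3, OP)        # add  x3, x1, x2
addi_word = encode_i(1, 10, 0, 10, OP_IMM)    # addi x10, x10, 1

print(f"{add_word:#010x}")   # prints "0x002081b3"
print(f"{addi_word:#010x}")  # prints "0x00150513"
```

Notice that the constant 1 sits in the top 12 bits of the addi word, which is why the CPU does not need a register read to obtain it.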
Load instructions also fall under immediate-type. A typical load instruction looks like lw x11, 0(x2). This instruction tells the CPU to load the data at memory address x2 + 0 (the value in x2 plus the immediate offset 0) and write it to register x11. Since the instruction is a lw (load word), it loads 32 bits (4 bytes / 1 word) of data from memory. Since the start address is x2 + 0 and memory is byte-addressable, the last byte read is at address x2 + 3.
- Store instructions: Like load instructions, store instructions also deal with data. While load instructions read from memory, store instructions write to memory. A typical store instruction looks like sw x11, 0(x2). This instruction tells the CPU to read the data from register x11 and write it to memory address x2 + 0. Since the instruction is a sw (store word), it writes 32 bits (4 bytes / 1 word) of data into memory. Since the start address is x2 + 0, the last byte written is at address x2 + 3.
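Here is a small Python sketch of what lw and sw do to a byte-addressable memory. It assumes 32-bit registers and little-endian byte order (the usual RISC-V configuration); the memory size, the register values, and the lw/sw function names are all illustrative choices of mine.

```python
memory = bytearray(64)  # toy byte-addressable memory
regs = [0] * 32         # toy integer register file


def sw(rs2, offset, rs1):
    """sw rs2, offset(rs1): write the 4 bytes of rs2 to addr .. addr + 3."""
    addr = regs[rs1] + offset
    memory[addr:addr + 4] = regs[rs2].to_bytes(4, "little")


def lw(rd, offset, rs1):
    """lw rd, offset(rs1): read 4 bytes from addr .. addr + 3 into rd."""
    addr = regs[rs1] + offset
    if rd != 0:  # x0 stays hardwired to 0
        regs[rd] = int.from_bytes(memory[addr:addr + 4], "little")


regs[2] = 16            # base address in x2
regs[11] = 0x12345678   # data in x11
sw(11, 0, 2)            # sw x11, 0(x2)
lw(12, 0, 2)            # lw x12, 0(x2)
print(memory[16:20].hex(), hex(regs[12]))  # prints "78563412 0x12345678"
```

The store touches exactly bytes 16 through 19, that is, x2 + 0 through x2 + 3.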
Even though load and store instructions both access memory, they are considered different types of instructions. This is because when processing a load instruction, the CPU reads only one register (in our example, x2) to compute the memory address, and writes the loaded data to the destination register (x11). When processing a store instruction, however, the CPU needs to read two registers (in our example, it reads x11 to obtain the data to write to memory, and x2 to compute the memory address to write to), and there is no destination register.
- Branch instructions: Branch instructions allow the processor to change the current execution flow based on a condition. Branch instructions are generated when the user writes conditional code in high-level languages, such as if statements, while statements, for statements, and even ternary conditional expressions.
Consider the following example: the instruction at PC 0x100c4 (blt a1, a2, 100bc) checks whether the value in a1 is less than the value in a2, and if so, transfers the instruction flow to 0x100bc. Essentially, it changes the PC from 0x100c4 to 0x100bc.
Branches usually jump a few instructions forward or backward from the current PC. As a result, it is more convenient to encode the relative difference between the current PC and the target PC in the instruction, rather than encoding the entire target address.
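For the blt example, the relative offset works out as follows. This Python sketch assumes 4-byte (uncompressed) instructions and the conditional-branch range of the base ISA (even offsets from -4096 to +4094); branch_offset and next_pc are hypothetical helper names.

```python
def branch_offset(pc, target):
    """Offset stored in the branch instruction: target PC minus current PC."""
    offset = target - pc
    if offset % 2 != 0 or not -4096 <= offset <= 4094:
        raise ValueError("target not reachable by a conditional branch")
    return offset


def next_pc(pc, offset, taken):
    """PC after the branch: pc + offset if taken, else the next instruction."""
    return pc + offset if taken else pc + 4


off = branch_offset(0x100C4, 0x100BC)
print(off)                               # prints "-8"
print(hex(next_pc(0x100C4, off, True)))  # prints "0x100bc"
print(hex(next_pc(0x100C4, off, False))) # prints "0x100c8"
```

An offset of -8 means the branch jumps two instructions backward, which is typical for the bottom of a loop.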
- Upper immediate type instructions: Upper immediate type instructions are a special class of instructions that deal with comparatively larger values. They carry a 20-bit immediate that forms the upper bits (bits 31 to 12) of a 32-bit value whose lower 12 bits are zero.
An example of this type of instruction is the lui instruction, which writes that value to a register and so loads a large constant without the use of shift operations. Another example is the auipc instruction, which adds that value to the current PC. If the instruction at PC 0x80001fc0 is auipc x2, 0x1, the value 0x80002fc0 (0x80001fc0 + 0x1000) is written to x2. This is widely used to load addresses into registers, for example the addresses of variables that high-level code reads from or writes to.
- Jump instructions: Like branch instructions, jump instructions also change the current execution flow. However, they don’t depend on any condition, and unconditionally jump to the target address. Jump instructions are used in function calls, where the jump to the function’s address does not depend on any condition.
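The arithmetic behind lui, auipc and a jump-and-link can be sketched in a few lines of Python. This assumes 32-bit registers and 4-byte instructions; the function names simply mirror the mnemonics and are not a real API. Note that auipc adds the shifted immediate to the PC, so the low 12 bits of the result come from the PC.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit registers (RV32)


def lui(imm20):
    """lui rd, imm20: upper 20 bits = imm20, lower 12 bits = 0."""
    return (imm20 << 12) & MASK32


def auipc(pc, imm20):
    """auipc rd, imm20: current PC plus (imm20 << 12)."""
    return (pc + (imm20 << 12)) & MASK32


def jal(pc, offset):
    """jal rd, offset: returns (new PC, return address written to rd)."""
    return (pc + offset) & MASK32, (pc + 4) & MASK32


# Building the constant 0x12345678 with lui followed by addi:
value = lui(0x12345) + 0x678
print(hex(value))                  # prints "0x12345678"
print(hex(auipc(0x80001FC0, 0x1))) # prints "0x80002fc0"
print([hex(v) for v in jal(0x10000, 0x20)])  # prints "['0x10020', '0x10004']"
```

The second value returned by jal is what ends up in the return-address register during a function call, so the callee knows where to jump back to.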
Now that you have a fair idea about assembly and RISC-V, you’re ready to dive into the fundamentals of microarchitecture!