Serialization in SystemVerilog

Ivan Larkou
Mar 25, 2023

Serialization/deserialization is a powerful technique that solves many problems in general-purpose programming languages. Unfortunately, it is not widespread in design verification.

To raise awareness, I decided to show several use cases of serialization in SystemVerilog:

  • Transaction logs
  • Sequence generation
  • Foreign language communication

Transaction logs

Logs are a great aid for debugging testbenches and designs. But some transactions (e.g. PCIe transactions) carry so much extra information that the log is difficult to work with without post-processing. So we sometimes need advanced features such as log search and filtering.

Unfortunately, the default UVM log is not convenient for post-processing.

Of course, UVM has transaction recording, but as a rule it is a vendor-specific feature that locks you into one tool. Supporting every possible simulator would be challenging.

Serialization is a good option for such transaction logs because it defines a standard data structure that can be parsed and processed quite easily. Plenty of tools and libraries can work with JSON, YAML and other data formats.

As an example, let’s create a simple sequence, then serialize, deserialize and filter its transactions.

MessagePack as a serialization format

I decided to use MessagePack for serialization because its encoder and decoder are simple to implement and fast.

I implemented a small encoder and decoder for my examples.
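To give a feel for how little the format needs, here is a minimal Python sketch of a MessagePack encoder for a subset of types (nil, bool, integers, float 64, short strings, small arrays and maps). It is not the SystemVerilog encoder used in this article; the function name `pack` and the size limits are assumptions made for illustration.

```python
import struct

def pack(value):
    """Encode None, bool, int, float, str, list and dict as MessagePack bytes."""
    if value is None:
        return b"\xc0"                                     # nil
    if isinstance(value, bool):                            # bool first: it is an int subclass
        return b"\xc3" if value else b"\xc2"
    if isinstance(value, int):
        if 0 <= value <= 0x7F:
            return bytes([value])                          # positive fixint
        if -32 <= value < 0:
            return struct.pack("b", value)                 # negative fixint
        for tag, fmt, low, high in (
            (0xCC, ">B", 0, 0xFF),                         # uint 8
            (0xCD, ">H", 0, 0xFFFF),                       # uint 16
            (0xCE, ">I", 0, 0xFFFFFFFF),                   # uint 32
            (0xCF, ">Q", 0, 0xFFFFFFFFFFFFFFFF),           # uint 64
            (0xD3, ">q", -(1 << 63), -1),                  # int 64 (valid, not the most compact)
        ):
            if low <= value <= high:
                return bytes([tag]) + struct.pack(fmt, value)
        raise ValueError("integer out of range")
    if isinstance(value, float):
        return b"\xcb" + struct.pack(">d", value)          # float 64
    if isinstance(value, str):
        raw = value.encode("utf-8")
        if len(raw) < 32:
            return bytes([0xA0 | len(raw)]) + raw          # fixstr
        if len(raw) < 256:
            return b"\xd9" + bytes([len(raw)]) + raw       # str 8
        raise ValueError("string too long for this sketch")
    if isinstance(value, (list, dict)):
        if len(value) > 15:
            raise ValueError("sketch only emits fixarray/fixmap (up to 15 entries)")
        if isinstance(value, list):
            return bytes([0x90 | len(value)]) + b"".join(pack(v) for v in value)
        body = b"".join(pack(k) + pack(v) for k, v in value.items())
        return bytes([0x80 | len(value)]) + body
    raise TypeError(f"unsupported type: {type(value).__name__}")

# A record shaped like the transactions in this article:
# [address, data words, is_wr, timestamp]
item = [1573531672, [201028477, 915070561], True, 0.0]
encoded = pack(item)
print(len(encoded), encoded.hex(" "))
```

Every value starts with a type byte, followed by a fixed-size big-endian payload or by nested values, so both directions can be written as a chain of simple range checks.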

Simple data transaction

Let’s create a simple sequence item:

class test_seq_item extends uvm_sequence_item;
  `uvm_object_utils(test_seq_item)

  rand bit[31:0] address;
  rand bit[31:0] data[$];
  rand bit is_wr;
  time timestamp;

  constraint addr_c {
    address > 32'h1000_0000;
    address > 32'h2000_0000;
    address % 4 == 0;
  }
  constraint data_c {
    if(is_wr) {
      data.size() < 10;
      data.size() > 0;
    } else {
      data.size() == 0;
    }
  }
  function new(string name = "test_seq_item");
    super.new(name);
  endfunction

  function string convert2string();
    string op = is_wr ? "Write" : "Read";
    string data_str;
    foreach(data[i]) begin
      data_str = {data_str, $sformatf("%h ", data[i])};
    end
    return $sformatf("Timestamp: %0t\nAddress: %h\nData: %s\nOperation: %s\n", timestamp, address, data_str, op);
  endfunction
endclass

To add serialization and deserialization, give the item an encoder and a decoder:

protected msgpack_enc encoder;
protected msgpack_dec decoder;

function new(string name = "test_seq_item");
  super.new(name);
  encoder = new("encoder");
  decoder = new("decoder");
endfunction

and add the corresponding functions:

function msgpack_buffer serialize();
  // The item is stored as a 4-element array: address, data, is_wr, timestamp
  encoder.write_array(4);
  encoder.write_int(address);
  encoder.write_array(data.size());
  for(int i = 0; i < data.size(); i++) begin
    encoder.write_int(data[i]);
  end
  encoder.write_bool(is_wr);
  encoder.write_real(timestamp);
  return encoder.get_buffer();
endfunction

function test_seq_item deserialize(msgpack_buffer buffer);
  int data_size;
  decoder.set_buffer(buffer);
  if(decoder.read_array() != 4)
    `uvm_fatal(get_name(), "Message must have 4 elements")
  address = decoder.read_int();
  data_size = decoder.read_array();
  // Drop stale words so the item can be deserialized more than once
  data.delete();
  repeat(data_size) begin
    data.push_back(decoder.read_int());
  end
  is_wr = decoder.read_bool();
  timestamp = decoder.read_real();
  // The function is declared to return test_seq_item, so return the filled item
  return this;
endfunction

Sequence and logs

For simplicity, let’s create and serialize our items from a dummy sequence:

class test_seq extends uvm_sequence;
  `uvm_object_utils(test_seq)

  test_seq_item item;

  function new(string name = "test_seq");
    super.new(name);
  endfunction

  task body();
    int file = $fopen("./out.dat", "wb");
    repeat(5) begin
      test_seq_item tr;
      msgpack_buffer buffer;

      // Create and randomize transaction
      tr = new("tr");
      if(!tr.randomize())
        `uvm_fatal(get_name(), "Randomization failed")
      tr.timestamp = $time;
      // Serialize item and write data to a file
      buffer = tr.serialize();
      print_array({>>8{buffer}});
      foreach(buffer[i]) $fwrite(file, "%c", buffer[i]);
      // Advance time; the logs below show timestamps 10 time units apart
      #10;
    end
    // Close the file so that all data is flushed to disk
    $fclose(file);
  endtask
endclass

When we run a test, a binary file with our transactions will be generated. If we convert this dump to JSON, we will get the following result:

[
[
1573531672,
[
201028477,
915070561
],
true,
0.0
],
[
2966935196,
[],
false,
10.0
],
[
3156301808,
[
3425409566,
40081362,
3198205084,
548654139,
2588406710,
2606032908,
150183030,
1517756356,
316820484
],
true,
20.0
],
[
3687057508,
[
2752069036,
2114599762,
422656201,
2492519696,
2602410650,
4114404296
],
true,
30.0
],
[
2696827208,
[],
false,
40.0
]
]
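The conversion from the binary dump to the JSON above is not shown in the article. A minimal Python sketch of it, assuming the dump is simply a concatenation of MessagePack values, could look like this. The names `unpack` and `unpack_stream` are hypothetical, only the subset of types used in this article is handled, and a ready-made MessagePack library for Python would normally do this job.

```python
import json
import struct

# Fixed-width scalar formats, keyed by MessagePack type byte.
SCALARS = {0xCA: ">f", 0xCB: ">d", 0xCC: ">B", 0xCD: ">H", 0xCE: ">I",
           0xCF: ">Q", 0xD0: ">b", 0xD1: ">h", 0xD2: ">i", 0xD3: ">q"}

def unpack(buf, pos=0):
    """Decode one value starting at buf[pos]; return (value, next position)."""
    tag = buf[pos]
    pos += 1
    if tag <= 0x7F:                              # positive fixint
        return tag, pos
    if tag >= 0xE0:                              # negative fixint
        return tag - 0x100, pos
    if tag in (0xC0, 0xC2, 0xC3):                # nil, false, true
        return {0xC0: None, 0xC2: False, 0xC3: True}[tag], pos
    if tag in SCALARS:                           # uint/int/float with fixed width
        fmt = SCALARS[tag]
        (value,) = struct.unpack_from(fmt, buf, pos)
        return value, pos + struct.calcsize(fmt)
    if 0xA0 <= tag <= 0xBF or tag == 0xD9:       # fixstr / str 8
        if tag == 0xD9:
            length, pos = buf[pos], pos + 1
        else:
            length = tag & 0x1F
        return buf[pos:pos + length].decode("utf-8"), pos + length
    if 0x90 <= tag <= 0x9F or tag == 0xDC:       # fixarray / array 16
        if tag == 0xDC:
            (count,) = struct.unpack_from(">H", buf, pos)
            pos += 2
        else:
            count = tag & 0x0F
        items = []
        for _ in range(count):
            value, pos = unpack(buf, pos)
            items.append(value)
        return items, pos
    if 0x80 <= tag <= 0x8F:                      # fixmap
        result = {}
        for _ in range(tag & 0x0F):
            key, pos = unpack(buf, pos)
            value, pos = unpack(buf, pos)
            result[key] = value
        return result, pos
    raise ValueError(f"unsupported type byte 0x{tag:02x}")

def unpack_stream(buf):
    """Decode back-to-back values until the buffer is exhausted."""
    records, pos = [], 0
    while pos < len(buf):
        record, pos = unpack(buf, pos)
        records.append(record)
    return records

# Stand-in for open("./out.dat", "rb").read(): two read transactions built by hand.
dump = (b"\x94\xce" + struct.pack(">I", 2966935196) + b"\x90\xc2\xcb" + struct.pack(">d", 10.0)
        + b"\x94\xce" + struct.pack(">I", 2696827208) + b"\x90\xc2\xcb" + struct.pack(">d", 40.0))
print(json.dumps(unpack_stream(dump)))
# [[2966935196, [], false, 10.0], [2696827208, [], false, 40.0]]
```

Because the decoder returns the position after each value, the same loop works for the array layout and for map-based records.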

If we want to improve the readability of the log, we can use a map during serialization and store the field names as the keys of the map. To achieve this, we need to change the serialization/deserialization functions in the item:

function msgpack_buffer serialize();
  // Store the item as a map with 4 key/value pairs
  encoder.write_map(4);
  encoder.write_string("Address");
  encoder.write_int(address);
  encoder.write_string("Data");
  encoder.write_array(data.size());
  for(int i = 0; i < data.size(); i++) begin
    encoder.write_int(data[i]);
  end
  encoder.write_string("Write");
  encoder.write_bool(is_wr);
  // Key and field match the item and the log below
  encoder.write_string("Timestamp");
  encoder.write_real(timestamp);
  return encoder.get_buffer();
endfunction

As a result, we will have the following log:

[
{
"Address": 911572716,
"Data": [
747149543,
791340212,
3048628125,
335249446,
2436943088,
3552743785,
4017646525,
3900302935,
3775544702
],
"Write": true,
"Timestamp": 0.0
},
{
"Address": 2098406844,
"Data": [
2162945095,
4212982404,
2278119411,
3843966079,
3934132232,
575828680,
2804914579,
684893058,
2950337799
],
"Write": true,
"Timestamp": 10.0
},
{
"Address": 1786491752,
"Data": [
3866514721,
396690723,
1487468299,
285314652,
3039926434,
67602783,
1095035777
],
"Write": true,
"Timestamp": 20.0
},
{
"Address": 3639671940,
"Data": [],
"Write": false,
"Timestamp": 30.0
},
{
"Address": 3906959464,
"Data": [],
"Write": false,
"Timestamp": 40.0
}
]

Filter logs

Quite often we want to reduce the amount of information in our logs. UVM provides runtime options that change the verbosity of individual components. However, this is not a flexible solution, and it is time-consuming, which matters a great deal for big testbenches. In addition, it is difficult to hide part of a UVM message.

Serialized logs, in contrast, can be easily filtered or used for post-processing checks. For example, it takes only a few lines of Python to remove all read operations:

import json

# Transform JSON input to Python objects
with open("test.json", "r") as file:
    input_dict = json.load(file)
# Keep only write operations
output_dict = [x for x in input_dict if x["Write"]]
# Transform Python objects back into JSON
output_json = json.dumps(output_dict)
# Show JSON
print(output_json)

The following log will be printed:

[
{
"Address": 911572716,
"Data": [
747149543,
791340212,
3048628125,
335249446,
2436943088,
3552743785,
4017646525,
3900302935,
3775544702
],
"Write": true,
"Timestamp": 0.0
},
{
"Address": 2098406844,
"Data": [
2162945095,
4212982404,
2278119411,
3843966079,
3934132232,
575828680,
2804914579,
684893058,
2950337799
],
"Write": true,
"Timestamp": 10.0
},
{
"Address": 1786491752,
"Data": [
3866514721,
396690723,
1487468299,
285314652,
3039926434,
67602783,
1095035777
],
"Write": true,
"Timestamp": 20.0
}
]
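The same log can also drive the post-processing checks mentioned above. The sketch below is one possible checker, not part of the original testbench: it re-applies the alignment and data-size rules from the item’s constraints and verifies that timestamps never go backwards. The function name `check_log` and the broken record are made up for illustration.

```python
import json

def check_log(records):
    """Return a list of rule violations found in a map-style transaction log."""
    errors = []
    last_time = None
    for index, rec in enumerate(records):
        words = len(rec["Data"])
        if rec["Address"] % 4 != 0:
            errors.append(f"#{index}: address 0x{rec['Address']:08x} is not word-aligned")
        if rec["Write"] and not 0 < words < 10:
            errors.append(f"#{index}: write carries {words} data words")
        if not rec["Write"] and words != 0:
            errors.append(f"#{index}: read carries {words} data words")
        if last_time is not None and rec["Timestamp"] < last_time:
            errors.append(f"#{index}: timestamp goes backwards")
        last_time = rec["Timestamp"]
    return errors

# Two records taken from the log shown earlier.
good = json.loads("""
[
  {"Address": 3639671940, "Data": [], "Write": false, "Timestamp": 30.0},
  {"Address": 3906959464, "Data": [], "Write": false, "Timestamp": 40.0}
]
""")
# A made-up broken record: misaligned read that carries data.
bad = [{"Address": 3639671941, "Data": [1], "Write": False, "Timestamp": 30.0}]

print(check_log(good))   # []
for line in check_log(bad):
    print(line)
```

A script like this can run after every simulation without touching the testbench itself.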

Sequence generation

When we have such logs, we can use deserialization to create a sequence with the corresponding data. This can be useful in several scenarios:

  • Adjustment of stimuli. We don’t need additional constraints or a special directed sequence; we can just edit and reuse the log.
  • Stimulus generation, e.g. when it’s time-consuming to create sequences at runtime.
  • Reusing output of other testbenches as input for ours.
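As an illustration of the first scenario, the Python sketch below takes map-style log records, keeps only the writes, moves their addresses into a small window and encodes the result in the 4-element array layout that the item’s `deserialize` function expects. The helper names, the base address `0x3000_0000` and the window size are assumptions for this example, and the encoder always uses uint 32 and float 64, which any compliant decoder should accept.

```python
import struct

def encode_item(address, data, is_wr, timestamp):
    """Encode one transaction as [address, data, is_wr, timestamp]."""
    if len(data) > 15:
        raise ValueError("sketch only emits fixarray (up to 15 data words)")
    out = bytearray([0x94])                          # fixarray with 4 fields
    out += b"\xce" + struct.pack(">I", address)      # address as uint 32
    out.append(0x90 | len(data))                     # fixarray header for the data
    for word in data:
        out += b"\xce" + struct.pack(">I", word)     # each data word as uint 32
    out.append(0xC3 if is_wr else 0xC2)              # true / false
    out += b"\xcb" + struct.pack(">d", float(timestamp))
    return bytes(out)

def retarget_writes(records, base, window=0x1000):
    """Keep only writes and fold their addresses into [base, base + window)."""
    adjusted = []
    for rec in records:
        if rec["Write"]:
            adjusted.append(dict(rec, Address=base + rec["Address"] % window))
    return adjusted

# Two records taken from the log shown earlier.
log = [
    {"Address": 1786491752,
     "Data": [3866514721, 396690723, 1487468299, 285314652,
              3039926434, 67602783, 1095035777],
     "Write": True, "Timestamp": 20.0},
    {"Address": 3639671940, "Data": [], "Write": False, "Timestamp": 30.0},
]

adjusted = retarget_writes(log, base=0x3000_0000)
replay = b"".join(
    encode_item(r["Address"], r["Data"], r["Write"], r["Timestamp"]) for r in adjusted
)
print(hex(adjusted[0]["Address"]), len(replay))
# To replay it, the bytes would be written out, e.g.:
# with open("./out.dat", "wb") as f:
#     f.write(replay)
```

Note that the reading sequence in this article expects exactly five items, so the number of records in the file and the `repeat` count have to match.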

We’ve already implemented the deserialization function, so the next step is to create a sequence that reads the log:

class read_seq extends uvm_sequence;
  `uvm_object_utils(read_seq)

  function new(string name = "read_seq");
    super.new(name);
  endfunction

  task body();
    msgpack_buffer buffer;
    int file = $fopen("./out.dat", "rb");
    // Read the whole dump into the buffer
    forever begin
      int symbol = $fgetc(file);
      if(symbol != -1) begin
        buffer.push_back(symbol);
      end else begin
        break;
      end
    end
    $fclose(file);
    repeat(5) begin
      test_seq_item tr = new("tr");
      void'(tr.deserialize(buffer));
      // Drop the bytes that the decoder has already consumed
      buffer = buffer[tr.decoder.state.offset : $];
      `uvm_info(get_name(), tr.convert2string(), UVM_NONE);
    end
  endtask
endclass

As we can see from the simulation log below, all the expected transactions were created by deserialization:

UVM_INFO ./src/test_seq.sv(69) @ 0: reporter@@seq [seq] 
Timestamp: 0
Address: 36557eec
Data: 2c8898e7 2f2ae4b4 b5b65f9d 13fb8026 9140ccf0 d3c29169 ef786bbd e879e657 e10a3d7e
Operation: Write

UVM_INFO ./src/test_seq.sv(69) @ 0: reporter@@seq [seq]
Timestamp: 10
Address: 7d1325bc
Data: 80ebec47 fb1d0284 87c957f3 e51e447f ea7e1808 225272c8 a72f9993 28d2a382 afda9507
Operation: Write

UVM_INFO ./src/test_seq.sv(69) @ 0: reporter@@seq [seq]
Timestamp: 20
Address: 6a7bb368
Data: e6765521 17a50523 58a8f70b 11018e5c b53198a2 0407895f 4144eb81
Operation: Write

UVM_INFO ./src/test_seq.sv(69) @ 0: reporter@@seq [seq]
Timestamp: 30
Address: d8f0fc84
Data:
Operation: Read

UVM_INFO ./src/test_seq.sv(69) @ 0: reporter@@seq [seq]
Timestamp: 40
Address: e8df7868
Data:
Operation: Read

Foreign language communication

The last use case I want to describe is communication between SystemVerilog and other languages. Currently, we can use DPI-C to send data from one language to another.

Serialization can be a more flexible solution. We can implement a simple interface that sends messages to another language and gets a response. To extend the functionality, we then only need to serialize a new type of data and reuse the same interface, without adding more DPI-C functions.

For our last example, let’s create a small Java model that communicates with our testbench. The model receives a command (ADD or SUB) and a value from the testbench and updates its internal value accordingly.

I won’t describe how to run a JVM from SystemVerilog; it’s not important for our example.

Our test creates a Java environment, then sends several pairs of ADD and SUB commands with random data and reads back the current model state. Let’s use the following data structures.

Data structure with operation:

[
$integer_value,
"ADD" or "SUB"
]

Data structure with internal model state:

{
"Result": $integer_value
}
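To make the two layouts concrete, here is a small Python stand-in for the exchange. It is only a sketch of the protocol, not the Java model or the testbench code: `CounterModel` and `op_msg` are hypothetical names, operation values are limited to 0..127 (a single positive fixint byte), and the state is always sent as an int 32.

```python
import struct

class CounterModel:
    """Hypothetical Python stand-in for the model on the other side of the interface."""

    def __init__(self):
        self.counter = 0

    def set_msg(self, message: bytes) -> None:
        # Operation message: fixarray(2) = [integer value, "ADD" or "SUB"]
        if message[0] != 0x92:
            raise ValueError("expected a 2-element array")
        value = message[1]
        if value > 0x7F:
            raise ValueError("sketch handles only positive fixint values (0..127)")
        if message[2] & 0xE0 != 0xA0:
            raise ValueError("expected a short string")
        length = message[2] & 0x1F
        op = message[3:3 + length].decode("ascii")
        if op == "ADD":
            self.counter += value
        elif op == "SUB":
            self.counter -= value
        else:
            raise ValueError(f"unknown operation {op!r}")

    def get_msg(self) -> bytes:
        # State message: fixmap(1) = {"Result": int 32}
        return b"\x81\xa6Result\xd2" + struct.pack(">i", self.counter)


def op_msg(value: int, op: str) -> bytes:
    """Build [value, op] for 0 <= value <= 127 and a short ASCII operation name."""
    return bytes([0x92, value, 0xA0 | len(op)]) + op.encode("ascii")


model = CounterModel()
model.set_msg(op_msg(1, "ADD"))
model.set_msg(op_msg(4, "SUB"))
reply = model.get_msg()
print(op_msg(1, "ADD").hex(" "))   # 92 01 a3 41 44 44
print(reply.hex(" "))              # 81 a6 52 65 73 75 6c 74 d2 ff ff ff fd
```

Both directions are plain byte arrays, so adding a new command only changes the payload, not the interface between the two languages.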

The test:

class java_test extends uvm_test;
  `uvm_component_utils(java_test)

  function new(string name, uvm_component parent);
    super.new(name, parent);
  endfunction

  // Create message with the operation data structure
  function ffi_pkg::msg_t create_msg(int value, string op);
    msgpack_buffer buffer;
    msgpack_enc enc = new();
    enc.write_array(2);
    enc.write_int(value);
    enc.write_string(op);
    buffer = enc.get_buffer();
    create_msg = new[buffer.size()];
    foreach(buffer[i]) begin
      create_msg[i] = buffer[i];
    end
  endfunction

  task run_phase(uvm_phase phase);
    string msg;
    msgpack_tree tree = new();
    java_env j_env = new();
    `uvm_info(get_name(), "Get message from Java environment", UVM_NONE)

    for(int i = 0; i < 4; i++) begin
      msgpack_map_node map;
      int result;
      j_env.set_msg(create_msg($urandom_range(5, 1), "ADD"));
      j_env.set_msg(create_msg($urandom_range(5, 1), "SUB"));
      tree.build_tree(j_env.get_msg());

      // Extract model state
      if(!$cast(map, tree.root)) `uvm_fatal(get_name(), "First node in tree isn't a map")
      result = msgpack_int_node::extract_value(map.get_value_of_string("Result"));
      `uvm_info(get_name(), $sformatf("Result is %0d", result), UVM_NONE)
    end
  endtask
endclass

The model:

import java.io.IOException;
import org.msgpack.core.MessageBufferPacker;
import org.msgpack.core.MessagePack;
import org.msgpack.core.MessageUnpacker;

public class Msg {
    int counter;

    // Decode [value, "ADD" | "SUB"] and update the counter
    public void set_msg(byte message[]) {
        try {
            MessageUnpacker unpacker = MessagePack.newDefaultUnpacker(message);
            int length = unpacker.unpackArrayHeader();
            int value = unpacker.unpackInt();
            String op = unpacker.unpackString();
            System.out.format("Operation: %s, value: %d\n", op, value);
            switch (op) {
                case "ADD":
                    counter += value;
                    break;
                case "SUB":
                    counter -= value;
                    break;
            }
            unpacker.close();
        } catch(IOException e) {
            e.printStackTrace();
        }
    }

    // Encode the model state as {"Result": counter}
    public byte[] get_msg() {
        try {
            MessageBufferPacker packer = MessagePack.newDefaultBufferPacker();
            packer
                .packMapHeader(1)
                .packString("Result")
                .packInt(counter);
            System.out.format("Return value: %d\n", counter);
            packer.close();
            return packer.toByteArray();
        } catch(IOException e) {
            e.printStackTrace();
        }
        return null;
    }
}

The result of the test:

UVM_INFO ./tests/java_test.sv(25) @ 0: uvm_test_top [uvm_test_top] Get message from Java environment
Operation: ADD, value: 1
Operation: SUB, value: 4
Return value: -3

UVM_INFO ./tests/java_test.sv(37) @ 0: uvm_test_top [uvm_test_top] Result is -3
Operation: ADD, value: 3
Operation: SUB, value: 5
Return value: -5

UVM_INFO ./tests/java_test.sv(37) @ 0: uvm_test_top [uvm_test_top] Result is -5
Operation: ADD, value: 1
Operation: SUB, value: 5
Return value: -9
UVM_INFO ./tests/java_test.sv(37) @ 0: uvm_test_top [uvm_test_top] Result is -9

Operation: ADD, value: 4
Operation: SUB, value: 2
Return value: -7
UVM_INFO ./tests/java_test.sv(37) @ 0: uvm_test_top [uvm_test_top] Result is -7

Conclusion

With this article, I want to highlight how important it is for verification engineers to explore approaches such as serialization and deserialization, as they can offer significant benefits for testing and verification. By using these techniques, we can improve the quality and efficiency of our testbenches, making it easier to produce high-quality products in less time.

I hope it saves you time in the future ;)
