IRSIM: Dynamic Power Analysis and Other Improvements

Sep 12, 2022

Introduction

Hi! My name is Jason, and this summer I had the opportunity to be a student contributor in Google Summer of Code 2022 for the Free and Open Source Silicon (FOSSi) Foundation under the mentorship of Tim Edwards.

A. Background

As VLSI processing technologies for manufacturing digital and mixed-signal chips become more complex and higher-performing, power analysis becomes a bigger consideration in maintaining efficiency. Proper power analysis requires calculating the average switching power while simulating devices undergoing switching activity, as clocks and data path signals toggle between high and low logic values and parts of the circuit are put into specific operating modes, such as low-power or sleep mode. Static timing analysis can be used to estimate power by characterizing both static and dynamic power using data from Liberty (.lib) files. Dynamic power is then characterized per timing arc, that is, per path between the input and output pins of the blocks/components in a digital circuit. However, this only provides a power estimate; it gives no insight into how the circuit actually behaves, for example when regions of the circuit are put into low-power operation to conserve power. SPICE simulations can do this but are compute-intensive, while Verilog simulations are fast but cannot perform power analysis.

IRSIM is an open-source switch-level simulator that simulates circuits at the device (transistor) level. It uses a linear switch-level model in which every transistor is modeled as a resistive switch. Hence, every transistor is characterized as being in either an “ON” or an “OFF” state, and as simulation advances, state changes are scheduled as events in a queue, similar to the way Verilog simulations behave. IRSIM performs much better than SPICE simulations, but because it is primarily used as a logic analyzer, it has fallen out of use in favor of Verilog simulators, and the power analysis code has not been exercised in a while.

B. The Project

My project involved improving the existing dynamic power analysis code in IRSIM. This included meeting the following goals:

Simulating circuits using the power analysis code: the power analysis code had not been used in a while, so I needed to make sure it worked properly.
Histogram generation: this includes a newly defined Tcl command line interface for power visualization during simulation. Ultimately, users can keep track of power per specified time period (normally the clock period), so that they can verify the power of the circuit or check that the circuit is functioning properly. For instance, a circuit with multiple toggling switches operating on a single power source should exhibit a roughly Gaussian power distribution, since we expect transistors to toggle state about 50% of the time, meaning that the measurements should be concentrated around the average of the minimum and maximum observed power. Likewise, a circuit with one very high supply voltage and one low supply voltage with approximately the same toggle activity should show a power distribution concentrated at two peaks.
Multiple device support: the current version of IRSIM only supports two device types: n-channel and p-channel. The .sim parser in IRSIM currently only recognizes ‘n’ and ‘p’ in the device syntax, but it must also support ‘x’ types for custom subcircuits consisting of transistors, resistors, and capacitors. A new .prm file format is to be adopted which requires each FET device to have a resistance context associated with it. The goal is to allow IRSIM to capture the separate properties of different device types.
Multiple power domain support: as stated in the background, effective power analysis should provide insight into power mitigation techniques, such as identifying areas of the circuit that can be put into low-power operation while maintaining performance. This means that instead of relying on a single voltage source for the entire circuit, there may be multiple voltage sources, so that not all transistors operate from the same source.

C. Project Goals

The first step was mainly research and familiarizing myself with the IRSIM repository. Once I had the codebase set up, the first task was fairly straightforward: ensure that the power analysis code was functional, since it had not been exercised in a while; write some Tcl scripts to extract the power after a specified simulation time and generate histogram data for visualization; and define a Tcl command line interface for this function. Using the first repository link referenced at the bottom of this page, we simulated a digital frequency locked loop circuit, a high frequency clock generator for the Caravel SoC design, using the SkyWater 130nm process. The configuration file for this process is shown below.

;
; configuration file for sky130 (0.13um process)
;
lambda  0.005 ; length scaling, microns (1 lambda = 1 centimicron)
capga   .00832 ; gate capacitance, pF/micron^2
capda 0.0012
capdp 0.0013
cappda 0.00260
cappdp 0.00090
lowthresh  0.5  ; logic low threshold as a normalized voltage
highthresh 0.5  ; logic high threshold as a normalized voltage
cntpullup 0     ; irrelevant, cmos technology; no depletion transistors
diffperim 0     ; don't include diffusion perimeters for sidewall cap.
subparea 0      ; poly over transistor won't count as part of bulk-poly cap.
diffext  0      ; diffusion extension for each transistor

device nfet sky130_fd_pr__nfet_01v8
device nfet sky130_fd_pr__nfet_01v8_lvt
device nfet sky130_fd_pr__nfet_g5v0d10v5
device pfet sky130_fd_pr__pfet_01v8
device pfet sky130_fd_pr__pfet_01v8_lvt
device pfet sky130_fd_pr__pfet_01v8_mvt
device pfet sky130_fd_pr__pfet_01v8_hvt
device pfet sky130_fd_pr__pfet_g5v0d10v5

; Capacitor values are in pF/centimicron^2
device capacitor sky130_fd_pr__cap_mim_m3_1 2.0E-7
device capacitor sky130_fd_pr__cap_mim_m3_2 2.0E-7

; Resistor values are in ohms/square
device resistor sky130_res_high_po_0p35   320
device resistor sky130_res_high_po_0p69   320
device resistor sky130_res_high_po_1p41   320
device resistor sky130_res_high_po_2p85   320
device resistor sky130_res_high_po_5p73   320
device resistor sky130_res_xhigh_po_0p35 2000
device resistor sky130_res_xhigh_po_0p69 2000
device resistor sky130_res_xhigh_po_1p41 2000
device resistor sky130_res_xhigh_po_2p85 2000
device resistor sky130_res_xhigh_po_5p73 2000
device resistor sky130_res_generic_nd     120
device resistor sky130_res_generic_pd     197
resistance n-channel dynamic-high   0.8 0.2 18174.0
resistance n-channel dynamic-low        0.8 0.2 3195.0
resistance n-channel static     0.8 0.2 3335.0
resistance p-channel dynamic-high   1.0 0.2 7482.0
resistance p-channel dynamic-low        1.0 0.2 204714.0
resistance p-channel static     1.0 0.2 4204.0

Then, the following Tcl commands were used to get the waveform image:

# IRSIM stimulus for digital_pll
#
# I/O for the digital_pll from the verilog source:
#
#   input        resetb;        // Sense negative reset
#   input        enable;        // Enable PLL
#   input        osc;           // Input oscillator to match
#   input [4:0]  div;           // PLL feedback division ratio
#   input        dco;           // Run in DCO mode
#   input [25:0] ext_trim;      // External trim for DCO mode
#
#   output [1:0] clockp;        // Two 90 degree clock phases

# Set power supplies
l VGND
h VPWR

# Define signal vectors
vector div div\[4:0\]
vector ext_trim ext_trim\[25:0\]
vector clockp clockp\[1:0\]

# Watch these signals
analyzer
ana osc div clockp resetb enable

# Initial values
setvector div 0d8
setvector ext_trim 0d9999
h dco
l enable
l resetb
l osc

# To be done:  Figure out why the zero resistor isn't seen on
# the constant high/low outputs.  For now, force the values
h ringosc.iss.const1/HI
l ringosc.iss.const1/LO

# Startup sequence
s 500
h resetb
s 500
h enable
s 500

# Define external clock
every 1000 {toggle osc}

Then, the following console output is a result of trying out the existing power analysis code (after starting up IRSIM).

Read digital_pll.sim lambda:0.01u format:MIT
1381 nodes, 3550 aliases; transistors: n-channel=1930 p-channel=1953 resistor=2 shorted=194
parallel txtors: n-channel=868 p-channel=920
time = 500.000ns
time = 1000.000ns
time = 1500.000ns; there are 4 pending events
Main console display active (Tcl8.6.3 / Tk8.6.3)
% powlogfile /dev/null
% powtrace osc resetb enable clockp\[0\] clockp\[1\]
% powstep
Power display enabled
% vsupply 1.8
Supply Voltage = 1.80 Volts
% sumcap
Sum of nodal capacitances: 2370.938965 pF 
% s 10
time = 1510.000ns; there are 4 pending events
Dynamic power estimate for powtrace'd nodes on last step = 0.005923 mW
% s 10000
time = 11510.000ns; there are 3 pending events
Dynamic power estimate for powtrace'd nodes on last step = 0.004062 mW

The power analysis code works as intended and the power measurement is reasonable for this small circuit. The next steps were to generate histogram data and allow for multiple power domains / device support.

D. Algorithm Implementation

Histogramming

The storage of the histogram was primarily derived from the doactivity() function, which reports circuit activity over a given time interval. This set of commands is intended to capture power data over the course of a simulation, once per specified time period, such as the clock period. Upon initializing the power histogram, a struct array is dynamically allocated with the specified number of bins, each consisting of a double holding the lower bound of the bin's power range and an integer holding the number of measurements that fall in that range. Histogram capturing is performed during circuit simulation as time advances, but dostep(), the function that advances the simulation, keeps the quantities needed for the power estimate in local variables. Hence, we copied these variables into the histogram function and used the same formula to estimate the power there. Then, while the power histogram is capturing, “standard” simulation using the command

s <time>

must not be performed in order to avoid duplication of the simulation.
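As a rough illustration of the storage described above, the following is a minimal sketch in C of a bin array with a lower bound and a count per bin. It is not the code from the IRSIM pull request: the names PowBin, PowHist, and the powhist_* functions are hypothetical, the print format is invented, and clamping out-of-range measurements into the first or last bin is an assumption made only for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* One histogram bin: lower edge of its power range and a hit count. */
typedef struct {
    double lower;   /* lower bound of the bin, in mW */
    int    count;   /* number of captured measurements in this bin */
} PowBin;

typedef struct {
    PowBin *bins;
    int     nbins;
    double  min, max;
} PowHist;

/* Allocate nbins equal-width bins covering [min, max). Returns 0 on success. */
int powhist_init(PowHist *h, double min, double max, int nbins)
{
    if (nbins <= 0 || max <= min) return -1;
    h->bins = malloc((size_t)nbins * sizeof(PowBin));
    if (h->bins == NULL) return -1;
    h->nbins = nbins;
    h->min = min;
    h->max = max;
    double width = (max - min) / nbins;
    for (int i = 0; i < nbins; i++) {
        h->bins[i].lower = min + i * width;
        h->bins[i].count = 0;
    }
    return 0;
}

/* Record one power measurement. Assumption: values outside [min, max)
   are clamped into the first or last bin instead of being dropped. */
void powhist_capture(PowHist *h, double power)
{
    assert(h->bins != NULL);   /* the histogram must be initialized first */
    double pos = (power - h->min) / (h->max - h->min) * h->nbins;
    int idx = pos < 0.0 ? 0 : (pos >= h->nbins ? h->nbins - 1 : (int)pos);
    h->bins[idx].count++;
}

/* Print one line per bin (hypothetical output format). */
void powhist_print(const PowHist *h)
{
    for (int i = 0; i < h->nbins; i++)
        printf("%g mW: %d\n", h->bins[i].lower, h->bins[i].count);
}

/* Free the bin array and mark the histogram as uninitialized. */
void powhist_reset(PowHist *h)
{
    free(h->bins);
    h->bins = NULL;
    h->nbins = 0;
}
```

In this sketch, a caller would run powhist_init once, call powhist_capture with each per-period power estimate, and finish with powhist_print and powhist_reset.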

Multiple Device Types

While reading the .prm file for device lookup and storage, each device should normally be associated with a resistance context; however, existing .prm files do not provide this. First, each new device encountered while the .prm file is read at startup is stored in an array. The transistor flags are then modified to accommodate the number of transistor types available, which is no longer a fixed value. Moreover, because each device must have a resistance value associated with it (stored in a linked list containing a linked list), a routine was created to copy resistances from the default resistance types, which are the n-channel and p-channel resistances, initialized with a value of 0. That way, the code will not perform an invalid memory access when looking up the associated transistor type in the array of device types.
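The default-copying idea can be sketched as follows. This is a simplified illustration, not the IRSIM implementation: it replaces the nested linked lists with a flat array of one value per resistance context, and the names Device, DeviceTable, device_add, device_set_res, table_init, and table_fill_defaults are all hypothetical. The warning text mirrors the messages IRSIM prints at startup.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { R_STATIC, R_DYNHIGH, R_DYNLOW, R_CONTEXTS };  /* resistance contexts */
enum { NCHAN = 0, PCHAN = 1 };                        /* slots of the two default types */

typedef struct {
    char  *name;
    int    base;              /* NCHAN or PCHAN: default type to inherit from */
    int    has_res;           /* nonzero once resistances are known */
    double res[R_CONTEXTS];   /* ohms, one value per context */
} Device;

typedef struct {
    Device *dev;
    int     count, cap;
} DeviceTable;

/* Append a device, growing the array as needed. Returns its index or -1. */
int device_add(DeviceTable *t, const char *name, int base)
{
    if (t->count == t->cap) {
        int ncap = t->cap ? 2 * t->cap : 8;
        Device *grown = realloc(t->dev, (size_t)ncap * sizeof(Device));
        if (grown == NULL) return -1;
        t->dev = grown;
        t->cap = ncap;
    }
    Device *d = &t->dev[t->count];
    d->name = malloc(strlen(name) + 1);
    if (d->name == NULL) return -1;
    strcpy(d->name, name);
    d->base = base;
    d->has_res = 0;
    for (int c = 0; c < R_CONTEXTS; c++) d->res[c] = 0.0;
    return t->count++;
}

/* Seed the table with the two default types, whose resistances start at 0. */
int table_init(DeviceTable *t)
{
    t->dev = NULL;
    t->count = 0;
    t->cap = 0;
    if (device_add(t, "n-channel", NCHAN) != NCHAN) return -1;
    if (device_add(t, "p-channel", PCHAN) != PCHAN) return -1;
    t->dev[NCHAN].has_res = t->dev[PCHAN].has_res = 1;
    return 0;
}

/* Set one resistance value for a device and mark it as having resistances. */
void device_set_res(DeviceTable *t, int idx, int context, double ohms)
{
    assert(idx >= 0 && idx < t->count);
    assert(context >= 0 && context < R_CONTEXTS);
    t->dev[idx].res[context] = ohms;
    t->dev[idx].has_res = 1;
}

/* Copy the default resistances into every device that has none.
   Returns the number of devices that were patched. */
int table_fill_defaults(DeviceTable *t)
{
    int patched = 0;
    for (int i = 0; i < t->count; i++) {
        Device *d = &t->dev[i];
        if (d->has_res) continue;
        fprintf(stderr, "Warning: missing required resistance for device %s\n", d->name);
        fprintf(stderr, "Setting default resistance value\n");
        memcpy(d->res, t->dev[d->base].res, sizeof d->res);
        d->has_res = 1;
        patched++;
    }
    return patched;
}
```

Because every device ends up with a populated resistance entry, a later lookup by device index never touches uninitialized memory, which is the property the paragraph above is after.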

Multiple Voltage Domains

Since we are analyzing circuits with multiple voltage domains, we must store a list of power net names identifying the voltage sources in the design. The power command was modified so that whenever it is called with a node argument, the node is added to the list of power sources. Then, each node (represented in the code as a struct pointer) was extended to hold a vsupply squared value, which is one of the terms used in the power estimation formula. The idea is that as simulation time advances, instead of accumulating step capacitance, the program accumulates total energy as the product of this vsupply squared value and the step capacitance for each node in the design. Then, when the dostep() function is called for simulation, the program calls a recursive walk_path() function, which, given a voltage supply and a starting node, follows the current path through transistor fanout. Each node reached is marked as visited (using bit flags), and the walk then moves on to the next node (accessed via a linked list). If the walk reaches a transistor through its source terminal, it continues from the drain, and vice versa. The visited flags are then cleared for the next simulation step.
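The walk and the energy accumulation can be sketched in C as below. This is an illustration under stated assumptions rather than the actual IRSIM code: the Node, Trans, and TList layouts, the flag values, and the helper names clear_visited, step_energy, and step_power_mw are hypothetical; the walk ignores whether a transistor is on or off; and the power expression is the common CMOS switching estimate P = C·V²/(2·t), which is assumed here and not quoted from IRSIM. With C in pF, V in volts, and t in ns, the result is in mW.

```c
#include <assert.h>
#include <stddef.h>

#define VISITED 0x1   /* node already reached during this walk */
#define SUPPLY  0x2   /* node is a declared power or ground net */

struct Trans;

typedef struct TList {          /* linked list of transistors on a node */
    struct Trans *t;
    struct TList *next;
} TList;

typedef struct Node {
    const char *name;
    double   cap;               /* node capacitance, pF */
    double   vsq;               /* supply voltage squared, V^2 */
    unsigned flags;
    TList   *terms;             /* transistors whose source or drain is this node */
} Node;

typedef struct Trans {
    Node *source, *drain;
} Trans;

/* Follow source/drain connections from n, tagging each reached node with vsq.
   Entering a transistor at its source continues at its drain and vice versa.
   The walk never continues through another supply or ground net. */
void walk_path(Node *n, double vsq)
{
    n->flags |= VISITED;
    n->vsq = vsq;
    for (TList *l = n->terms; l != NULL; l = l->next) {
        Node *other = (l->t->source == n) ? l->t->drain : l->t->source;
        if (!(other->flags & (VISITED | SUPPLY)))
            walk_path(other, vsq);
    }
}

/* Clear the visited bits so the next simulation step starts fresh. */
void clear_visited(Node *const nodes[], size_t n)
{
    for (size_t i = 0; i < n; i++)
        nodes[i]->flags &= ~(unsigned)VISITED;
}

/* Sum of C * V^2 over the nodes that toggled in this step (pF * V^2). */
double step_energy(Node *const toggled[], size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += toggled[i]->cap * toggled[i]->vsq;
    return sum;
}

/* Assumed estimate: P = C * V^2 / (2 * t). pF * V^2 / ns gives mW. */
double step_power_mw(double cv2, double step_ns)
{
    assert(step_ns > 0.0);
    return cv2 / (2.0 * step_ns);
}
```

For two inverters on separate 1 V and 2 V supplies, each with a 0.5 pF output node toggling once in a 10 ns step, the sketch gives 0.5·1 + 0.5·4 = 2.5 pF·V² and 2.5 / 20 = 0.125 mW.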

E. Implementation Results

These commands are implemented in C and exposed through a Tcl command line interface that the user uses to simulate the circuit. The following listing gives the general command definitions for histogramming, followed by an example:

# initialize the histogram with the minimum and maximum power range
# and optionally the number of buckets, 20 default
powhist init [min] [max] <buckets>

# if the histogram structure has been initialized, use the power
# estimation formula to record the power and store it in the
# histogram, otherwise, throw an error
powhist capture

# clear the histogram data and free the memory
powhist reset

# print captured histogram data stored
powhist print

# Example

# Commands to set up circuit (voltage, power supply, etc.)...
powlogfile /dev/null
powtrace *
powstep
powhist init 0 1

# capture data every 10 ns
every 10 {powhist capture}

# simulates time, capturing data from above command
s 10000
powhist print
powhist reset

Using this script format in the frequency locked loop example yields the following output:

which, when graphed in MATLAB, displays the following:

This graph is not perfectly normal; it may instead indicate that the circuit toggles mainly between two power modes, which would produce a bimodal distribution.

For multiple device support, the code operates on the scmos100.prm file, which contains no devices. With the Skywater 130nm process using the same frequency locked loop example, we observe the following output at start up:

Warning: missing required resistance for device sky130_fd_pr__diode_pw2nd_05v5
Setting default resistance value
Warning: missing required resistance for device sky130_fd_pr__nfet_01v8
Setting default resistance value
Warning: missing required resistance for device sky130_fd_pr__nfet_01v8_lvt
Setting default resistance value
Warning: missing required resistance for device sky130_fd_pr__nfet_g5v0d10v5
Setting default resistance value
Warning: missing required resistance for device sky130_fd_pr__pfet_01v8
Setting default resistance value
Warning: missing required resistance for device sky130_fd_pr__pfet_01v8_lvt
Setting default resistance value
Warning: missing required resistance for device sky130_fd_pr__pfet_01v8_mvt
Setting default resistance value
Warning: missing required resistance for device sky130_fd_pr__pfet_01v8_hvt
Setting default resistance value
Warning: missing required resistance for device sky130_fd_pr__pfet_g5v0d10v5
Setting default resistance value
Warning: missing required resistance for device sky130_fd_pr__cap_mim_m3_1
Setting default resistance value
Warning: missing required resistance for device sky130_fd_pr__cap_mim_m3_2
Setting default resistance value
Warning: missing required resistance for device sky130_res_high_po_0p35
Setting default resistance value
Warning: missing required resistance for device sky130_res_high_po_0p69
Setting default resistance value
Warning: missing required resistance for device sky130_res_high_po_1p41
Setting default resistance value
Warning: missing required resistance for device sky130_res_high_po_2p85
Setting default resistance value
Warning: missing required resistance for device sky130_res_high_po_5p73
Setting default resistance value
Warning: missing required resistance for device sky130_res_xhigh_po_0p35
Setting default resistance value
Warning: missing required resistance for device sky130_res_xhigh_po_0p69
Setting default resistance value
Warning: missing required resistance for device sky130_res_xhigh_po_1p41
Setting default resistance value
Warning: missing required resistance for device sky130_res_xhigh_po_2p85
Setting default resistance value
Warning: missing required resistance for device sky130_res_xhigh_po_5p73
Setting default resistance value
Warning: missing required resistance for device sky130_res_generic_nd
Setting default resistance value
Warning: missing required resistance for device sky130_res_generic_pd
Setting default resistance value
Using default name "Vdd" for power net.
Using default name "Gnd" for ground net.
(files/digital_pll.sim,1): WARNING: sim file lambda (5000) != config lambda (0.005)
(files/digital_pll.sim,1): WARNING: Using the config lambda (0.005)
files/digital_pll.sim: Ignoring lumped-resistance ('R' construct)

Read files/digital_pll.sim lambda:0.01u format:MIT
1381 nodes, 3550 aliases; transistors: n-channel=1930 p-channel=1953 resistor=2 shorted=194
parallel txtors: n-channel=868 p-channel=920
time = 500.000ns
time = 1000.000ns
time = 1500.000ns; there are 4 pending events
Main console display active (Tcl8.6.12 / Tk8.6.12)

The initial warnings indicate that the IRSIM parser recognizes that some devices do not have an associated resistance and sets a default resistance for them, which ensures backwards compatibility.

Finally, for multiple voltage domain support, the vsupply command was modified to take in an additional node name argument:

vsupply <node> <voltage>

indicating that the node is connected to the specified voltage value.

The following command line interface, along with an example script and output using the same digital frequency locked loop circuit, defines the new power estimation procedure for a circuit that is to be analyzed with multiple power domains:

# declare all of the power node names
power <net name>
...

# declare the ground node
ground <ground name>

# initialize logic high nodes and logic low nodes
h <node>
...
l <node>
...

# optionally, toggle signals
every <time> {toggle net}

# set voltage values
vsupply <power net> <voltage value>

# Example
power VPWR
ground VGND
h VPWR
l VGND
vsupply VPWR 1.8

The following output is observed for a 10,000 ns simulation:

Supply Voltage = 1.80 Volts at VPWR
Power display enabled
time = 11500.000ns; there are 5 pending events
Dynamic power estimate for powtrace'd nodes on last step = 0.847443 mW

Another example, which could not have been done using the current IRSIM version, is a transistor-level design of two inverters, each using its own power source. This uses the scmos100.prm file, which is found in the /lib/prm directory of the repository. The following netlist describes the circuit:

p in1 vdd1 out1 2 4
p in2 vdd2 out2 2 4
n in1 gnd out1 2 4
n in2 gnd out2 2 4

along with the script used to generate a simulation via histogram:

power vdd1
power vdd2
h vdd1
h vdd2
ground gnd
l gnd
l in1
l in2
every 10 {toggle in1; toggle in2}
vsupply vdd1 1
vsupply vdd2 2

powlogfile /dev/null
powtrace *
powstep
powhist init 0 0.1 10
every 10 {powhist capture}
s 300000
powhist print

The resulting output is then shown:

The implemented code was tested primarily on smaller circuits, mainly as a sanity check. It will be necessary to test it on larger digital designs (CPUs, microprocessors, datapaths, etc.) with many transistors operating from multiple power sources, and with different .prm files, to ensure the code works as intended.

In addition, SRAM netlists typically fail in IRSIM because of the way the delay calculation algorithms are implemented. IRSIM first splits the circuit into channel connected regions (CCRs), which are groups of nodes mutually reachable through source-drain connections. Any change in potential (logic level) on any node in a CCR causes all of the nodes in the CCR to be re-evaluated. The transition delay for a node whose logic value changes is evaluated using a depth-first search on the CCR graph to find the effective resistance and capacitance from the node to its respective high and low logic power supplies, and the delay is computed as an RC time constant. This works only if there are no resistance loops in the graph, which is not the case in an SRAM circuit: an SRAM cell uses a back-to-back inverter circuit that is written by driving logic values on both sides (a standard flip-flop circuit). Handling this is somewhat complex and not related to the power analysis code, so due to time limitations it was not implemented, but it is a good next step towards improving IRSIM’s circuit simulation capabilities. There was also an idea to implement static timing analysis by allowing the user to load a Liberty (.lib) file so that power can be computed in a different manner.
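To make the delay calculation described above more concrete, here is a small C sketch of a depth-first search that sums series resistance from a node to a supply and turns it into an RC delay. It is only an illustration of the idea, not IRSIM's algorithm: the Ccr adjacency-matrix type and the function names are hypothetical, and the search simply returns the first path it finds. On a loop-free region that path is the only one, but with resistance loops any parallel branches are ignored, which hints at why back-to-back inverters are a problem for this style of calculation.

```c
#include <assert.h>

#define MAXN 8

/* r[i][j] > 0 is the channel resistance (ohms) between nodes i and j;
   0 means there is no device between them. */
typedef struct {
    int    n;
    double r[MAXN][MAXN];
} Ccr;

/* Depth-first search for the series resistance from `from` to `supply`.
   `visited` must be zeroed by the caller. Returns -1.0 if the supply is
   unreachable. The result is only meaningful when the region has no loops. */
double path_resistance(const Ccr *g, int from, int supply, int visited[])
{
    if (from == supply) return 0.0;
    visited[from] = 1;
    for (int next = 0; next < g->n; next++) {
        if (visited[next] || g->r[from][next] <= 0.0) continue;
        double rest = path_resistance(g, next, supply, visited);
        if (rest >= 0.0) return g->r[from][next] + rest;
    }
    return -1.0;
}

/* RC time constant: ohms * pF gives picoseconds, so divide by 1000 for ns. */
double rc_delay_ns(double r_ohms, double c_pf)
{
    assert(r_ohms >= 0.0 && c_pf >= 0.0);
    return r_ohms * c_pf / 1000.0;
}
```

For example, two 3335 ohm channels in series driving 0.1 pF give 6670 ohms and a time constant of about 0.667 ns in this sketch.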

The links below contain information on the work done this summer.

Original IRSIM version: https://github.com/RTimothyEdwards/irsim/tree/irsim-9.7

Pull Request: https://github.com/RTimothyEdwards/irsim/pull/5

References:

Tagotho Rai Dastidar, Partha Ray, “A New Device Level Digital Simulator for Simulation and Functional Verification of Large Semiconductor Memories”, IEEE