Visualizing Neural Network Python — Story 2

John "Jake" Baumgarten
10 min read · Jun 13, 2024


Place-transition diagram for Python configuring neural network training

Story 2 — Software Graphical Idioms for Deep Learning

Here we continue the exploration of place-transition programming introduced in story [S1]. We’ll cover place-transition programming basics, then dive into the details of the Coordinator sub-capsules in the middle left of the diagram above.

2.1 Networks of Graphs

In object-oriented programming (OOP) we think of objects as data instances of classes, with the class providing the schema. We bind all instances of a class to a set of class-wide methods, which are functions with these data instances as implicit arguments. Java and Python folks would call these “instance methods”. Go folks define a method as “a function with a special receiver argument”.
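
For readers coming from Go or Java, here’s a minimal Python sketch of an instance method and its implicit receiver (the Account class is purely illustrative, not part of this story’s code):

class Account:

    def __init__(self, balance: float):
        self.balance = balance  # the class acts as the schema for this data instance

    def deposit(self, amount: float) -> float:
        self.balance += amount  # self is the implicit receiver argument
        return self.balance

acct = Account(100.0)
acct.deposit(25.0)  # equivalent to Account.deposit(acct, 25.0)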

In functional programming (FP) we focus on immutable data, function composition, internal iteration, pattern matching, and language comprehensions.
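
As a quick reminder of those traits in plain Python (the data and functions below are illustrative only):

from functools import reduce

points = ((0, 0), (3, 4), (5, 12))  # immutable data (a tuple of tuples)

def describe(point: tuple) -> str:  # pattern matching via match-case
    match point:
        case (0, 0):
            return 'origin'
        case (x, y):
            return f'({x}, {y})'

labels = [describe(p) for p in points]  # comprehension with internal iteration
total = reduce(lambda acc, p: acc + p[0] + p[1], points, 0)  # folding with a small function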

For Python software configuring a neural network, we have proposed a different programming mindset and focus: data, functions, and objects marking places connected via transitions. Within directed (mostly acyclic) graphs of places and transitions we identify idioms. These idioms play roles in patterns. We can consider this progression to be a select blending of OOP and FP characteristics particularly tailored to configuring a neural network.

Bottom-up versus Top-down — Forking and Joining

For a data scientist, a Jupyter notebook allows a bottom-up approach to constructing a “sequence” of cells that ultimately combine to train, verify, and test a neural network. The sequence is somewhat misleading, as it actually represents an upward traversal of a tree where subsequent cells join results from previous ones.
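
As a toy illustration (these cells are hypothetical, not the notebook behind this story), the “sequence” below is really a tree whose later cells join earlier results:

# Cell 1: load some data (a leaf of the tree)
data = list(range(10))

# Cell 2: define a trivial 'model' (another leaf)
def model(x: int) -> int:
    return 2 * x

# Cell 3: joins the results of cells 1 and 2
predictions = [model(x) for x in data]

# Cell 4: joins cell 3 into a single summary
mean_prediction = sum(predictions) / len(predictions)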

But cloud engineers are used to a top-down approach to data processing where the “entry point” forks and then, via stack mechanisms (or language facilities like “futures” or “channels” if multi-threaded), the resulting software network joins to one or more “exit points”. The exit points usually output to messaging queues or generate a synchronous response. Persistence or calling collaborating services are often at the “inflection points” where forking turns to joining.
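
Here’s a minimal sketch of that top-down shape using concurrent.futures; the two fetch functions are hypothetical stand-ins for calls to collaborating services:

from concurrent.futures import ThreadPoolExecutor

def fetch_profile(user_id: int) -> dict:
    return {'user_id': user_id, 'name': 'example'}  # stand-in for a service call

def fetch_orders(user_id: int) -> list:
    return [{'order_id': 1}, {'order_id': 2}]  # stand-in for a service call

def handle_request(user_id: int) -> dict:  # the entry point
    with ThreadPoolExecutor() as pool:
        profile_future = pool.submit(fetch_profile, user_id)  # fork
        orders_future = pool.submit(fetch_orders, user_id)    # fork
        return {'profile': profile_future.result(),           # join at the exit point
                'orders': orders_future.result()}

response = handle_request(42)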

So data scientists and cloud engineers are used to different topological approaches to software construction. Also, Jupyter notebooks allow data scientists to annotate and “qualify” their software by embedding notes, diagrams, and charts. Conversely, cloud engineers “double helix” their production software to a large parallel “shadow network” of unit, module, integration, and end-to-end tests.

Much of this divergence is due to the different missions of cloud software and neural network training. Cloud software has to support and survive in a production deployment. Data scientists are performing experiments; the results of their work may be deployed into production (perhaps by cloud engineers!).

I’m not advocating that data scientists adopt cloud engineering’s approach to software. Each approach has been “crowd sourced” to best fit its technical mission. But for cloud engineers with years (or even decades!) of experience with an alternative topological approach, the use of notebooks can be jarring and disorienting, especially if those cloud engineers didn’t spend extensive time in a REPL-oriented (read-eval-print loop) language like Python.

Functional Capsules

There’s a certain “base” idiom for place-transition coding that I should describe before we look at the code. We often use capsules in an analogous way to functions with closure (constructors provide the “closure”), and these capsules have only a single “non-private” method. (“Private” here being convention, not necessarily language enforcement.) For example a capsule might be named “Move”. Then the non-private method would just be named “perform”. We call capsules that conform to this base idiom “functional capsules”.
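
Here’s a minimal sketch of a functional capsule using the Move name from above (the _shift helper is hypothetical and “private” only by convention):

class Move:

    def __init__(self, dx: float, dy: float):
        self._dx = dx  # the constructor provides the 'closure'
        self._dy = dy

    def perform(self, point: tuple) -> tuple:  # the single non-private method
        return self._shift(point)

    def _shift(self, point: tuple) -> tuple:  # private by convention
        x, y = point
        return (x + self._dx, y + self._dy)

moved = Move(1.0, -2.0).perform((3.0, 4.0))  # (4.0, 2.0)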

Let’s revisit this top-level place-transition diagram we are working through:

Here we see that the primary entry points to our capsules are either constructors — black new boxes which correspond to Python __init__() methods, or the (usually) single perform() method. This particular diagram has two deviations from the new-perform “rule”:

  • The LTV ModelBuilder lower sub-capsule, in the middle-right of the diagram, has two perform transitions, since it embodies the elemental design pattern characteristic of redirection, rather than the more common one of delegation. Smith [2A] (see references below) covers this distinction with clarity and detail.
  • The Coordinator lower sub-capsule, in the middle left of the diagram, shows input from outside the sub-capsule to the private visualize transition. This is a result of the stack “unwinding” after the select_and_train transition in the upper Coordinator sub-capsule. So this is really not an entry point into the Coordinator. Peeking ahead at the code in Coordinator’s perform() method, we can see this directly below.
    def perform(self, modification: str) -> None:
        model_obj, dataframe = self.select_and_train()
        self.visualize(modification, model_obj, dataframe)

2.2 Place-Transition Code

Let’s look at the entire code for the Coordinator sub-capsules from the diagram above:

from matplotlib import pyplot as plt
from pandas import DataFrame
from torch.nn import Sequential

from deep.six.LTV import ModelBuilder, Visualizer


class Coordinator:

    FC = 'FC'
    CNN = 'CNN'

    def __init__(self, model_desc: str):

        self.model_desc = model_desc
        self.model_builder = ModelBuilder()

    def to_label(self, modification: str) -> str:
        label = self.model_desc
        if modification is not None:
            label += ' - ' + modification
        return label

    def visualize(self, modification: str,
                  model_obj: Sequential, dataframe: DataFrame) -> None:
        del model_obj
        Visualizer(self.to_label(modification), dataframe).perform()

    def select_and_train(self) -> (Sequential, DataFrame):
        match self.model_desc:
            case Coordinator.FC:
                return self.model_builder.perform_fc()
            case Coordinator.CNN:
                return self.model_builder.perform_cnn()

    def perform(self, modification: str) -> None:
        model_obj, dataframe = self.select_and_train()
        self.visualize(modification, model_obj, dataframe)


def run(model_desc: str, modification: str) -> None:

    coordinator = Coordinator(model_desc)
    coordinator.perform(modification)
    plt.show()


run(Coordinator.CNN, "Residual")

This code is refactored from Raff’s Jupyter notebook: I’ve extracted the code and reworked it to support representation in a place-transition graph. There’s nothing magical or brilliant in my reworking, other than it allows us to draw graphs such as this:

Let’s walk through the code and the graphs in “lock step” to show how we can move from code to the graphs. Once we learn the basics of place-transition coding and graphing, we can move from graph modifications to code modifications. In writing this story I often moved back and forth as coding clarity and graphical clarity inspired each other.

Functions/Methods and Transitions

While methods (and functions) have a strong relationship with transitions, the transitions we label with method names usually indicate the declaration of a method or, more exactly, the “marking” of its required input places. The “return” of the method is “downstream” of this declaration marking, where the resulting output places are located.

Our code is focused on a stack-heap-based execution machine, but place-transition diagrams represent the code as a predecessor-successor network of marked places. The effect of stack threads is represented by joins either within the capsule graphs or in the network they are embedded in.

After years of training our brains to think like stack machines, place-transition oriented coding can take a while to “grok”. This is similar to the path many of us practitioners took when functional techniques started “invading” our object-oriented languages.

The “new” Transition

Capsules (which are represented as classes in a language such as Python) can “encapsulate” functions (such as @staticmethod or close “relatives” such as @classmethod), but usually encapsulate object instance methods, which have a self first argument. The new transition, however, is special. We draw it straddling the capsule border, and it performs these tasks:

  • As within any transition, it defines which input place markings are required for the transition to “fire”.
  • The firing of the transition creates an instance of the corresponding class.
  • The input place markings (arguments of __init__()) are “injected” into the capsule. These input places can be retained as capsule places via direct or indirect assignment, such as self.x = x or self.y = f(x).
  • These capsule-injected places are thus available as implicit arguments for any instance method.
  • Outside the capsule, new returns a place with an initialized instance of the corresponding class.

In the top sub-capsule graph above for the Coordinator class, this standard new script leads to the injection of the string model_desc and, after calling another new, an instance of ModelBuilder. Here’s the corresponding __init__() method code:

    def __init__(self, model_desc: str):

        self.model_desc = model_desc
        self.model_builder = ModelBuilder()

The “perform” Transition

Other than new, a capsule often has only one entry transition, which is idiomatically called perform. In order to call any method-associated transition from outside a capsule we need an instance of the owning class, thus we see the place marked Coordinator (returned from the new transition above) input to Coordinator’s perform. Also input to Coordinator’s perform is a place marked with a modification string.

(The modification string actually inputs to a “repeater pole” extending from the perform transition. But ignore that for a moment.)

Coordinator’s perform transition has two output places:

  • The first output place is an empty unmarked place that inputs to the private transition select_and_train. The output place is empty, since select_and_train can use the self pointer to access sub-capsule injections, here the model_desc string and the ModelBuilder.
  • The second output place is a repetition of the input place marked modification via the repeater pole. The repeated modification string exits this sub-capsule and inputs to the second sub-capsule of Coordinator which we’ll study below.

We use a repeater pole for a transition when we want to emphasize that some inputs take a very different path when output from the transition.

Here’s the corresponding perform() method code:

    def perform(self, modification: str) -> None:
        model_obj, dataframe = self.select_and_train()
        self.visualize(modification, model_obj, dataframe)

Code for methods is input (arguments) and output (return values) focused. But our place-transition diagrams only represent methods as declaration transitions showing required input places. The return values are “downstream”. This lets us see the structure of our capsulized network without having to mentally construct it as we do when reading code. Again, this takes some getting used to. But now I think in terms of place-transitions, and have to mentally map back to code.

Note that in code, perform() presents a very “tight” binding of the calls to select_and_train() and visualize(). Whereas in our place-transition diagram, the select_and_train and visualize transitions are not adjacent (there are several intervening nodes and arcs), and are even in different Coordinator sub-capsules! Place-transition diagrams show us flows that are difficult to tease out without a deep reading of the code while holding stack processing in our mental models.

What connects the two lines of code in perform() is implicit use of the stack. Place-transition networks are explicit. Thus in the diagram, there is no direct connection between select_and_train() and visualize(). After traversing most of the rest of this top-level network, the tuple (torch.nn.Sequential, pandas.DataFrame), symbolized in the code as (model_obj, dataframe), is shown as an input place to the visualize transition in Coordinator’s lower sub-capsule.
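
As a sketch only (the places dictionary is my own notation, not part of the refactored code), here is what perform() looks like if the stack hand-off is made into an explicitly named place:

places: dict = {}

def perform_explicitly(coordinator: Coordinator, modification: str) -> None:
    # firing select_and_train marks an explicit output place ...
    places['(Sequential, DataFrame)'] = coordinator.select_and_train()

    # ... which is later consumed, along with the repeated modification
    # string, as input places to the visualize transition
    model_obj, dataframe = places.pop('(Sequential, DataFrame)')
    coordinator.visualize(modification, model_obj, dataframe)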

Private Method Transitions

Let’s look at each Coordinator sub-capsule one at a time for clarity. Here’s the upper sub-capsule:

Our Coordinator has one private method declaration transition in this upper sub-capsule:

  • select_and_train — inputs an empty place for sequencing, a sub-capsule injected model_desc string, and a sub-capsule injected ModelBuilder. It outputs a tuple of its two non-empty inputs. We draw this output tuple with dashed lines to show it’s implicit in the code, provided by the language’s stack and/or heap mechanisms.

Here’s the code for select_and_train:

    def select_and_train(self) -> (Sequential, DataFrame):
        match self.model_desc:
            case Coordinator.FC:
                return self.model_builder.perform_fc()
            case Coordinator.CNN:
                return self.model_builder.perform_cnn()

Here’s Coordinator’s bottom sub-capsule:

Here are the two private method declaration transitions in this bottom sub-capsule:

  • visualize — is called after all the training is done. In addition to the modification string from perform, visualize inputs a tuple of two “large” objects — a torch.nn.Sequential and a pandas.DataFrame. visualize outputs a modification string place to the private method transition to_label, splits the torch.nn.Sequential out then calls del on it (to release memory), and splits the pandas.DataFrame out to input to a downstream new for the Visualizer capsule/class.
  • to_label — inputs the modification string and the injected model_desc, and outputs a label string.

Here’s the code for the corresponding methods:

    def to_label(self, modification: str) -> str:
        label = self.model_desc
        if modification is not None:
            label += ' - ' + modification
        return label

    def visualize(self, modification: str,
                  model_obj: Sequential, dataframe: DataFrame) -> None:
        del model_obj
        Visualizer(self.to_label(modification), dataframe).perform()

Non-Method Transition

Let’s focus again on Coordinator’s upper sub-capsule:

Our Coordinator has one non-method transition which occurs inside the select_and_train() method. This transition is indicated with a “diamond” symbol ⬦:

  • The diamond symbol indicates a redirection transition, not a delegation. Redirection transitions behave like most transitions on the input side, requiring all input places before firing, but only one of a redirection’s output arcs is followed. Redirection transitions indicate the use of language flow-control mechanisms such as if-else and match-case. Here a match-case determines whether a fully connected (FC) or a convolutional (CNN) neural network is being trained. A small sketch contrasting delegation and redirection follows the repeated code below.

We showed the code for the match-case above, but repeat it here for clarity:

    def select_and_train(self) -> (Sequential, DataFrame):
        match self.model_desc:
            case Coordinator.FC:
                return self.model_builder.perform_fc()
            case Coordinator.CNN:
                return self.model_builder.perform_cnn()

2.3 Story Summary

As practicing programmers we’ve been indoctrinated to think like an execution machine that is stack and heap oriented. With our code now configuring neural networks, it’s time to “think different”. Such code is best understood as a network of place-transition graphs. We’ve had a sample of what place-transition programming looks like with the Coordinator’s sub-capsules. In the next story, we look at the network “surrounding” our capsule graphs in this top-level place-transition network.

Additional Resources


John "Jake" Baumgarten

45 years of software development, the last 18 at Apple. Currently researching the relationship between software, dynamical systems, and neural networks.