Visualizing Neural Network Python — Story 1

John "Jake" Baumgarten
6 min read · Jun 13, 2024


What does Python neural network configuration code look like? Like this 👆🏽

Story 1 — Python Code Configuring Neural Networks

This is the first story in a series looking at the structure of Python code configuring neural networks. The target audience is cloud engineers trying to understand what their data science colleagues are up to.

1.1 Coding for Deep Learning and Graph Neural Networks

I’m working through two excellent books, one on deep learning and one on graph neural networks.

But after a career in C, C++, and Java, plus dabbling in Scala and Go, I’m having a hard time seeing the structure of the Python code in these authors’ Jupyter notebooks. I’ve been developing code re-organization techniques that make it possible to visualize the code as a network of graphs. I thought this visualization would help other cloud engineers making a first exploration of the neural network and deep learning world.

Loose Coupling and High Cohesion

Graphical analysis of code has been part of the software engineering profession almost from its inception. The graphical traits of “loose coupling” and “high cohesion” are software attributes championed as early as 1970 in the works of authors like Edward Yourdon and Larry Constantine. The following diagram is from a Wikipedia article:

Interestingly, even if we write our software like (a), it effectively operates like (b). While this is true of cloud software, this counterintuitive trait is even more pronounced in the Python code that configures the training, validation, and testing of deep learning and graph neural networks.

Places and Transitions

The traditional building blocks of coding are data, functions, and combinations of the two called objects. In moving to a graphical viewpoint, we think of instances of data-functions-objects as markings occupying a place. To move from place to place, markings need a transition. We connect places and transitions via arcs, the directed edges of a place-transition graph. Markings in one place can only proceed to another place via a transition; arcs never run place-to-place or transition-to-transition. In graph theory, such a graph is called bipartite.
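To make those terms concrete, here’s a minimal sketch in plain Python (my own illustration, not code from either book): places hold markings, and a transition fires by consuming markings from its input places and producing them in its output places.

```python
from dataclasses import dataclass, field

@dataclass
class Place:
    name: str
    markings: list = field(default_factory=list)  # data-functions-objects occupying this place

@dataclass
class Transition:
    name: str
    inputs: list   # arcs from places into this transition
    outputs: list  # arcs from this transition out to places

    def fire(self):
        # A transition may fire only when every input place holds a marking.
        if all(p.markings for p in self.inputs):
            moved = [p.markings.pop() for p in self.inputs]
            for p in self.outputs:
                p.markings.extend(moved)

# Arcs only ever connect a place to a transition, never place-to-place
# or transition-to-transition -- which is what makes the graph bipartite.
raw = Place("raw_data", markings=["batch_0"])
tensors = Place("tensors")
load = Transition("load_batch", inputs=[raw], outputs=[tensors])
load.fire()
print(tensors.markings)  # ['batch_0']
```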

Our Python configuration graphs of places and transitions are modeled after Petri nets (see Reisig [3A] in References below), which use ovals for places and rectangles for transitions. Here’s an example place-transition graph, which we’ll explore in a future story, that can train a fully connected (FC) or convolutional (CNN) neural network (it’s a detail from the network at the top of this story):

We view code execution as routing through a graph of these places and transitions. Note that any given place can be replaced by a subnet of places and transitions that begins with an input place and ends with an output place. Similarly, any transition can be replaced by a subnet that begins and ends with transitions. Thus place-transition graphs have the “fractal” characteristic of looking the same no matter how deep we drill into the software.
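Continuing the sketch above, here’s one hedged illustration of that refinement rule. We swap the tensors place for a tensors_in → normalize → tensors_out subnet, and the surrounding graph still sees a place-shaped boundary:

```python
# Refine the `tensors` place into a subnet: tensors_in -> normalize -> tensors_out.
tensors_in = Place("tensors_in", markings=tensors.markings)
tensors_out = Place("tensors_out")
normalize = Transition("normalize", inputs=[tensors_in], outputs=[tensors_out])

# Re-route the arc that used to end at `tensors` so it ends at `tensors_in`;
# downstream arcs would now leave from `tensors_out`. At every level of this
# drill-down the boundary is still "place in, place out."
load.outputs = [tensors_in]
normalize.fire()
print(tensors_out.markings)  # ['batch_0']
```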

Since our code is execution-oriented, structured as data-functions-objects using threads, stacks, and the heap, it can often be hard to recognize places and transitions. A function does not always have a one-to-one relationship with a transition: there might be multiple function calls in a single transition, or a single function might include multiple transitions. A place can be an implicit tuple of data-functions-objects that has no explicit representation in our code. Adding in the movement of data from CPU to GPU and back introduces even more variation in the relationship of functions to transitions.
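As an example of that mismatch, consider a hypothetical PyTorch-style training step (the function and its arguments are my own sketch, not code from the books). This single function contains at least three transitions:

```python
import torch

def train_step(model, batch, labels, loss_fn, device):
    # Transition 1: markings move from a CPU place to a GPU place.
    batch, labels = batch.to(device), labels.to(device)
    # Transition 2: the forward pass moves markings from an "inputs"
    # place to a "predictions" place.
    preds = model(batch)
    # Transition 3: compute the loss, then move the scalar back to a
    # CPU place for logging -- .item() performs the GPU-to-CPU copy.
    loss = loss_fn(preds, labels)
    return loss.item()
```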

Capsules and Idioms

When we draw place-transition diagrams, we put most of the places and transitions inside “capsules,” which correspond to Python’s concept of a “class.” We can break capsules up into sub-capsules when non-contiguous boundaries add clarity. In our top-level network at the opening of this story, the Coordinator and LTV ModelBuilder each have two sub-capsules to emphasize transverse and parallel flow, respectively.

Drawing capsule boundaries is more art, experience, and intuition than algorithm. A well-drawn place-transition graph conveys the dynamics of graph “memory and anticipation,” concepts developed with rigor in the mathematical discipline of symbolic dynamics (see Lind & Marcus [3B] in References below).

In the resulting representations of capsulized software networks, we begin to see “coding idioms”.

1.2 Place-Transition Networks

The techniques presented in this series of stories allow “refactoring” Jupyter Notebook code into traditional Python packages with classes, and then drawing place-transition graphs and networks to discover the underlying idioms and patterns in the code. We will demonstrate this for fully connected (FC), convolutional (CNN), and (much later in the series) graph neural networks (GNN).

Here is the network diagram from the top of this story again, for quick reference in the discussion directly below:

The diagram above represents 117 lines of Python code. While I’ve drawn dozens and dozens of these place-transition diagrams over the past two years, I’m always amazed at how much structure even two pages of code can have.

Many of the capsule labels above are prefixed with the abbreviation “LTV,” which stands for loader, trainer, visualizer — a triumvirate that recurs often in machine learning. Here you can see idioms labeled with their roles (a skeleton sketch follows the list):

  • Coordinator — orchestrates the overall flow
  • ModelBuilder — builds either an FC or a CNN model
  • TrainingConfig — sets all the “knobs and switches” for training and actually performs it
  • Loader — streams in data for GPU processing
  • Visualizer — plots how well various epochs are learning
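
To make these roles concrete, here’s a hedged skeleton of how the capsules might relate in code. The class and method names are my guesses from the diagram, not the actual 117 lines:

```python
class Loader:
    """Streams labeled data in batches sized for GPU processing."""
    def batches(self):
        ...

class ModelBuilder:
    """Builds either a fully connected or a convolutional model."""
    def build(self, kind="fc"):
        ...

class TrainingConfig:
    """Holds the training knobs and switches, and runs the epochs."""
    def __init__(self, epochs=10, lr=1e-3):
        self.epochs, self.lr = epochs, lr

    def train(self, model, loader):
        ...

class Visualizer:
    """Plots how well successive epochs are learning."""
    def plot(self, history):
        ...

class Coordinator:
    """Orchestrates the overall loader -> trainer -> visualizer flow."""
    def run(self):
        loader = Loader()
        model = ModelBuilder().build(kind="cnn")
        history = TrainingConfig().train(model, loader)
        Visualizer().plot(history)
```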

Later in this series of stories, we’ll go over this network in detail and present additional sub-networks for the “cloud” places on the right side, which provide lower-level sourcing of labeled data, PyTorch-configured models, and the epoch tracker for executing the training.

1.3 Story Summary

Place-transition networks can be particularly helpful in understanding the structure of Python software configuring the training of neural networks, software that can be opaque due to the nature of operations like the following (sketched in code after the list):

  • Large scale data processing
  • Many epochs
  • Phasing from training to validating to testing
  • Moving from CPU to GPU and back
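
Here’s a hedged sketch of the loop shape those operations produce, in a PyTorch style with assumed names (train_dl, val_dl, and the rest are placeholders, not this series’ actual code):

```python
import torch

def run(model, train_dl, val_dl, loss_fn, optimizer, device, epochs):
    model.to(device)                                  # move parameters CPU -> GPU once
    for epoch in range(epochs):                       # many epochs
        model.train()                                 # phase: training
        for xb, yb in train_dl:                       # large-scale data, one batch at a time
            xb, yb = xb.to(device), yb.to(device)     # CPU -> GPU
            loss = loss_fn(model(xb), yb)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        model.eval()                                  # phase: validating
        with torch.no_grad():
            val_loss = sum(loss_fn(model(xb.to(device)), yb.to(device)).item()
                           for xb, yb in val_dl)      # .item() moves GPU -> CPU
        print(f"epoch {epoch}: val_loss={val_loss:.4f}")
```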

This style of programming performs state mutation at very large scale, which is difficult to represent with traditional object-oriented static typing or classic functional programming techniques. Place-transition networks can add considerable clarity and insight into understanding Python software configuring neural networks.

This first story was a quick visual introduction to place-transition networks. The foundation of this collection of stories is thinking of Python neural network configuration code as a network of graphs whose place nodes hold data-functions-objects markings and are connected via transition nodes.

My next post will dive deeper into the relationship between this Python code and place-transition networks, and we’ll actually look at some (admittedly trivial) code.

Additional References

Unfortunately, I haven’t found great online resources for a brief introduction to Petri nets or symbolic dynamics that would appeal to impatient programmers (like me). But the introductory chapters in these books are great. We only use these theories for inspiration anyway, so don’t worry about diving deeper. I just wanted readers to know that I did two years of research (on these and many other topics related to the theory and practice of programming) before writing this series.

[3A] Wolfgang Reisig, Understanding Petri Nets: Modeling Techniques, Analysis Methods, Case Studies. Springer.
[3B] Douglas Lind and Brian Marcus, An Introduction to Symbolic Dynamics and Coding. Cambridge University Press.


John "Jake" Baumgarten

45 years of software development, the last 18 at Apple. Currently researching the relationship between software, dynamical systems, and neural networks.