LampSort Revisited, Visualised

Combining Object Logging & Agile Visualisation


To understand or study an algorithm, a piece of code or an application you start by reading the code. Many developers add logging to code as a way to look at what is happening. In a live environment like Pharo you can also use the debugger to step through code.

Traditional logging is either too coarse and misses information, or too detailed and floods the reader with information. In both cases the output consists of dead strings that you cannot look into or that you cannot ask any more questions.

An excerpt from system.log — Note the mix of formats

The debugger is very detailed, but only for what is on the stack at the position where you are halted. It can only move forward in time in small steps. Often you step too far or you seem to be clicking like crazy.

The Pharo debugger

This article suggests another approach. By combining object logging, agile visualisation and advanced tools, you can get amazing insight into your code, easily.

LampSort

As an example, we’ll be using LampSort, a non-recursive QuickSort implementation. Please follow the link to read up about this cool variant.

In what follows, we are interested in visualising a sorting algorithm and simplify code to that end. Try to image how you can go from here to a more complex, real world application.

Object Logging

Unlike traditional logging where you generate strings, object logging is a technique where we reify what we want to log into a real object. Think of this object as representing the event that happened.

It is up to you to decide what to log and which information to include. There are no rules. Don’t be cheap, it is hard to predict what will be needed later on. What is not included is lost forever.

In real applications you will probably already have most of the data that you want to include at hand, so the cost is acceptable. Just be careful with structure sharing and destructive modifications.

Once the log object is created, you should send it off to whoever is interested, if anyone. It turns out we already have a pretty good mechanism for doing that in Pharo, namely Announcements. We might as well use that to emit log event objects.

LampSortLogEvent

We begin by setting up a class hierarchy of log objects. Here is our root class, LampSortLogEvent.

Announcement subclass: #LampSortLogEvent
instanceVariableNames: ‘data’
classVariableNames: ‘LampSortLogEventAnnouncer’
category: ‘LampSortVisualized-LogEvents’

Each log event will hold a copy of the data array that we are sorting, at the moment the log event was emitted.

data: sequenceableCollection
data := sequenceableCollection copy

On our class side we’ll maintain a unique instance of an Announcer, the one we’ll be using to notify interested consumers of log events that are emitted.

announcer
^ self class announcer
emit
self announcer announce: self

And on the class side we have a lazy initialisation for the Announcer.

announcer
^ LampSortLogEventAnnouncer
ifNil: [
LampSortLogEventAnnouncer := Announcer new ]

Here is one of the concrete log event subclasses, LampSortPivotSelected. It feels natural to name these using past tense. This particular event is emitted when the algorithm selects a pivot from a sub interval of the data, which is always the first element of the interval is this particular version of the algorithm.

LampSortLogEvent subclass: #LampSortPivotSelected
instanceVariableNames: ‘interval’
classVariableNames: ‘’
category: ‘LampSortVisualized-LogEvents’

It is important to give each log event a good printOn: implementation. This main printed representation will be used in several context later on. Make it clear, to the point and not too long.

pivotIndex
^ interval first
pivotValue
^ data at: self pivotIndex
printOn: stream
stream
<< ‘Selected pivot ‘; print: self pivotValue;
<< ‘ @ ‘; print: self pivotIndex;
<< ‘ from ‘; print: interval

Along the same lines we define a couple of others, one for each of the important steps in the LampSort algorithm:

  • LampSortPartitioned (left, right)
  • LampSortSwapped (oneIndex, anotherIndex)
  • LampSortIntervalSelected (interval, intervals)
  • LampSortIntervalRemoved (interval, intervals)
  • LampSortIntervalsAdded (newIntervals, intervals)

The last 3 will inherit from LampSortIntervalsEvent because they all hold the intevals set. It is a bit of work, but with the code generation tools you can automate most of the coding.

Here is how the log event objects are used.

LampSortPivotSelected new
data: data;
interval: interval;
emit

Just make a new instance, set the necessary properties and send it emit.

LampSortInstrumented

We are now ready to use the log event objects that we defined. The original LampSort implementation consists of just 2 elegant methods, sort and partition: — read the original article explaining LampSort for an detailed explanation.

To maintain that easy to read version, we are going to create a new one, LampSortInstrumented, where we introduce helper methods for each important step in the algorithm.

sort
| intervals one |
intervals := Set with: (1 to: data size).
[ intervals isEmpty ]
whileFalse: [
one := self selectIntervalFrom: intervals.
one size > 1
ifTrue: [
self
joinIntervals: intervals
with: (self partition: one) ].
self removeInterval: one from: intervals ]
partition: interval
| pivot index |
pivot := self selectFirstElementAsPivot: interval.
self swap: interval first with: interval last.
index := interval first.
interval first to: interval last — 1 do: [ :each |
(data at: each) < pivot
ifTrue: [
self swap: each with: index.
index := index + 1 ] ].
self swap: interval last with: index.
^ self splitInterval: interval around: index

In each helper methods, we’ll do 2 things:

  1. create and emit a log event
  2. do whatever has to be done in that step

All in all, there are 6 helper methods.

selectIntervalFrom: intervals
| selection |
selection := intervals anyOne.
LampSortIntervalSelected new
data: data;
intervals: intervals;
interval: selection;
emit.
^ selection
removeInterval: interval from: intervals
LampSortIntervalRemoved new
data: data;
intervals: intervals;
interval: interval;
emit.
intervals remove: interval
selectFirstElementAsPivot: interval
LampSortPivotSelected new
data: data;
interval: interval;
emit.
^ data at: interval first
swap: oneIndex with: anotherIndex
LampSortSwapped new
data: data;
oneIndex: oneIndex;
anotherIndex: anotherIndex;
emit.
data swap: oneIndex with: anotherIndex
splitInterval: interval around: pivotIndex
| left right |
left := interval first to: pivotIndex — 1.
right := pivotIndex + 1 to: interval last.
LampSortPartitioned new
data: data;
left: left;
right: right;
emit.
^ Array with: left with: right

Note how in the last method, the left and right intervals were already at hand, created as part of the algorithm. We just use them in the log object at no cost.

This concludes the instrumentation of our algorithm. In your own applications, you can do what you want to make the generation of log events as convenient as possible.

Textual Logging

One aspect of object logging is that it is compatible with traditional, textual logging. How ? By using the print representations. Here is how to log to the classic Pharo Transcript.

LampSortLogEvent announcer 
subscribe: LampSortLogEvent
do: [ :event | Transcript crShow: event ].
LampSortInstrumented new sort: #(7 12 3 20 5 8 2) copy.

We subscribe to all Announcements that are subclasses of LampSortLogEvent and show their printed representation on the Transcript. This is the result.

We can use to same technique to append to a log file or the system log. In this simplified demo, our log event objects do not contain their own timestamp — this and any other relevant information is easy to add.

Inspecting Log Events

The standard GT-Inspector tool (part of The Glamorous Toolkit) makes it easy to look at Announcements generated by any Announcer, thus we can use it to look at our log event objects. Here is how to get started.

LampSortLogEvent announcer inspect.
LampSortInstrumented new sort: #(7 12 3 20 5 8 2) copy.

Inspect the LampSortLogEvent announcer and run the algorithm on an example array. The array gets copied because the algorithm sorts destructively in-place.

We now have a tool to browse what happened.

The left pane shows each announcement that came in, most recent on top, oldest at the bottom. The time stamp is added automatically. Note how our print representation is used.

When you select an indivual log event, its details are shown in the right pane. You can see the data evolve.

By moving up and down in the list, you go effectively forward respectively backward in time. You can study in detail how the algorithm works.

Now imagine being able to do this for your complex enterprise application.

Using an Object Log

Hopefully, you can already see how log event objects might be useful in more complex situations. Here are two examples that show what a difference an object log makes.

How many swaps were executed when sorting a specific array?

(OrderedCollection streamContents: [ :out |
| subscription |
subscription := LampSortLogEvent announcer
subscribe: LampSortSwapped
do: [ :event | out nextPut: event ].
LampSortInstrumented new sort: #(7 12 3 20 5 8 2) copy.
LampSortLogEvent announcer unsubscribe: subscription ])
size

Answer: 12

By subscribing to just one type of log event we effectively filter, which gives us the number of swaps. The log event class hierarchy is one dimension for filtering an object log.

How many swaps had no effect?

(OrderedCollection streamContents: [ :out |
| subscription |
subscription := LampSortLogEvent announcer
subscribe: LampSortSwapped
do: [ :event | out nextPut: event ].
LampSortInstrumented new sort: #(7 12 3 20 5 8 2) copy.
LampSortLogEvent announcer unsubscribe: subscription ])
count: [ :event | event oneIndex = event anotherIndex ]

Answer: 1

When the source and target index of a swap are the same, the swap has no effect. Note how we can get actual useful information directly from the log event object.

An object log can be saved to a file or database or transported over the network as such using a suitable serialisation framework, like FUEL or STON. You can convert your event log objects to JSON and integrate with other tools.

Visualising LampSort

Can we do even more? Yes we can!

To explain sorting algorithms we can visualize the sorting of an array of numbers using a bar chart. Modifications to the data can be expressed as the exchange of bars.

The Agile Visualization project is an extensive, rapidly evolving eco system of libraries and frameworks on top of Pharo and Moose that allows for all kinds of visualisation, from very simple to very complex ones.

Of course, there is a learning curve. But there is documentation and lots of examples. I got help as well — thanks Pierre! Once you get started, it becomes a lot of fun.

You typically start experimenting with a script in a playground, where you can immediately see the result. Here is a screenshot of such a script in action.

Once you have enough code you can start integrating it in your application. What we are going to do is add a new, visual GT-Inspector presentation for our log event object hierarchy, built using Roassal2.

gtInspectorViewIn: composite
<gtInspectorPresentationOrder: 20>
composite roassal2
title: ‘View’;
initializeView: [ self buildInspectorView ]

By adding this method to LampSortLogEvent we create an entry point for a new presentation called ‘View’, delegating to buildInspectorView.

buildInspectorView
| builder dataSet |
dataSet := self dataSet.
builder := RTGrapher new.
builder extent: (dataSet numberOfPoints * 32 + 256) @ 256.
builder add: dataSet.
builder axisX noLabel; noTick.
builder axisY noLabel; noTick.
builder build.
self highlight: builder view.
^ self withLegend: builder view
dataSet
| dataSet |
dataSet := RTStackedDataSet new.
dataSet points: (self data withIndexCollect: [ :e :i | i -> e ]).
dataSet y: #value.
dataSet interaction popup
text: [ :e | '{1} @ {2}' format: { e value. e key } ].
dataSet barShape width: 20; color: Color lightGray.
^ dataSet
withLegend: view
^ RTLegendBuilder new
view: view;
addText: self printString;
build

RTGrapher is a builder that constructs a chart based on a dataset, in this case an RTStackedDataSet. The dataset uses index->value Associations as models for its datapoints, because we need to be able to distinguish between equal values on different positions. We add a mouse hover popup that shows the value and position of each element. We set the base color and some sizes. It makes no sense to have additional information on the axises. We are reusing our print representation as a legend.

The highlight: method will do nothing in this abstract class and will be overridden in concrete subclasses to indicate the specifics of that log event. Apart from the highlight: methods, there is no visualisation code in the subclasses. Here is the code for LampSortPivotSelected.

highlight: view
self interval do: [ :each |
(self elementFrom: view at: each) color: Color black ].
(self elementFrom: view at: self pivotIndex) color: Color blue

The interval is shown by making the bars black, the pivot bar is made blue. Each concrete subclass knows automatically what to do.

With this machinery in place, we can again open a GT-Inspector on our Announcer and run the algorithm. This time the ‘View’ tab of each log event will visually show what happened.

Again, we can go back and forth over the events, playing the algorithm forward or backward, in a sense manually animating the computation.

Above and Beyond

The visualisation that we did in the previous section does a adequate job in showing graphically what each step represents — in a reasonable amount of code.

But it wouldn’t be graphics if there wasn’t always something extra, something better that we can do. Roassal2 includes animation facilities and Pierre showed me how to apply them in this particular case. When showing a LampSortSwapped log event, we’ll not only color the two bars involved in the swap, but also animate the bars being swapped.

And here is the highlight: code for LampSortSwapped.

highlight: view
| one another |
one := self elementFrom: view at: self oneIndex.
one color: Color red.
another := self elementFrom: view at: self anotherIndex.
another color: Color green.
view addAnimation: (RTLinearMove
to: another position x @ one position y
during: 0.5
on: one).
view addAnimation: (RTLinearMove
to: one position x @ another position y
during: 0.5
on: another).
one model key: self anotherIndex.
another model key: self oneIndex

First we change the colors of the elements. Next a linear move animation is added to both elements, which will play for half a second. We already know where to move to, the location of the other element. Note how we also change the model of the elements involved by updating their index — they were moved after all — so that the popup shows the correct value while hovering over the bars.

Here is a screencast movie showing the final result — the animation wasn’t strictly necessary, but it is very cool, right ?

Acknowledgements

I thank Tudor ‘Doru’ Girba and Alexandre Bregel and their collaborators for the Moose Technology suite and the Agile Visualisation Tools respectively. Pierre Chanson helped me tremendously by writing the first version of the visualisation and animation — I learned a lot by discussing various aspects of this article with him. My ideas around object logging were shaped in various discussions on the Pharo mailing lists and in conversations with Doru in particular. Doru also reviewed a draft of this article.

Getting the code

If you want to play with the code yourself, you can clone the GitHub repository https://github.com/svenvc/lampsort.git. Open a Moose 5.1 image and use the Monticello Browser to add a FileTree repository. First, select the LampSort package and load it. Next, select the LampSortVisualized package and load it too.

Show your support

Clapping shows how much you appreciated Sven VC’s story.