My earlier blog A Topological Trick for Data Visualisation described Carlsson’s mapper construction, which offers a good graph representation of a high-dimensional point cloud. I offered an interactive shiny demo of mapper on the Zip data set of hand-written digits — shown in the graphic above and linked here.

In that blog, I focussed on the mathematical idea behind mapper and said nothing about how the demo worked. In this blog, I want to say a little bit about the R shiny code behind the demo, in order to make it reproducible for interested readers.

For those already familiar with R, shiny provides a very easy way to build interactive graphics using the R language. There are lots of good tutorials to be found, so all I’ll say by way of introduction is that a shiny application is basically defined by two functions, shinyUI() and shinyServer(), usually saved in their own source files ui.R and server.R. A first example to look at is Hello Shiny!.
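To make the two-part structure concrete, here is a minimal sketch of a shiny app that is separate from the mapper demo. The input name `n` and the output name `hist` are made up for illustration, and the two parts are combined with shinyApp() in a single file rather than split across ui.R and server.R.

```r
library(shiny)

# UI: one slider input and one plot output (names are illustrative)
ui <- fluidPage(
  sliderInput("n", label = "Sample size:", min = 10, max = 500, value = 100),
  plotOutput("hist")
)

# Server: reads input$n and fills output$hist
server <- function(input, output) {
  output$hist <- renderPlot({
    hist(rnorm(input$n), main = "Random sample")
  })
}

# Bundle the two halves into an app object
app <- shinyApp(ui = ui, server = server)

# Launch only in an interactive session
if (interactive()) runApp(app)
```

The UI half declares what the page contains; the server half says how each output is computed from the inputs. The mapper demo follows the same pattern with more controls.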

So, with that said, the UI code for the mapper demo is simply the following:

shinyUI(fluidPage(

  titlePanel("A topological trick for data visualisation"),
  sidebarLayout(
    position = "right",
    sidebarPanel(
      sliderInput("nbins",
                  label = "Resolution (nr bins):",
                  min = 1, max = 50, value = 10),
      sliderInput("klevel",
                  label = "Resolution (hierarchical cluster k):",
                  min = 1, max = 20, value = 10),
      numericInput("obsvertex",
                   label = "Sample vertex:",
                   value = 1),
      plotOutput("digitview")
    ),
    mainPanel(
      plotOutput("graphview")
    )
  )
)
)

This says that there’s a side panel with some controls plus a plot output “digitview”, and a main panel with another plot output “graphview”. These two plot outputs will be defined in the server file, which we’ll look at in a moment.

Besides these two outputs, we also see three inputs in the sidebarPanel. Each defines a variable (for example input$nbins) that is set from the UI and read by the server function. This repeats the pattern you see in the Hello Shiny! app.

So next, the server file:

source("helpers.R")  # stores some functions needed below

# the Zip data (schematic: load() expects the path to a saved data file)
load(data)

shinyServer(

  function(input, output) {

    # build clusters:
    cluster.set <- reactive({
      make.clusters(data,
                    input$nbins,
                    input$klevel)
    })
    # build the mapper graph on those clusters:
    mg <- reactive({
      mapper.graph(cluster.set())
    })

    # outputs:
    output$graphview <- renderPlot({
      show.graph(mg(),
                 cluster.set(),
                 v = input$obsvertex)
    })
    output$digitview <- renderPlot({
      show.digits(cluster.set(),
                  input$obsvertex)
    })
  }
)

Again, the structure is simple. Starting at the bottom of this code, the two outputs “digitview” and “graphview” are defined. Each is assigned the result of renderPlot(), which wraps some plotting code in a reactive form. The result is stored in the server output, where the matching plotOutput() in the UI picks it up.

For the two cases, the plotting code is defined by the functions show.graph() and show.digits(), which I’ve defined in a source file ‘helpers.R’ and which I’ll come to in a moment.

These two functions depend on the data clusters, and on the graph structure on these clusters, as I described conceptually in the previous blog. Those in turn depend on the user inputs nbins and klevel, while obsvertex picks out the vertex to highlight. These variables change in response to the UI controls, so cluster.set() and mg() are defined as ‘reactive’ expressions: shiny re-evaluates them whenever an input they read changes, and reuses the cached result otherwise.
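The caching behaviour of reactive expressions can be tried outside a running app. The sketch below is not part of the demo: `nbins` is a hand-set reactive value standing in for input$nbins, the clustering step is a toy replacement for make.clusters(), and `n.calls` is a counter added purely to show when the code actually runs.

```r
library(shiny)

# Stand-in for input$nbins: a reactive value we can set by hand
nbins <- reactiveVal(10)

n.calls <- 0  # counts how often the expensive step actually runs

# Toy replacement for the clustering step (not the real make.clusters())
cluster.set <- reactive({
  n.calls <<- n.calls + 1
  split(1:100, cut(1:100, nbins()))
})

# isolate() lets us read a reactive expression outside a running app
first  <- isolate(cluster.set())
second <- isolate(cluster.set())   # served from the cache
calls.before <- n.calls

nbins(20)                          # the "slider" moves
third <- isolate(cluster.set())    # recomputed with the new value
calls.after <- n.calls
```

Reading cluster.set() twice should run its body only once, and changing nbins should trigger exactly one more run. This is why the demo can call cluster.set() from both plots without clustering the data twice.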

At this point you can see that the work of implementing the construction I described in the previous blog is wrapped up in the functions make.clusters(), mapper.graph(), show.graph() and show.digits(). I won’t go into all of these in gory detail. The reader should get the idea of the first two from the previous blog.
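For readers who want something concrete, the second of these can be sketched roughly as follows. This is not the code from helpers.R: the name mapper.graph.sketch() and the toy cluster set are made up, and the sketch assumes only that a cluster set is a list of index vectors and that two clusters are joined by an edge when they share at least one data point.

```r
library(igraph)

# Hypothetical sketch: one vertex per cluster, one edge per overlapping pair
mapper.graph.sketch <- function(cset) {
  n <- length(cset)
  adj <- matrix(0, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # link clusters i and j if they have a data point in common
      if (i < j && length(intersect(cset[[i]], cset[[j]])) > 0) {
        adj[i, j] <- adj[j, i] <- 1
      }
    }
  }
  graph_from_adjacency_matrix(adj, mode = "undirected")
}

# Toy cluster set: clusters 1-2 and 2-3 overlap, cluster 4 stands alone
toy.cset <- list(c(1, 2, 3), c(3, 4), c(4, 5, 6), c(7, 8))
g <- mapper.graph.sketch(toy.cset)

n.edges <- ecount(g)
n.components <- components(g)$no
```

The double loop is quadratic in the number of clusters, which is fine for a few hundred vertices. The real function may well be organised differently.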

But I will say a bit about show.graph(), as this uses the excellent library igraph and the reader may or may not be familiar with this.

show.graph <- function(g, cset, v = -1){

  # connected components of igraph object g
  # (newer igraph versions call this components())
  cc <- clusters(g)

  # edge parameters
  E(g)$color <- "grey"
  E(g)$width <- 1
  E(g)$arrow.mode <- 0
  E(g)$curved <- FALSE

  # vertex - size by cluster size
  V(g)$label.cex <- 0.3 * (1 + log(sapply(cset, length)))
  V(g)$size <- 6 * V(g)$label.cex

  # vertex - label by most common digit in the cluster
  # (y, the digit labels, is not an argument: it is looked up outside the function)
  V(g)$label <- ""
  V(g)$label <- sapply(V(g), function(i){
    tmp <- y[cset[[i]]]
    names(sort(-table(tmp)))[1]
  })

  # vertex - colour by connected component
  V(g)$color <- "white"
  V(g)$frame.color <- cc$membership
  V(g)$label.color <- cc$membership

  # highlight the selected base vertex
  if(v > 0){ V(g)$color[v] <- "orange" }

  # output igraph plot
  plot(g)
}

The function takes as inputs a graph g (an igraph object), a cluster set cset (a list with one vector of data-point indices per vertex) and a selected vertex v (set to -1, or ‘none’, by default). It outputs a plot that looks like:

[Figure: example plot produced by show.graph()]
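The labelling and sizing steps inside show.graph() use only base R, so they can be tried without igraph. In this sketch `y.toy` and `cset.toy` are made-up stand-ins for the digit labels y and the cluster set.

```r
# Toy digit labels for ten observations (stand-in for y)
y.toy <- c(3, 3, 8, 3, 8, 1, 7, 7, 7, 1)

# Two toy clusters, each a vector of observation indices
cset.toy <- list(c(1, 2, 3, 4, 5, 6), c(7, 8, 9, 10))

# Most common digit per cluster, as in show.graph()
labels <- sapply(cset.toy, function(idx) {
  names(sort(-table(y.toy[idx])))[1]
})

# Label size grows with the log of the cluster size
label.cex <- 0.3 * (1 + log(sapply(cset.toy, length)))
```

The first cluster contains three 3s, two 8s and a 1, so it is labelled "3"; the second is labelled "7". Negating the table before sorting is a compact way of sorting counts in decreasing order.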