Deneb & Vega-Lite Walkthrough Series | EP08: Small Multiples (Facets)📊

For all the movie-lovers out there, we shall utilise facets to better identify which trilogy movie is best based on IMDB ratings. 🕊️ 🧙🏼‍♂️ ✨

PBI Queryous

12 min read · Apr 30, 2024

💌 PBIX file available at the end of the article… Enjoy!

Recap

EP1 — Marks and Encoding
EP2 — Mark Types
EP3 — Styling Mark Properties (Part 1)
EP4 — Styling Mark Properties (Part 2)
EP5 — Layers (Multiple Views)
EP6 — Expressions & Conditional Formatting
EP7 — Number Formatting

The Holy Trilogy

…he who finds the Holy Grail must face three challenges…

Power Query and the Last Crusade

The data in this instance was compiled manually, using a combination of IMDB web-scraping and manual input… alas, Excel is forever your friend (and don’t you forget it! 🤓😏). After building my dataset in Excel, I created a manual input table in Power Query and pasted the values in there. Feel free to compile and improve the underlying data with more film entries!

We only require three fields:

trilogy
film
imdb_score
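
To make the expected shape concrete, here is a minimal sketch of those three fields as inline Vega-Lite data. The trilogy name and the scores are placeholders (not real IMDB figures), and storing imdb_score on a 0 to 1 scale is an assumption inferred from the percentage formatting used later on. In Deneb the rows come from the "dataset" binding instead, so this version is only handy for experimenting in the online Vega editor.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {
    "values": [
      // placeholder rows: hypothetical trilogy name and made-up scores
      {"trilogy": "Example Trilogy", "film": 1, "imdb_score": 0.8},
      {"trilogy": "Example Trilogy", "film": 2, "imdb_score": 0.7},
      {"trilogy": "Example Trilogy", "film": 3, "imdb_score": 0.6}
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": {"field": "film", "type": "ordinal"},
    "y": {"field": "imdb_score", "type": "quantitative"}
  }
}
```

One row per film is enough: the trilogy field is what we will split the chart by later.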

Vega-Lite and the Temple of Doom

We will go through a step-by-step process, making mistakes along the way, and learning from these challenges in real-time 🪄🥹 — this is a safe place for personal growth and development 🤗🫂

Starting Point:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "A faceted chart showing movie ratings across different franchises.",
  "data": {"name": "dataset"},
  "height": 80,
  "width": 95,
  "layer": [
    {
      "mark": {
        "type": "bar",            // bar mark
        "stroke": "#444"
      },
      "encoding": {
        "x": {
          "field": "film",        // x-axis
          "type": "ordinal",      // numerical order
          "title": null,
          "axis": {
            "labels": true,
            "labelAngle": 0,
            "ticks": false
          }
        },
        "y": {
          "field": "imdb_score",  // y-axis
          "type": "quantitative", // quantitative (aggregatable)
          "aggregate": "mean",    // aggregate the average (for now)
          "title": null,
          "axis": {
            "labels": true,
            "format": ".0%",      // percent number format, 0 decimals (%)
            "ticks": false,
            "grid": false
          }
        }
      }
    }
  ]
}

Identifying Aesthetic Gripes

The first niggle we need to address is the y-axis. We know we are working with percentages, so our range is 0% to 100% (0 to 1). By default, both Deneb and Power BI derive the axis range from the data values, so the axis stops near the highest score rather than at 100%. We can explicitly set this range, very much in the same way you set the minimum and maximum in Power BI’s Y-axis options:

In Vega-Lite, we use the scale property to modify the domain (range) of values:

"scale": {"domain": [0, 1]}

And in context:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "A faceted chart showing movie ratings across different franchises.",
  "data": {"name": "dataset"},
  "height": 80,
  "width": 95,
  "layer": [
    {
      "mark": {...},
      "encoding": {
        "x": {...},
        "y": {
          "field": "imdb_score",
          "type": "quantitative",
          "aggregate": "mean",
          "title": null,
          "scale": {"domain": [0, 1]},  // <-- set the y-axis value range
          "axis": {...}
        }
      }
    }
  ]
}

Chart with y-axis min/max range defined (0, 1)
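
The [0, 1] domain only makes sense if the scores are stored as fractions. If your own dataset holds the raw 0 to 10 rating instead (an assumption, it is not the case in this article's data), a calculate transform could rescale it first. The field name imdb_pct below is hypothetical.

```json
"transform": [
  // assumption: imdb_score holds the raw 0 to 10 rating
  {"calculate": "datum.imdb_score / 10", "as": "imdb_pct"}   // imdb_pct is a hypothetical field name
]

// ...and the y channel then points at the rescaled field
"y": {
  "field": "imdb_pct",
  "type": "quantitative",
  "aggregate": "mean",
  "scale": {"domain": [0, 1]},
  "axis": {"format": ".0%"}
}
```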

Separate chart by trilogy category

So, to create small multiples, we can use our existing knowledge and add a legend, binding our trilogy field to the color channel. Let’s give it a try:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "A faceted chart showing movie ratings across different franchises.",
  "data": {"name": "dataset"},
  "height": 80,
  "width": 95,
  "transform": [
    
  ],
  "layer": [
    {
      "mark": {
        "type": "bar",
        "stroke": "#444",
        "strokeWidth": 0.5,
        "strokeOpacity": 0.2
      },
      "encoding": {
        "x": {
          "field": "film",
          "type": "ordinal",
          "title": null,
          "axis": {
            "labels": true,
            "labelAngle": 0,
            "ticks": false
          }
        },
        "y": {
          "field": "imdb_score",
          "type": "quantitative",
          "aggregate": "mean",
          "title": null,
          "scale": {"domain": [0, 1]},
          "axis": {
            "labels": true,
            "format": ".0%",
            "ticks": false,
            "grid": false
          }
        },
        "color": {
          "field": "trilogy"            // <--- distribute by trilogy values
        }
      }
    }
  ]
}

….launch it…. 🙏🏼

Hmmn, not quite what we had in mind. Instead of adding a new chart for each category, it has stacked the categories and changed only the colour 💡. We need a slightly different approach, but we are certainly on the right track… let’s break it down into smaller parts so we can make better sense of it. First, we will filter down the trilogy values.

Add a filter transform:
We’ll add this transform at the top level of our specification, so it applies to every layer.

"transform": [
    {
      "filter": "datum.trilogy == 'Alien' || datum.trilogy == 'Jaws' || datum.trilogy == 'Mighty Ducks' "
    }
  ],

And in context:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "A faceted chart showing movie ratings across different franchises.",
  "data": {"name": "dataset"},
  "height": 80,
  "width": 95,

  // apply a transform to every layer
  // keep only the films where trilogy = "Alien", "Jaws" or "Mighty Ducks"
  "transform": [
    {
      "filter": "datum.trilogy == 'Alien' || datum.trilogy == 'Jaws' || datum.trilogy == 'Mighty Ducks' "
    }
  ],
  "layer": [
    {
      "mark": {...},
      "encoding": {
        "x": {...},
        "y": {...},
        "color": {
          "field": "trilogy"
        }
      }
    }
  ]
}
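
As the list of trilogies grows, the chained || expression gets long. A possible alternative is Vega-Lite's field predicate form of the filter, sketched here with the same three values; it should keep exactly the same rows as the expression above.

```json
"transform": [
  {
    "filter": {
      "field": "trilogy",                          // the field to test
      "oneOf": ["Alien", "Jaws", "Mighty Ducks"]   // keep rows matching any of these values
    }
  }
]
```

Adding or removing a trilogy is then just a matter of editing the array.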

Create small multiples:

In order to create small multiples (aka a Trellis Plot), we use a Vega-Lite operator known as facet. In simple-ish terms, a facet splits the data into subsets and draws the same chart once for each subset. In this case, each subset is one value of our trilogy field. Things will shape up rather quickly now… hold on tight!

There are two parts to the facet:
1. facet by row (vertically) or column (horizontally)
2. the chart specification (spec or view)

// general format

"facet": {
    "column": {                      // <<-- the facet type (column or row)
      "field": "trilogy",            // <<-- the faceted field
      "title": "THE TRILOGY METER"   // <<-- a title for our faceted chart
    }
  },
"spec": {"layer": [...]}             // <<-- the specification

The easiest way to amend our Vega-Lite code at this juncture is to use the trusted copy-and-paste technique:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "A faceted chart showing movie ratings across different franchises.",
  "data": {"name": "dataset"},
  "transform": [
    {
      "filter": "datum.trilogy == 'Alien' || datum.trilogy == 'Jaws' || datum.trilogy == 'Mighty Ducks' "
    }
  ],
  "facet": {
    "column": {
      "field": "trilogy",
      "title": "THE TRILOGY METER"
    }
  },
  "spec": {                              // <<--- START SPEC
    "height": 80,                        // <<--- height and width move inside "spec" (the size of each small multiple)
    "width": 95,
    "layer": [                           // <<--- copy and paste layer within "spec": {}
      {
        "mark": {
          "type": "bar",
          "stroke": "#444",
          "strokeWidth": 0.5,
          "strokeOpacity": 0.2
        },
        "encoding": {
          "x": {
            "field": "film",
            "type": "ordinal",
            "title": null,
            "axis": {
              "labels": true,
              "labelAngle": 0,
              "ticks": false,
              "labelFontWeight": "bold"
            }
          },
          "y": {
            "field": "imdb_score",
            "type": "quantitative",
            "aggregate": "mean",
            "title": null,
            "scale": {"domain": [0, 1]},
            "axis": {
              "labels": true,
              "format": ".0%",
              "ticks": false,
              "grid": false
            }
          },
          "color": {"field": "trilogy"}
        }
      }
    ]                                    // layer ends
  }                                      // <<--- END SPEC
}
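
For comparison, here is a sketch of the other facet type mentioned above: swapping "column" for "row" stacks the small multiples vertically instead. The header settings are an optional extra I have assumed for readability (row labels are rotated by default), and {...} stands for the same layer as before.

```json
"facet": {
  "row": {                               // <<-- facet by row instead of column
    "field": "trilogy",
    "title": "THE TRILOGY METER",
    "header": {
      "labelAngle": 0,                   // assumption: horizontal labels read better here
      "labelAlign": "left"
    }
  }
},
"spec": {"height": 80, "width": 95, "layer": [...]}
```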

Next we want to improve the colour scheme so that each film (1,2,3) is assigned a colour range, darker to lighter. Let’s head over to our color encoding channel.

Color Schemes:

Vega and Vega-Lite have a wonderful library of ready-made colour schemes to choose from: categorical, sequential, diverging and cyclical.

We apply one of these colour schemes by setting the scheme property of the scale inside the color channel.

"color": {                                  // <<-- colour channel
            "scale": {                      // <<-- colour scale
              "scheme": "yellowgreenblue"   // <<-- colour scheme
            },
            "field": "imdb_score",          // apply this to the imdb_score field
            "type": "nominal",
            "legend": null
          }

and in context:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "A faceted chart showing movie ratings across different franchises.",
  "data": {"name": "dataset"},
  "transform": [
    {
      "filter": "datum.trilogy == 'Alien' || datum.trilogy == 'Jaws' || datum.trilogy == 'Mighty Ducks' "
    }
  ],
  "facet": {
    "column": {
      "field": "trilogy",
      "title": "THE TRILOGY METER"
    }
  },
  "spec": {
    "height": 80,
    "width": 95,
    "layer": [
      {
        "mark": {...},                    // <-- mark property
        "encoding": {                     // <-- encoding channel
          "x": {...},
          "y": {...},
          "color": {                      // <-- colour channel                 
            "scale": {                    // <-- colour scale
              "scheme": "yellowgreenblue" // <-- colour scheme
            },
            "field": "imdb_score",        // <-- apply colour scheme to field
            "type": "nominal",
            "legend": null                // <-- set legend to null to remove
          }
        }
      }
    ]
  }
}
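
The colour above follows the score. If you would rather have the shade follow the film number (1, 2, 3), one possible variation is to bind the film field as an ordinal type instead. This is a sketch rather than what the article's chart uses, and the "reverse" flag is an assumption about which end of the scheme you want for film 1.

```json
"color": {
  "field": "film",                   // shade by film number rather than score
  "type": "ordinal",                 // ordered categories: 1, 2, 3
  "scale": {
    "scheme": "yellowgreenblue",
    "reverse": true                  // assumption: flips the range so film 1 takes the darker end
  },
  "legend": null
}
```

Drop "reverse" (or set it to false) if the shading comes out the wrong way round for your taste.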

You’re probably feeling something right now, something tingling at the end of your fingertips. This is called “winning”; this is what winning feels like. Keep going! 🤓🧙🏼‍♂️🪄

We are almost, almost there. We just need to make some further aesthetic changes to make the charts neat and compact. Let’s blitz through those changes here.

Denebers of the Lost Ark

We want to create a box area around each bar to create the illusion of filling the bar out of 100%. This is actually quite simple… follow me!

First, we need to create a new field inside the Vega-Lite specification, much like a calculated column, that holds the constant 1 (i.e. 100%) on every row, so each bar can be ‘encased’ by the maximum value. We add a calculate transform to our transform block:

  "transform": [
    {
      "filter": "datum.trilogy == 'Alien' || datum.trilogy == 'Jaws' || datum.trilogy == 'Mighty Ducks' "
    },
    {"calculate": "1", "as": "max"}  // <-- new calculate transform
  ],

and a new layer (a second bar mark) inside our layer array:

{
        "mark": {
          "type": "bar",
          "color": "transparent",  // <-- no colour used
          "stroke": "black"        // <-- apply a black outline
        },
        "encoding": {
          "x": {
            "field": "film",
            "type": "ordinal",
            "title": null,
            "axis": {
              "labels": true,
              "labelAngle": 0,
              "ticks": false
            }
          },
          "y": {
            "field": "max",          // <-- use the calculate transform field
            "type": "quantitative"
          }
        }
      }

…and the complete specification:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "A faceted chart showing movie ratings across different franchises.",
  "data": {"name": "dataset"},
  "transform": [
    {
      "filter": "datum.trilogy == 'Alien' || datum.trilogy == 'Jaws' || datum.trilogy == 'Mighty Ducks' "
    },
    {"calculate": "1", "as": "max"}
  ],
  "facet": {
    "column": {
      "field": "trilogy",
      "title": "THE TRILOGY METER"
    }
  },
  "spec": {
    "height": 80,
    "width": 95,
    "layer": [
      {
        "mark": {
          "type": "bar",
          "stroke": "#444",
          "strokeWidth": 0.5,
          "strokeOpacity": 0.2
        },
        "encoding": {
          "x": {
            "field": "film",
            "type": "ordinal",
            "title": null,
            "axis": {
              "labels": true,
              "labelAngle": 0,
              "ticks": false,
              "labelFontWeight": "bold"
            }
          },
          "y": {
            "field": "imdb_score",
            "type": "quantitative",
            "aggregate": "mean",
            "title": null,
            "scale": {"domain": [0, 1]},
            "axis": {
              "labels": true,
              "format": ".0%",
              "ticks": false,
              "grid": false
            }
          },
          "color": {
            "scale": {
              "scheme": "yellowgreenblue"
            },
            "field": "imdb_score",
            "type": "nominal",
            "legend": null
          }
        }
      },
      {
        "mark": {
          "type": "bar",
          "color": "transparent",
          "stroke": "black"
        },
        "encoding": {
          "x": {
            "field": "film",
            "type": "ordinal",
            "title": null,
            "axis": {
              "labels": true,
              "labelAngle": 0,
              "ticks": false
            }
          },
          "y": {
            "field": "max",
            "type": "quantitative"
          }
        }
      }
    ]
  }
}

This is already looking really nice:

Next, I want to do three things:

remove the percentage labels
have the film numbers appear inside each bar
remove the spaces between the bars

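All three changes are applied in the specification below: the axis labels are switched off, two text marks draw the film numbers inside the bars, and paddingInner closes the gaps between the bars: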

{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "A faceted chart showing movie ratings across different franchises.",
  "data": {"name": "dataset"},
  "transform": [
    {
      "filter": "datum.trilogy == 'Alien' || datum.trilogy == 'Jaws' || datum.trilogy == 'Mighty Ducks' "
    },
    {"calculate": "1", "as": "max"}
  ],
  "facet": {
    "column": {
      "field": "trilogy",
      "title": "THE TRILOGY METER"
    }
  },
  "spec": {
    "height": 80,
    "width": 95,
    "layer": [
      {
        "mark": {
          "type": "bar",
          "stroke": "#444",
          "strokeWidth": 0.5,
          "strokeOpacity": 0.2
        },
        "encoding": {
          "x": {
            "field": "film",
            "type": "ordinal",
            "title": null,
            "axis": {
              "labels": false,            // <--- set x-axis labels to 'false'
              "labelAngle": 0,
              "ticks": false,
              "labelFontWeight": "bold"
            },
            "scale": {"paddingInner": 0}  // <--- remove the spaces between the bars
          },
          "y": {
            "field": "imdb_score",
            "type": "quantitative",
            "aggregate": "mean",
            "title": null,
            "scale": {"domain": [0, 1]},
            "axis": {
              "labels": false,            // <--- set y-axis labels to 'false'
              "format": ".0%",
              "ticks": false,
              "grid": false
            }
          },
          "color": {
            "scale": {
              "scheme": "yellowgreenblue"
            },
            "field": "imdb_score",
            "type": "nominal",
            "legend": null
          }
        }
      },
      {
        "mark": {
          "type": "bar",
          "color": "transparent",
          "stroke": "black"
        },
        "encoding": {
          "x": {
            "field": "film",
            "type": "ordinal",
            "title": null,
            "axis": {
              "labels": true,
              "labelAngle": 0,
              "ticks": false
            }
          },
          "y": {
            "field": "max",
            "type": "quantitative"
          }
        }
      },
      {
        "mark": {
          "type": "text",                // <--- text mark (background to create the 'halo' effect)
          "stroke": "black",
          "strokeWidth": 3,
          "fontSize": 14
        },
        "encoding": {
          "x": {
            "field": "film",
            "type": "ordinal",
            "title": null,
            "axis": {
              "labels": true,
              "labelAngle": 0,
              "ticks": false
            }
          },
          "y": {
            "datum": 0.1,                // <--- constant y position (10%), as a number
            "type": "quantitative"
          },
          "text": {"field": "film"}
        }
      },
      {
        "mark": {
          "type": "text",                // <--- text mark foreground
          "fill": "white",
          "fontSize": 14
        },
        "encoding": {
          "x": {
            "field": "film",
            "type": "ordinal",
            "title": null,
            "axis": {
              "labels": true,
              "labelAngle": 0,
              "ticks": false
            }
          },
          "y": {
            "datum": 0.1,
            "type": "quantitative"
          },
          "text": {"field": "film"}    // <--- display the 'film' field values
        }
      }
    ]
  }
}
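
You may have noticed the same x encoding is now repeated in all four layers. Vega-Lite lets a layered view declare shared channels once, in an "encoding" block that sits next to "layer", and each layer inherits it. Here is a sketch of how the spec could be trimmed down under that assumption; {...} stands for the mark, y and color objects shown above, and I have not checked that the result is pixel-identical to the longer version.

```json
"spec": {
  "height": 80,
  "width": 95,
  "encoding": {                          // <--- shared by every layer below
    "x": {
      "field": "film",
      "type": "ordinal",
      "title": null,
      "axis": {"labels": false, "ticks": false},
      "scale": {"paddingInner": 0}
    }
  },
  "layer": [
    {"mark": {...}, "encoding": {"y": {...}, "color": {...}}},                     // coloured bars
    {"mark": {...}, "encoding": {"y": {"field": "max", "type": "quantitative"}}},  // outline bars
    {"mark": {...}, "encoding": {"y": {"datum": 0.1}, "text": {"field": "film"}}}, // text halo
    {"mark": {...}, "encoding": {"y": {"datum": 0.1}, "text": {"field": "film"}}}  // text foreground
  ]
}
```

Each layer then only states what is unique to it, which makes later tweaks to the x-axis a one-line change.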

Delightful😏… We are very close to completing the task at hand!

Rightio, now — I want to see all my films, so we can remove the filter from our transform block.

"transform": [
// delete this filter transform
    {
      "filter": "datum.trilogy == 'Alien' || datum.trilogy == 'Jaws' || datum.trilogy == 'Mighty Ducks' "
    },
    {"calculate": "1", "as": "max"}
  ]

so we are left with this:

"transform": [
    {"calculate": "1", "as": "max"}
  ]

You’ll notice all our films are beautifully faceted into the small multiples that we desire, but they all sit on a single row. We need to wrap the small multiples onto new rows dynamically… hmmn, sounds tricky, but it’s not too bad… let’s take a look 👀

Enter VCONCAT

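So we wrapped our layer in a spec; now we wrap our facet and "spec": {} in "vconcat": []. Note that the facet now takes the field directly (rather than inside "column"), which is what lets the columns property wrap the small multiples onto new rows.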

Overview:

"vconcat":[
  {
    "columns": 10,    // <<-- the number of small multiples per row before wrapping onto a new row
    "facet": {...},
    "spec": {
      "layer": [{...}]
    }
  }
]

…and the final specification:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "A faceted chart showing movie ratings across different franchises.",
  "data": {"name": "dataset"},
  "transform": [
    {"calculate": "1", "as": "max"}
  ],
  "vconcat": [
    {
      "columns": 15,
      "facet": {
        "field": "trilogy",
        "header": {
          "title": "THE TRILOGY METER",
          "titleFontSize": 30,
          "titlePadding": 20,
          "labelPadding": 0,
          "labelFontWeight": "bold",
          "labelFontSize": 12,
          "labelAlign": "center",
          "labelLimit": 95,
          "labelOrient": "top"
        }
      },
      "spec": {
        "height": 80,
        "width": 95,
        "layer": [
          {
            "mark": {
              "type": "bar",
              "stroke": "#444",
              "strokeWidth": 0.5,
              "strokeOpacity": 0.2
            },
            "encoding": {
              "x": {
                "field": "film",
                "type": "ordinal",
                "title": null,
                "axis": {
                  "labels": false,
                  "labelAngle": 0,
                  "ticks": false,
                  "labelFontWeight": "bold",
                  "domain": false
                },
                "scale": {
                  "paddingInner": 0
                }
              },
              "y": {
                "field": "imdb_score",
                "type": "quantitative",
                "aggregate": "mean",
                "title": null,
                "scale": {
                  "domain": [0, 1]
                },
                "axis": {
                  "labels": false,
                  "format": ".0%",
                  "ticks": false,
                  "grid": false
                }
              },
              "color": {
                "scale": {
                  "scheme": "yellowgreenblue"
                },
                "field": "imdb_score",
                "type": "nominal",
                "legend": null
              }
            }
          },
          {
            "mark": {
              "type": "bar",
              "color": "transparent",
              "stroke": "black"
            },
            "encoding": {
              "x": {
                "field": "film",
                "type": "ordinal",
                "title": null,
                "axis": {
                  "labels": true,
                  "labelAngle": 0,
                  "ticks": false
                }
              },
              "y": {
                "field": "max",
                "type": "quantitative"
              }
            }
          },
          {
            "mark": {
              "type": "text",
              "stroke": "black",
              "strokeWidth": 3,
              "fontSize": 14
            },
            "encoding": {
              "x": {
                "field": "film",
                "type": "ordinal",
                "title": null,
                "axis": {
                  "labels": true,
                  "labelAngle": 0,
                  "ticks": false
                }
              },
              "y": {
                "datum": 0.1,
                "type": "quantitative"
              },
              "text": {"field": "film"}
            }
          },
          {
            "mark": {
              "type": "text",
              "fill": "white",
              "fontSize": 14
            },
            "encoding": {
              "x": {
                "field": "film",
                "type": "ordinal",
                "title": null,
                "axis": {
                  "labels": true,
                  "labelAngle": 0,
                  "ticks": false
                }
              },
              "y": {
                "datum": 0.1,
                "type": "quantitative"
              },
              "text": {"field": "film"}
            }
          }
        ]
      }
    }
  ]
}
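
A small variation worth trying: since "columns" belongs to the facet itself, the same wrapping should also work with the facet at the top level, without the vconcat wrapper. The sketch below assumes that, and also adds an optional sort (not part of the original chart) so the best-rated trilogies come first. {...} stands for the header and spec objects from the final specification.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "data": {"name": "dataset"},
  "transform": [
    {"calculate": "1", "as": "max"}
  ],
  "columns": 15,                         // small multiples per row before wrapping
  "facet": {
    "field": "trilogy",
    "sort": {                            // assumption: order facets by average score
      "op": "mean",
      "field": "imdb_score",
      "order": "descending"
    },
    "header": {...}
  },
  "spec": {...}
}
```

If the sort is not to your liking, remove the "sort" block and the trilogies fall back to their default order.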

Take a bow… you’ve created something beautiful. Download the .pbix below for the complete breakdown of the steps above, including some bonus examples.

I hope you found this episode useful. Until next time… #StayQueryous

🔗 GitHub link to PBIX: EP08 — Trilogy IMDB.pbix