A Bulletproof Deploy for Shiny Apps

Roger Cost
Engineering Dstilled
Sep 19, 2018

Here at Dstillery we’ve recently discovered a new toy: the Shiny package for R. It’s an easy and frictionless way for us to turn MySQL (or Hive) tables into interactive data visualization webapps, filled with fun buttons, sliders and knobs. You can read all about it here, but first a word of warning: if you have real work to do, you may find yourself putting it off and inventing reasons to make your own Shiny apps. It’s that cool.

70 lines of code = 1 addictive rainbow-art toy

One of the best things about Shiny is that it’s so easy to set up a server to host all your apps. All you have to do is copy the single file that contains your app (usually called app.R) into a specific directory on that server, and you will immediately have a web service up and running. No fiddling with boring config files and tedious plumbing code — what could be better?

Naturally, we started writing Shiny apps for various things and publishing the links out to the whole company. What used to take days to code in AngularJS and Python would now take a few hours from concept phase to having it in front of the end user. The data scientists on our team were having a lot of fun with this, but the engineers — born pessimists — were thinking always of the future. How long could this go on? When would the inevitable scalability cliff appear suddenly out of the rye?

As it turns out, all of the apps on the Shiny server use the R executable and packages natively installed on the server. If we need, say, a spinner to show while we’re loading data, it’s easy enough to install the requisite package (shinycssloaders) onto the Shiny server machine.

But what happens if for whatever reason, two apps need two different versions of the same package? It’s easy to imagine a case where someone installs a package whose dependency tree includes an upgrade (or even a downgrade) to an already-installed package, and breaks someone else’s app. In this case, the first person to know that something broke would be the business end user. For us engineers, that probably ranks as our absolute least favorite way to find out about a software defect.

Other programming environments have this same problem, which is why virtual environments (think virtualenv or Conda for Python) are so popular. Fortunately, R has its own virtual environment tool, called Packrat. If you create an app-specific virtual environment and install the packages there instead of in the system’s native R library path, then two apps can happily coexist side by side and run different versions of the same package.
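To see which library a session would actually load a package from, a small helper along these lines can be run before and after activating a Packrat environment. This is only a sketch: pkg_info is a hypothetical name, not part of Packrat or Shiny, and the base packages in the example call are stand-ins for whatever your app uses.

```r
# pkg_info() is a hypothetical helper, not part of Packrat or Shiny.
# For each package it reports the version and the library directory the
# current session would load it from, searching lib_paths in order.
pkg_info <- function(pkgs, lib_paths = .libPaths()) {
  data.frame(
    package = pkgs,
    version = vapply(
      pkgs,
      function(p) as.character(utils::packageVersion(p, lib.loc = lib_paths)),
      character(1),
      USE.NAMES = FALSE
    ),
    library = vapply(
      pkgs,
      function(p) dirname(find.package(p, lib.loc = lib_paths)[1]),
      character(1),
      USE.NAMES = FALSE
    ),
    stringsAsFactors = FALSE
  )
}

# Base packages are used here only because they are always installed.
print(pkg_info(c("stats", "utils")))
```

With no virtual environment active, every app on the server would see the same library column. Inside a Packrat environment, the packages an app depends on should resolve to a directory under that app's own packrat/ folder instead.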

Easy, right? Well, yes, but at the end of the day, this means our nice and frictionless deployment process becomes larded with a big “gotcha”. Forget to do this one simple thing, and your app could break months later without your knowledge! And nothing will remind you of that step until it’s too late! Clearly we need to think of a Real Deploy Process. But we also don’t want to add tedious steps and slow down iteration on these very nifty and high-business-value tools.

For inspiration we turn to the standard Dstillery packaging pattern. Used most often for Java applications (which are the bread and butter of our engineering organization), the process consists of:

  • Code is hosted in a Git repository
  • Maven compiles and builds an RPM
  • The RPM gets deployed to an internal repo
  • Using Salt, we specify which version of which RPM gets deployed to which machine(s)
  • Salt can also be set up with post-install steps if necessary

Since we have a lot of Java applications and most of them use this standard procedure, it’s second nature for our engineers to deploy apps using this process. All of the steps are automated, so deploying an app to production is done with a single click. In fact, it’s even easier than manually copying a file onto the Shiny server’s app directory.

Fortunately for us, Maven — while designed with Java in mind — is not restricted to being used with Java. It can build an RPM out of anything, with the right configuration. And it turns out that none of the other elements of this standard packaging pattern assume Java either. An RPM is an RPM, and a post install script is a post install script.

The only difficulty that arises is that Packrat insists upon putting its packages inside the same directory where the app.R file lives. R packages are big binaries and app.R files are small text files, and we certainly do not want to have both of those coexisting in a Git repository. Generally speaking, binaries are no good in Git repos: Git keeps every committed version of a file in its history, and compiled binaries do not diff or compress well, so the repo balloons with every package upgrade.

However, it’s possible to fully define a Packrat environment via a text file of package details, called packrat.lock, and an accompanying config file called packrat.opts. These two text files fit nicely into a Git repo alongside the app.R file, and we can reconstitute the environment from them using the packrat::restore() command in R.

This is very helpful, because it means we can fully define both our app and our virtual environment using text files, and package the whole thing into a Git repo. Then, we can use Salt to invoke a post-install script that “reconstitutes” the Packrat environment inside the installation directory after the RPM is installed.
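A post-install R script for this could look roughly like the sketch below. The function name restore_app_env, the script name and the example path are assumptions made for illustration, not the scripts we actually run. The only Packrat call it relies on is packrat::restore().

```r
#!/usr/bin/env Rscript
# Hypothetical post-install script: rebuild the Packrat library for an app
# from the lockfile that shipped in the RPM.

restore_app_env <- function(app_dir) {
  # By default Packrat keeps its lockfile at <project>/packrat/packrat.lock.
  lockfile <- file.path(app_dir, "packrat", "packrat.lock")
  if (!file.exists(lockfile)) {
    stop("No Packrat lockfile found at ", lockfile)
  }
  # Install the package versions listed in the lockfile into the app's
  # private library. prompt = FALSE avoids interactive questions, since
  # nobody is at the keyboard during an automated install.
  packrat::restore(project = app_dir, prompt = FALSE)
  invisible(app_dir)
}

# Usage (the path is an assumption):
#   Rscript post_install.R /srv/shiny-server/myapp
args <- commandArgs(trailingOnly = TRUE)
if (length(args) >= 1) {
  restore_app_env(args[1])
}
```

Failing loudly when the lockfile is missing is a deliberate choice here: a broken install then surfaces at deploy time rather than in front of the end user.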

For a new developer starting a Shiny server project, the process looks like this:

  1. Write your app.R code and test it locally. You can do this without a Packrat environment if you want to.
  2. Once the code is complete, open an R session, navigate to the directory containing your app.R, and run packrat::init(). This command will read through your code and build a dependency tree consisting of the packages you’ve referenced in library(…) statements in your code, and all of their dependencies. Then it will download those packages and put them into the app directory alongside the packrat.lock and packrat.opts files.
  3. Then type packrat::on() to activate the environment. Now, when you test your app locally, it will use the Packrat environment and will not reference any packages installed on the system.
  4. If your app directory is not already a Git repo, you can set one up. Make sure to add the packrat/lib/ directory to the .gitignore, so that the big binaries do not get added as Git elements.
  5. Add a post-install R script and an accompanying Bash wrapper script to the Git repo (like the ones in the above diagram).
  6. Add a pom.xml and use it to configure the rpm-maven-plugin to build an RPM with the contents of the repo. (Maven does allow you to specify a post-install script, but don’t. It is better to do it in Salt, because it gives you more control over what user to run it as and what types of events qualify as an install.)
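The one-time Packrat and Git setup in steps 2 to 4 could be scripted roughly as follows. Both function names are hypothetical. The ignore patterns are an assumption that goes slightly beyond the packrat/lib/ entry mentioned above, so that Packrat's sibling library folders and downloaded sources are excluded as well.

```r
# Hypothetical one-time setup helpers for a new Shiny app directory.

# Steps 2 and 3: build the private library, then switch the session to it.
setup_packrat <- function(app_dir) {
  # Scan the code for dependencies, install them under packrat/lib and
  # write packrat.lock and packrat.opts. enter = FALSE keeps activation
  # as a separate, explicit step.
  packrat::init(project = app_dir, enter = FALSE)
  # Activate the environment so local testing uses the private library.
  packrat::on(project = app_dir)
  # Report whether the code, the library and the lockfile are in sync.
  packrat::status(project = app_dir)
}

# Step 4: keep the big binaries out of Git. Safe to call more than once.
ignore_packrat_binaries <- function(app_dir) {
  path <- file.path(app_dir, ".gitignore")
  # "packrat/lib*/" covers packrat/lib/ plus sibling library folders;
  # "packrat/src/" holds downloaded package sources.
  wanted <- c("packrat/lib*/", "packrat/src/")
  existing <- if (file.exists(path)) readLines(path) else character(0)
  missing <- setdiff(wanted, existing)
  if (length(missing) > 0) {
    writeLines(c(existing, missing), path)
  }
  invisible(missing)
}

# Usage, from an R session started in the app directory:
#   setup_packrat(".")
#   ignore_packrat_binaries(".")
```

After this, only app.R, the lockfile, the options file and the small scripts need to be committed.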

Now you have a Git repo that can be plugged into the Dstillery deployment framework as-is — nobody besides you has to even know that it’s an R project and not a Java project!

You don’t have to do all six of these steps every time, just the first time you write a new Shiny app. After that, further development on the app is a matter of changing the relevant code in app.R, committing the change, and pressing a button. (Oops! I forgot to say testing the change locally, but you already did that, of course!)

Not only that, but the second time you create a new Shiny app, you can basically copy the scripts and pom file from the first one and change the app name. With practice, the extra scripts and packaging steps will no longer impose any cognitive overhead. Then, you can focus 100% on building your apps, and not worry about them breaking in production.
