
TLDR:
The app is here. It’s a searchable database of every voter in WA state. The app isn’t the main product; I wanted to learn about deploying to production away from Heroku, and I used Digital Ocean VPSs to do it. If all you want to know is how to do that, go here.
Background
Disclaimer: Like all of my blogs, this is documenting a learning experience. This one especially is pretty far outside my domain of experience, and is as much notes for me as it is meant to be helpful to others. I would love to hear about any corrections and suggestions that you might have.

One thing that I want to get better at is understanding the process for deploying an app to a production environment. Almost all of my experience is in running in the development environment on my own machine, and in deploying to Heroku. Heroku is an amazing service, but it definitely follows the convention over configuration philosophy.
This dive into the DevOps pool was inspired in turn by an app I’m building to play with optimizing SQL queries, experiment with MongoDB in Rails, and learn some light data visualization (more on all of that next week). I intentionally chose a big dataset to make optimizations really obvious: the Washington State voter records database. In raw form it is a CSV of 4.6 million rows listing every voter in the state. It comes with a bonus CSV listing the date and voter associated with every single ballot cast since 2015. It comes out to 1.2 GB of CSV. This would get expensive on Heroku quickly (this app would cost $16–57/mo. depending on DB design). A basic VPS that can run this app comes in at $2.50–$5/mo. (there are even some basic free VPS plans, though I’m not sure I trust those).

As I mentioned earlier, Heroku is strong on convention. What that gets you is ease of use: deploying to Heroku is literally three commands. What it costs you is flexibility. Heroku has a way that it wants you to do everything, and that’s that.
I decided that at least knowing how to deploy to production on a real server, even on a small scale, is definitely a worthy skill.
In Practice
I decided to go the VPS route since it offers the most configuration, and would be the best learning experience in terms of universal applicability. There’s a lot to choose from and there are far more knowledgeable people out there comparing them. I went with Digital Ocean on the advice of a friend.
The actual process of configuring the server is very well documented. Digital Ocean has great guides for its servers, and they would apply equally well to almost any VPS. You can opt to have Rails and all of its dependencies pre-installed, or just have a barebones server with Ubuntu.
The details of the actual steps to deploy depend on how you are planning on getting it done, and how much manual control you want over the process. Google is definitely your friend here.
Two tools that are definitely very useful are Dokku and Capistrano. Both are designed to automate deploying an app to a server from a local machine. Dokku is modeled to be a lot like a private version of Heroku (again, some flexibility is sacrificed for convenience). Capistrano is a little more involved to set up, but is very powerful and flexible once it’s working. I deployed the same app with both, to two different VPSs, just to see how it worked. Oddly enough, performance is noticeably better on the app deployed using Dokku (page load times of about 4s vs. 6s for the Capistrano deploy… step two of this project is doing database and query optimizations to get that down to a more acceptable time). I’m not sure why that difference exists, as the apps and databases are otherwise identical.
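One way to pin down a difference like that would be to time repeated requests against both servers instead of eyeballing a single page load. Here is a minimal sketch in Python, standard library only. The function names are mine, and the URLs in the usage note are placeholders, not the real apps. It measures the time to fetch the HTML response, which is not the same thing as a full page load in a browser.

```python
import statistics
import time
import urllib.request
from typing import Callable, Dict, List


def time_calls(action: Callable[[], object], repeats: int = 5) -> List[float]:
    """Run `action` `repeats` times and return each duration in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        action()
        timings.append(time.perf_counter() - start)
    return timings


def fetch(url: str, timeout: float = 30.0) -> bytes:
    """Download the response body so the timing includes the transfer."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def summarize(timings: List[float]) -> Dict[str, float]:
    """Median is less sensitive to one slow outlier than the mean."""
    return {
        "median": statistics.median(timings),
        "min": min(timings),
        "max": max(timings),
    }
```

Usage would look something like `summarize(time_calls(lambda: fetch("http://dokku-box.example.com/voters")))`, run once per server with the same path. If the medians still differ, the next things I would compare are the Postgres and app server settings on each box.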
Lessons learned
DevOps is such a huge skillset that it’s really a career path. However, a little bit of understanding the process goes a long way towards helping me understand even more about the frameworks and tools that I’m using to create my software.
Linux: I use the terminal all day in OS X (I just learned that it’s now called macOS as of 2016. Go figure). But setting up servers remotely requires operating exclusively in the terminal. Learning how to set up users, permissions, RSA keys and all of the admin tasks was step 1. It was a bad start: the first thing I did was accidentally turn off root SSH access without turning on SSH access for any other user. Oops. Locked out in the first 10 minutes. Eventually I got into the swing of things. I will say that with all the time I’ve spent SSHed into other machines this week, I am becoming a big fan of Vim.
This process felt a lot like TDD: try something, get an error, research the fix, move to the next error.
Deploying really reinforced the TDD aspect. As already mentioned I don’t have enough experience to write a guide, but I will say that Capistrano and Dokku are great. Dokku was definitely the easiest for me (Digital Ocean can even pre-load a server with Dokku). Capistrano was a little more involved, but seemed more transparent about what was happening, and a little more adaptable.
Challenges: Seeding the database was the hardest part of this whole process. I did learn a ton about how Postgres operates and some of the amazing features that it offers as a client/server service. Remember, the seed file CSV that I was using was 1.2 GB, and the seed rake task took about 2 hours on a MacBook Pro with heaps of resources. If possible, I wanted to avoid seeding the database from the CSV files, especially on a VPS with 512 MB of memory and a low-powered processor.
There are a lot of ways to make that seed happen, but again, it feels like the TDD process of fixing a chain of errors. In the end, I wrote a custom rake task that called on Postgres to make a database .dump from the fully seeded DB on my dev machine. I then used rsync to transfer that file over SSH to the server. I had another rake task set up that reversed the process on the server, taking the .dump file and using it to seed the database. That knocked the seeding process down to less than 5 minutes.
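To make the dump, transfer, and restore steps concrete, here is a rough sketch of the three commands involved. It is written as a small Python script rather than the rake tasks I actually used, so treat it as an illustration of the idea and not my exact setup. The database names, the `deploy@example.com` host, and the paths are all made up. The flags themselves are standard `pg_dump`, `rsync`, and `pg_restore` options.

```python
import shlex
import subprocess
from typing import List


def dump_command(db_name: str, dump_path: str) -> List[str]:
    """Local machine: write a compressed, custom-format archive of the DB."""
    return ["pg_dump", "--format=custom", "--no-owner",
            "--file", dump_path, db_name]


def transfer_command(dump_path: str, remote: str, remote_dir: str) -> List[str]:
    """Copy the dump over SSH; --partial keeps a partly sent file for retries."""
    return ["rsync", "-az", "--partial", "-e", "ssh",
            dump_path, f"{remote}:{remote_dir}/"]


def restore_command(db_name: str, dump_path: str) -> List[str]:
    """Server: drop existing objects (if any) and load the archive."""
    return ["pg_restore", "--clean", "--if-exists", "--no-owner",
            "--dbname", db_name, dump_path]


def run(command: List[str]) -> None:
    """Echo the command, then run it and raise if it exits non-zero."""
    print(shlex.join(command))
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Dry run: print the commands without executing them.
    # All names below are hypothetical placeholders.
    steps = [
        dump_command("voters_development", "voters.dump"),
        transfer_command("voters.dump", "deploy@example.com", "/home/deploy"),
        restore_command("voters_production", "/home/deploy/voters.dump"),
    ]
    for step in steps:
        print(shlex.join(step))
```

The first two commands run on the dev machine and the third runs on the server. Passing lists to `subprocess.run` instead of one shell string means a path with spaces in it won’t break the command.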
Conclusions:
Is this really a practical way to deploy? I suspect the answer is, as always, it depends. It took me two days to deploy my app the first time. Now that I know how to do it, and how to solve the errors that I run into, I suspect that I could get it done in an hour or so. The same deploy to Heroku could be done in a matter of minutes, if that. But there are definite advantages to having full control over the machine running your software. I think that it would really depend on the use case, and on the tradeoff between developer time and server budget.
I do feel a lot more comfortable with what is going on under the hood in a server now that I’ve seen the spectrum of deployment options. It’s another tool that I can reach for, and something that I suspect will come in handy in my career.
Call to action
I love developing and writing about what I’m doing.
- If you have feedback I would love to hear it, that’s what the comment section is for. Or send me an email: JWPincus@gmail.com
- If you liked this project, please click the clapper right below this.
- If you want to share it with the world: please do. Twitter, LinkedIn, Facebook, Myspace, Friendster, Reddit are all options. Not Yo, however.
- I moved up to Seattle, and am looking for a position. I’d love to talk development over coffee, if you’re in the area (even if you aren’t looking to hire). If you’re not near Seattle, I carry a phone with me at -literally- all times. Get in touch!
