Ephemeral databases using Spawn

Andrew Farries
Apr 9 · 5 min read
Image for post
Photo by Artem Kovalev on Unsplash

Spawn is a new service in development at Redgate Foundry. Spawn provisions ephemeral, cloud-hosted database servers that can be snapshotted and rolled back in seconds. Instead of managing and administering database server infrastructure themselves, developers request a new database server from the command line and get a connection string back within seconds.

Spawn offers many advantages for teams accustomed to shared development environments or to local database servers running on developers’ machines. This article is a case study of how a team at Redgate moved from local database servers and shared environments to using Spawn to create ephemeral, reproducible database servers on demand.

The application is a simple web app consisting of a backend API server, which talks to a SQL Server instance, and a web frontend:

Image for post

Each developer runs a local instance of SQL Server on their development machine. The databases on these instances are typically quite small and contain only ‘toy’ data; they are not representative of the databases that will be used in production. In addition to these dedicated local instances, there are some SQL Server instances shared by all developers on the team. These shared instances typically contain larger, more realistic databases that are representative of what will be found in production. The situation is summarised in this image:

Image for post

Spawn is a good fit for improving the developer experience in such a setup, for these reasons:

  1. Reproducibility of development environments: with each team member running their own SQL Server instance, there is no easy way to share database environments between team members. Consequently, a bug that is reproducible on one developer’s machine may not be reproducible on someone else’s. Spawn allows all team members to use the same data image, ensuring consistency in the database layer across all developers’ machines.
  2. Cheaper integration tests: the test pyramid is predicated on the assumption that unit tests are the fastest and most reliable tests to run. While this still holds true with Spawn and other container technologies, they make integration tests easier to write and less flaky, and they remove the need to maintain external infrastructure. This makes integration testing more viable, increasing confidence in the system as a whole.

The first step to adopting Spawn within a development team is to create a data image. While it is possible to use a backup as the source for the image, it is preferable to create it from a folder of scripts so that those scripts can be placed under source control.
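
As a rough illustration of what such a scripts folder might contain, the sketch below writes three small SQL scripts into a folder. Everything in it is hypothetical: the helper name make_scripts_folder, the file names, the Development database, the Customers table and the sample row. The numeric prefixes assume that the scripts are applied in file-name order and that each file runs as its own batch; the article does not state either, so check both against the Spawn documentation.

```shell
#!/bin/sh
# Hypothetical helper: write an example scripts folder for a data image.
# Assumptions (not from the article): scripts run in file-name order,
# and each file is executed as a separate batch.
make_scripts_folder() {
    dir="${1:-./scripts}"
    mkdir -p "$dir" || return 1

    # 1. Create the database.
    cat > "$dir/001_create_database.sql" <<'SQL'
CREATE DATABASE Development;
SQL

    # 2. Create the schema inside it.
    cat > "$dir/002_create_schema.sql" <<'SQL'
USE Development;
CREATE TABLE dbo.Customers (
    Id INT IDENTITY(1,1) PRIMARY KEY,
    Name NVARCHAR(100) NOT NULL
);
SQL

    # 3. Load a small amount of seed data.
    cat > "$dir/003_seed_data.sql" <<'SQL'
USE Development;
INSERT INTO dbo.Customers (Name) VALUES (N'Example Customer');
SQL
}
```

Running make_scripts_folder ./scripts once produces a folder that can be committed alongside the application code and extended as the schema evolves.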

With the scripts folder in place, the next step is to use it to create the image. This is done by creating a simple .yaml file describing the image to be created:

sourceType: scripts
name: development-environment
engine: mssql
teams:
- "red-gate:dev-team-one"
scriptsFolders:
- ./scripts

We can indicate which GitHub teams should have access to the image by specifying the teams field. An image can be shared with multiple GitHub teams; here the image is shared with just one team, red-gate:dev-team-one.

The source scripts folder and this .yaml file should be versioned alongside the application code.

With this .yaml file created, use Spawn to generate the data image:

$ spawnctl create data-image -f development-environment.yaml

Once the image is created, each member of the team can create new data containers from it. To create a new data container from the image:

$ spawnctl create data-container -i development-environment -n dev-environment-for-joe-bloggs

Once the command completes you should see a connection string. This is the connection string for your own private data container based on the shared data image.
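
Since every developer creates a container from the same image, a small wrapper can keep container names consistent across the team. The sketch below is one possible approach, not part of Spawn itself: dev_container_name and create_dev_container are hypothetical helpers, and lower-casing the user name and replacing other characters with hyphens is a cautious assumption about what makes a safe container name. The spawnctl invocation is the one shown above.

```shell
#!/bin/sh
# Hypothetical helper: derive a per-developer container name.
dev_container_name() {
    # Use the first argument if given, otherwise the current user name.
    user="${1:-$(id -un)}"
    # Lower-case the name and replace anything outside a-z and 0-9 with '-'.
    # (Assumed naming convention; Spawn's real naming rules may differ.)
    slug=$(printf '%s' "$user" | tr '[:upper:]' '[:lower:]' | tr -c 'a-z0-9' '-')
    printf 'dev-environment-for-%s\n' "$slug"
}

# Hypothetical wrapper: create a container from the shared image
# using the derived name.
create_dev_container() {
    spawnctl create data-container \
        -i development-environment \
        -n "$(dev_container_name "$1")"
}
```

For example, create_dev_container "Joe Bloggs" would request a container named dev-environment-for-joe-bloggs, while calling it with no argument falls back to the current login name.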

One of the key advantages of working with Spawn-hosted instances over instances that you manage yourself is the ability to snapshot and roll back the state of your databases. This ensures that your instance is always in a consistent, known state for your development workflow. To create a snapshot of your data container:

$ spawnctl save data-container <your container name>

Once this snapshot is created, you can continue to work with your data container as normal but with the added safety net that you can restore your instance to the last known state at any time. To restore your data container to the last saved revision:

$ spawnctl reset data-container <your container name>
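
The save and reset steps can be combined so that a risky command, such as a test run or a trial migration, always leaves the container as it found it. The following is a minimal sketch under stated assumptions: run_with_rollback is a hypothetical helper, and it assumes that reset takes the same data-container resource type as save.

```shell
#!/bin/sh
# Hypothetical helper: run a command, then roll the container back.
# Usage: run_with_rollback <container name> <command> [args...]
run_with_rollback() {
    container="$1"
    shift

    # Save the current state as the revision to return to.
    spawnctl save data-container "$container" || return 1

    # Run the caller's command and remember its exit status.
    status=0
    "$@" || status=$?

    # Roll back whether or not the command succeeded.
    # Assumption: reset accepts the `data-container` resource type, like save.
    spawnctl reset data-container "$container" || return 1

    return "$status"
}
```

A call such as run_with_rollback dev-environment-for-joe-bloggs ./run-integration-tests.sh (where the test script is a placeholder) returns the script's exit status, so it can be used directly in a build step.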

Having followed these steps, the database setup for each developer on the team looks like this:

Image for post

Each developer is now using their own cloud-hosted development database server. These instances can be snapshotted and rolled back independently of each other. Each instance shares the same base data image, ensuring consistency of schema and data between the instances.

Using Spawn to host reproducible, ephemeral database instances has improved the developer experience significantly:

  • There is no longer a need to run a SQL Server instance locally in order to do application development. This makes setup and onboarding of new team members less onerous.
  • Development SQL Server instances are no longer shared between team members. Shared instances suffer from team members treading on each other’s toes, and can quickly get into inconsistent states when accessed by different versions of the application.
  • There is no infrastructure to manage; all database instances are hosted and ephemeral — they can be taken down and recreated in seconds.
  • All developers can use the same data image for their development databases, but have separate data containers. Having everyone on the team using the same image helps with reproducibility of issues caused by the data in the database — bugs should be easily reproducible by all team members because they all have the same test data.
  • The shared data image only needs to be created once, so it can be a large, realistic example of a customer environment. Recreating such an environment from scratch multiple times, once per developer, can be very time consuming.
  • Taking snapshots of data containers and periodically resetting to them keeps development databases in a clean, consistent, known state.

We are recruiting users for the Spawn beta program. Sign up here!

Icons made by Freepik, Becris & Prettycons from www.flaticon.com

Originally published at https://spawn.cc.
