Evolution of an internal tool

Internal tool doesn’t have to suck

Andrew Zheng
zurassic

--

Back a year ago a program I was working on was requested to add a new feature to support OBD-II. The software needs to support all the PIDs and DTCs in the SAE documents, a database is a natural choice to store all the information. I modeled the data and created the schema. The next step is to enter the data into the database. I created a windows program to let our engineer enter all the data:

Things went well, our engineer entered about 100 data items then the project is kind of stopped: we were then working on other higher priority projects.

A year later, I’m requested to pick up the project. When I open the windows program to try to enter the data, I feel like I forget everything and I’m really confused using the program I wrote. I realize we have problems using this approach:

1. it’s not very suitable for version control.
We check in the actual database file (*.db3) and it’s not easy to diff a db3 file. The changes is not traceable.

2. it’s not efficient, it’s manual and it won’t scale.
I cannot imagine I need to enter an item by opening 3 dialogs and repeat this for 500 times.

Because programmers are lazy, I don’t want to enter the data manually, I begin to think how to improve the process.

At the core I need something:
- that can make the change to the database tracable and manageble.
- that can mass create database items or update database.

I realize Ruby on Rails has the similar concept: db:migrate and db:seed
I know I can use Ruby to write a DSL to solve the program. However, our engineers don’t know Ruby, it’s not very convennient for them to write DSL in ruby.

I need to find something that’s easy for them.

The easist file format is always text file. Since we’re dealing with a set of data items that has common attributes, the format we want is csv.
And almost everyone knows how to use Excel to edit csv file.

So here is my final solution:
csv + ruby script

csv: used to enter data
ruby script: parse csv and populate database

The solution is evolved from a proprietary windows program to an open file format plus a script. It separates the data and the function importing the data.

The open file format and the separation brings several benefits:

- version control: since the data is stored as csv, it’s really easy to diff.

- data re-creation: at anytime you can delete all the rows in the database and I can easily import the whole data in a second. Also, I can dump the database using the same script to a csv file so a backup is easy to create.

- ability to quick view the data: you can just open the csv file to have a quick view of the data, instead of open the database and look at it in raw.

- programtically query the data: thanks to the awesome `sequel` gem, I can load the script in `irb` or `pry` and do high level database queries such as `map` and `reduce`. Think it as `rails console`.

Other notes on implementation:
On the csv file, I allow empty cells. Empty cell has default value for different columns. This reduces the noise of the csv file and it lets you easily spot the difference.

--

--

Andrew Zheng
zurassic

Full stack problem solver with a passion for simplicity. Personal site: http://zurassic.com