Couchimport Revisited

Adding one-to-many transforms to my CouchDB command-line tool

The couchimport command-line tool is a popular way of importing structured data (CSV/TSV) from a file, spreadsheet, or database into Apache CouchDB™ or IBM Cloudant. We’ve previously covered the tool here on Medium. In this article, I’ll provide a quick refresher on its functionality, and describe a helpful feature contributed by the community.


First, create your destination database and make a note of the URL (e.g., https://user:pass@host.cloudant.com) and your database name (e.g., mydb).

Then pipe your data file into couchimport, supplying the URL and database name you noted above.

As long as the first line of your input file contains the column headings, you should end up with one JSON document in your database for each subsequent line. The column headings from the first line become the attribute names of each document.

| name          | town          |     lat |    long |
| ------------- |:-------------:| -------:| -------:|
| Bob | London | 51.5072 | -0.1275 |
| Frank | Bolton | 53.5789 | -2.429 |
| Susan | Dover | 51.1295 | 1.3089 |

This tabular data would produce one JSON document per row, each with name, town, lat, and long attributes.
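The mapping from headings to attributes can be sketched in a few lines of JavaScript. This is only an illustration of the idea, not couchimport's own parser: the rowsToDocs helper is hypothetical, and the input is assumed to be tab-separated. Note that every value arrives as a string, and that CouchDB adds its own _id and _rev fields when the documents are stored.

```javascript
// Illustrative only: a hand-rolled header-to-attribute mapping,
// not the parser that couchimport actually uses.
const input = [
  'name\ttown\tlat\tlong',
  'Bob\tLondon\t51.5072\t-0.1275',
  'Frank\tBolton\t53.5789\t-2.429'
].join('\n');

// rowsToDocs is a hypothetical helper name chosen for this sketch.
function rowsToDocs(text, delimiter = '\t') {
  const [headerLine, ...dataLines] = text.split('\n');
  const headings = headerLine.split(delimiter);
  return dataLines.map((line) => {
    const values = line.split(delimiter);
    const doc = {};
    // Each heading becomes an attribute; values stay as strings.
    headings.forEach((heading, i) => { doc[heading] = values[i]; });
    return doc;
  });
}

const docs = rowsToDocs(input);
console.log(JSON.stringify(docs, null, 2));
```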

Transforms

The couchimport tool also allows a transform function to be used to modify the data before it is added to the database. For example, we could create a JavaScript file that exports a function converting the lat and long values from strings to numbers.
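A minimal sketch of such a file might look like the following. The file name mytransform.js is an assumption, and the code assumes the lat and long fields arrive as strings, as in the sample table above.

```javascript
// mytransform.js (hypothetical file name)
// Called once per row: receives the parsed document and
// returns the version that should be written to the database.
const mytransform = function (doc) {
  // Coerce the coordinate strings into numbers.
  doc.lat = parseFloat(doc.lat);
  doc.long = parseFloat(doc.long);
  return doc;
};

module.exports = mytransform;
```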

And run another import, this time pointing couchimport at the transform file.

Our mytransform function is called with every object before it is added to the database, allowing us to:

  • Add or remove fields
  • Strip whitespace
  • Coerce data types (in this example, converting strings to numbers)
  • Filter out rows altogether
  • Turn the source data into a different form (e.g., GeoJSON)
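As an example of the last two ideas combined, here is a sketch of a transform that strips whitespace and reshapes each row as a GeoJSON Feature. The function name toGeoJSON and the choice of which fields go into properties are assumptions made for illustration; the code also assumes every row has name, town, lat, and long values.

```javascript
// Hypothetical transform: trims whitespace and reshapes each
// row as a GeoJSON Feature with a Point geometry.
const toGeoJSON = function (doc) {
  return {
    type: 'Feature',
    geometry: {
      type: 'Point',
      // GeoJSON positions are ordered [longitude, latitude].
      coordinates: [parseFloat(doc.long), parseFloat(doc.lat)]
    },
    properties: {
      name: doc.name.trim(),
      town: doc.town.trim()
    }
  };
};

module.exports = toGeoJSON;
```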

One-to-many

The latest version of couchimport allows a single row of data to generate multiple documents, if the transform function returns an array of objects.

Let’s say we want to create separate documents for the person objects and the location objects. We would create a new transform.js that returns an array containing the objects we want to insert.
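One way to write that transform is sketched below. How the fields are divided between the two documents, and the type attribute used to tell them apart, are assumptions made for this example rather than anything couchimport requires.

```javascript
// transform.js
// Returns an array, so each input row yields two documents.
const transform = function (doc) {
  // The 'type' attribute is a hypothetical convention for
  // distinguishing the two kinds of document.
  const person = {
    type: 'person',
    name: doc.name
  };
  const location = {
    type: 'location',
    town: doc.town,
    lat: parseFloat(doc.lat),
    long: parseFloat(doc.long)
  };
  return [person, location];
};

module.exports = transform;
```

Calling transform with the first row of the sample table returns a person object for Bob and a location object for London.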

And run the import again with the new transform.

This time, we get two documents for each line of input: one describing the person and one describing the location.

Thanks to Gregor Martynus for this latest change. 😀