Seed Knex PostgreSQL Database with JSON Data

How to seed multiple data files with a one-to-many relationship

Knex.js is a great JavaScript SQL query builder that can work with your Node.js applications. The database queries are promise-based, which can be difficult to grapple if you haven’t worked with JavaScript promises before.

We’ll go over how to integrate Knex with a PostgreSQL database and how to seed the database with JSON-formatted data. The data is not normalized by default, but this post shows how to use a one-to-many relationship with foreign keys to normalize the data. For more information about data normalization, check out this article.

The aim for this database would be to live in a Node.js application, such as an Express server backend.

The Data Model

There are two files with JSON data: a merchants file and a products file. Each product belongs to one, unique merchant, and each merchant can have many products, hence the one-to-many relationship. Many datasets, as with this one, are not normalized; they do not utilize foreign keys, which can lead to a decrease in data integrity.

Here is the JSON data for the merchants:

// merchants.js
module.exports = [
{name: ‘River North’},
{name: ‘Crow and Nail’},
{name: ‘Spotted Sparrow’},
{name: ‘Silver Buckle’}
];

This is the JSON data for the products, which references merchant’s names:

// products.js
module.exports = [
{
merchant: ‘Spotted Sparrow’,
name: ‘Hand-bound notebook with blank ivory pages’,
price: 2050
},
{
merchant: ‘River North’,
name: ‘Rose-scented beeswax candle’,
price: 929
},
{
merchant: ‘Silver Buckle’,
name: ‘Oxford button-down shirt in red’,
price: 3199
},
{
merchant: ‘Crow and Nail’,
name: ‘Recycled wood trestle bench’,
price: 120099
},
{
merchant: ‘Crow and Nail’,
name: ‘Maple wooden spoon’,
price: 2099
}
];

For this data to be normalized, we want to change the merchant property for each product to reference the id of the merchant in the merchants table. In the schema below, the merchant_id field in the products table is the foreign key that references the merchant.

Schema design tool used: http://ondras.zarovi.cz/sql/demo/

Knex Configuration

To get started, you need to install Knex and the PostgreSQL client. Knex offers compatibility with multiple databases, such as MySQL and others, but here we’re using PostgreSQL. Create a new directory, and within that directory, run:

npm init --yes
npm install --save knex pg
npm install knex -g
knex init

The -g flag on the second knex command installs knex globally so you can use knex commands from the command line. The knex init command creates a configuration file, which sets up how knex interacts with various environments (only development here) and the PostgreSQL database.

// knexfile.js
module.exports = {
  development: {
client: ‘pg’,
connection: ‘postgres://localhost/merch’,
migrations: {
directory: __dirname + ‘/db/migrations’
},
seeds: {
directory: __dirname + ‘/db/seeds/development’
}
}
};

The database name used here is merch, which you can create with the PostgreSQL commands:

psql
CREATE DATABASE merch;

Now you can start building tables with migrations and seeding data!

Build the Tables: Knex Migration

A migration creates a database schema (the structure of the database), which we need before we start seeding data. You can add/remove tables, add/remove columns, change column data types, etc. To create the migration file, run:

knex migrate:make merchants_products

For the dataset we’re using, the migration needs to create two tables: merchants and products, and each table has an auto-incrementing primary key. The notable part is the one-to-many relationship where the merchant’s primary key used as a foreign key (merchant_id) for each product.

// 20170517204151_merchants_products.js
exports.up = function(knex, Promise) {
return Promise.all([
knex.schema.createTable(‘merchants’, (table) => {
table.increments(‘id’).primary();
table.string(‘name’);
}),
    knex.schema.createTable(‘products’, (table) => {
table.increments(‘id’).primary();
table.string(‘name’);
table.integer(‘price’);
table.integer(‘merchant_id’).unsigned()
.references(‘merchants.id’);
})
]);
};
exports.down = function(knex, Promise) {
return Promise.all([
knex.schema.dropTable(‘products’),
knex.schema.dropTable(‘merchants’)
]);
};

To run the migration, use the command:

knex migrate:latest

The default environment is development, which works for us here, but if you need to specify a different environment, such as a test environment, then you can use the env flag in the migration command like:

knex migrate:latest --env test

After you run the migration, you can go into the PostgreSQL command-line client and view the tables (through psql). The products table looks like:

merch=# SELECT * FROM products;
id | name | price | merchant_id
— — + — — — + — — — -+ — — — — — — -
(0 rows)

And the merchants table looks like:

merch=# SELECT * FROM merchants;
id | name
— — + — — —
(0 rows)

Now our tables are set up and ready for data!

Seed Data

The seed file is where things can get tricky. As I mentioned before, knex queries return promises. So you have to make sure with each query you make in your seed file that you resolve the promises. If you don’t resolve a promise, then your seed file will finish executing all of the queries and prevent data from being seeded.

Here is the pseudocode for the seed file:

// Seed file pseudocode
1. Load the JSON data from each data file
2. Create a knex seed (and export it)
3. Delete any existing data in all of the tables
4. Seed all of the merchants first because we need the primary key to exist as a foreign key for each product
5. Seed each product
- Create an empty array where each promise will go
- Find the merchant 'id' for the current product being inserted
- Insert the product with the merchant 'id' as the foreign key
6. Resolve all of the product's promises

To generate a seed file for the development environment with the name merchants_products, run the command:

knex seed:make merchants_products --env development

Here is the actual seed file code:

// merchants_products.js
const merchantsData = require(‘../../../data/merchants’);
const productsData = require(‘../../../data/products’);
exports.seed = function(knex, Promise) {
return knex(‘products’).del()
.then(() => {
return knex(‘merchants’).del();
})
.then(() => {
return knex(‘merchants’).insert(merchantsData);
})
.then(() => {
let productPromises = [];
productsData.forEach((product) => {
let merchant = product.merchant;
productPromises.push(createProduct(knex, product, merchant));
});
    return Promise.all(productPromises);
});
};
const createProduct = (knex, product, merchant) => {
return knex(‘merchants’).where(‘name’, merchant).first()
.then((merchantRecord) => {
return knex(‘products’).insert({
name: product.name,
price: product.price,
merchant_id: merchantRecord.id
});
});
};

One thing to note is if you give insert() an array of objects, like with .insert(merchantsData), then the order that the objects are inserted is not guaranteed.

Let’s run the seed file!

knex seed:run

We can check the result using psql. The merchants table should look like:

merch=# SELECT * FROM merchants;
 id |      name
----+-----------------
1 | River North
2 | Crow and Nail
3 | Spotted Sparrow
4 | Silver Buckle
(4 rows)

The products table should look like:

merch=# SELECT * FROM products;
 id |                 name                    | price  | merchant_id
----+-----------------------------------------+--------+------------
1 | Hand-bound notebook with blank ivory... | 2050 | 3
2 | Oxford button-down shirt in red | 3199 | 4
3 | Recycled wood trestle bench | 120099 | 2
4 | Rose-scented beeswax candle | 929 | 1
5 | Maple wooden spoon | 2099 | 2
(5 rows)

As you can see, a merchant’s primary key (id) is referenced as the merchant_id in the products table that corresponds to the correct merchant. Now the JSON data is seeded into the normalized database!

Take a look at the source code in this GitHub repo.

One clap, two clap, three clap, forty?

By clapping more or less, you can signal to us which stories really stand out.