Interfacing OCaml and PostgreSQL with Caqti

On dealing with dependencies in your Dune-powered OCaml app and interfacing with the most popular DBMS in town.

Elephant. [source]

This article is part of Hands-on OCaml, a series of articles that I’m working on that is focusing on doing web app development with OCaml.

The project we will be building throughout the series is a To-Do List app, which connects to a PostgreSQL database as its datastore. In the previous article, we have covered initializing and bootstrapping our project with Dune; if you haven’t seen it, check out the link below:

This tutorial will build upon the foundation we have laid out in the previous article in the series. While you can of course follow along without actually doing the tutorial, it is recommended to give the article a read first to be sure that we have the required knowledge in place.

In this second article of Hands-on OCaml series, we will explore how to manage dependencies in OCaml project; in particular, we will bring in and use a third-party library to deal with DB operations.

Requirements

Assuming you followed the previous article, you should have both opam and jbuilder installed. Verify their installation as follows:

$ opam --version
1.2.2
$ jbuilder --version
1.0+beta20

We would also need to have a local PostgreSQL instance up and running. I’m assuming readers have had prior experience with PostgreSQL, so I’m not going to expand about it here, but you should be able to install it via your OS package manager (e.g. apt on Ubuntu or brew on MacOS) or via a Docker container. Verify that it is running and and you can connect to it via psql. At the time of this writing, I am using locally installed PostgreSQL 10.4.

Preparing our project

In this section we’ll see how we will prepare our project. Again, this tutorial assumes that you have done the initial Dune setup tutorial. If you haven’t, do check it out. You might also want to remove the previously created .mli and .ml files from bin and lib since we won’t need them anymore.

Our first opam file

First off, we will fetch some dependencies! Unlike the previous tutorial where we install packages directly, we are going to use a different and cleaner method of installing dependencies: through opam files.

Make sure you’re in the todolist directory, create a file named todolist.opam with the following contents:

This opam file describes our project in a minimal way. We only defined the project’s metadata like name and version, as well a list of dependencies of our project. We can of course directly invoke opam install lwt caqti <etc.>, but listing them this way will make it more reproducible and let us avoid enumerating dependencies, especially if the list becomes very large.

For a more thorough documentation on opam files, you can consult the Packaging docs of opam (or here for the opam v2 version).

In this particular instance, we describe that our project depends to a number of packages, namely:

  1. lwt, a concurrency library for OCaml. Lwt makes it a breeze to deal with non-blocking, asynchronous operations. At the point of this writing, the latest version is 4.0.1.
  2. lwt_ppx, the syntax extension (PPX) for Lwt which allows us to conveniently write asynchronous code sequentially, just like synchronous code. At the point of this writing, the latest version is 1.2.0.
  3. caqti, a library that provides type-safe abstraction for interfacing with databases. There are a number of dedicated PostgreSQL client libraries out there that you can use, but Caqti is quite nice so I’m going with it. At the point of this writing, the latest version is 0.11.0.
  4. caqti-lwt, Caqti’s Lwt adapter for concurrency. Aside from Lwt, Caqti also supports an adapter for Async.
  5. caqti-driver-postgresql, as the name describes, is the PostgreSQL driver for Caqti. Caqti aims to be DB agnostic, so there are also drivers for other DB engines, such as MariaDB and Sqlite3.

Pinning your package and installing dependencies

Next, we will “pin” the todolist package.

One of opam’s neat features is the ability to pin a package to a specific version. A package can be pinned from a number of sources, such as a local directory, an archive URL, or a source control repo URL from git, mercurial, or darcs. Pinning packages is useful in development where you might want to install packages that don’t currently exist in the official opam repository.

In our case, we want to leverage the installation capability of opam pins to deal with our dependencies.

A little note: if you’re an opam v2 user and you’ve configured git in the directory, don’t forget to commit todolist.opam to make sure that opam can find it.

Run the following command on the todolist directory:

$ opam pin add -yn todolist .

The command will pin the package contained in the current directory (.) with the name todolist. The -yn flag is answering yes and no to the following prompts to create a new package (yes) and installing the package (no). We don’t need to install it yet, since we aim to hack the code and let Dune build our code interactively. However, we do want opam to install the required dependencies for us, so let’s do that:

$ opam install --deps-only todolist

This command will prompt you to fetch, build, and install the dependencies of the project we listed above. This might take a while!

Adding library dependency to Dune

Now that we have our dependency packages installed, we need to tell Dune about it. From the previous Dune setup article, we currently have bin and lib directories in our todolist directory. We will put the majority of our code in lib, so let’s open up our lib/jbuild file and change it into the following:

In particular, we listed the dependencies of lib inside the libraries stanza. Dune allows us to have a fine-grained dependencies per library as shown in this example. We also registered lwt_ppx as a preprocessor so that we can use Lwt’s syntax extensions for the code in lib.

With this, our preparation is complete! Let’s move on to the main course.

Communicating with PostgreSQL

We’re now ready to write some code to access the database.

Creating the interface

As I’m a big fan of API driven development, let’s start with defining the interface of our module. Here is the interface that we are going with, located in lib/todos.mli:

There’s nothing really fancy here, but I’ll try to go through it. I defined two types: todo, which is a record with two fields id and content, and error variant which have only one constructor, Database_error.

I also exposed convenience functions for migrations and rollbacks (i.e. creating and dropping table), but in reality you may want to use mature tools such as Sqitch for this.

For the core functionalities, we want to be able to get the list of todo entries, add/remove entries, and clear the list. There are many things we can add, but for the sake of simplicity let’s stick with only these.

You may notice that the return type of these functions are in the form of ('a, 'b) result Lwt.t. The type 'a Lwt.t indicates that our function returns a promise with value of type 'a, i.e. it does an asynchronous operation. The type ('a, 'b) result indicates an operation that can fail, where it can return either Ok value with value of type 'a or Error err with err of type 'b. This pattern is called Result (in some other languages it is also known as Either); it is a pretty common pattern for error-handling in functional programming languages, and is in some cases a safe alternative to raising exceptions.

So here, for example, (todo list, error) result Lwt.t on get_all means that get_all will return a promise that will eventually resolve into either a list of todos or an error.

Connecting to PostgreSQL

Assuming that the PostgreSQL instance is running correctly in our machine (localhost), put the following code inside lib/todos.ml.

In this code, we used the connect_pool function of Caqti_lwt module to get a pool of connections to the DB. Caqti also provides the connect function to get a single connection, but using a pool is better in that we can constrain our resource use.

Implementing the interface

We have the interface and the DB connection code in place, let’s fill lib/todos.ml with the rest of the implementation!

First, let’s define the types and a helper function:

Hopefully it’s straightforward. I defined the or_error function so that we don’t actually leak Caqti exceptions from our implementation, but instead we expose our own error type. This may or may not be desired, but for the purpose of this tutorial, I think it’s good enough.

Now, let’s write some queries!

At this point, we have provided the implementations for migrate and rollback functions. The pattern that I want to show you in using Caqti is that you need to define your query, the function to execute that query given a connection, and the Caqti_lwt.Pool.use call that will take a connection from the pool and feed it to the execution function.

The query part is interesting: it allows you build a type-verifiable query. In this case, Caqti_request.exec means that you want to execute a statement without returning a value, and since both CREATE TABLE and DROP TABLE does not require a parameter we put Caqti_type.unit as the first argument of it. We then provide the SQL in the form of a string ({| … |} is OCaml’s syntax for multiline strings). We’ll deal with parameters and return types in the next part.

Also note that I stubbed get_all, add, remove, and clear with failwith just so that we can get the code to compile.

Let’s try it out interactively! Invoke utop via Dune using the command jbuilder utop lib.

$ jbuilder utop lib
utop #

First off, let’s try our migrate function:

utop # open Lib;;
utop # Todos.migrate ();;
- : (unit, Todos.error) result = Ok ()

Neat, it worked! You can check through psql -c "\dt" and see that the todos table is really created. You may also have noticed that utop conveniently resolved Lwt promises for us, so that we can immediately see the result. Let’s try migrating again, this time it should show an error:

utop # Todos.migrate ();;
- : (unit, Todos.error) result =
Error
(Lib.Todos.Database_error
"Request to <postgresql://localhost:5432> failed: ERROR: relation \"todos\" already exists\n Query: \" CREATE TABLE todos (\\n id SERIAL NOT NULL PRIMARY KEY,\\n content VARCHAR\\n )\\n \".")

It errored as expected because todos already exists. Seems like our code handled the error case just fine. The string is a bit muddled since it escaped the pretty-formatted characters, but we could more or less still read it. What about rolling back?

utop # Todos.rollback ();;
- : (unit, Todos.error) result = Ok ()

It is successfully executed. Now if you checked psql -c "\dt" again you’ll see that the todos table is no longer listed.

Let’s continue filling the rest of the implementation:

For get_all, since we want to get the returned values, we use collect instead of exec. The query means I want to build a collect query that accepts no parameters (Caqti_type.unit) and returned a two-tuple of int and string (Caqti_type.(tup2 int string)). On the executing function, we use fold to actually build the list of todos from the returned tuples.

For add, we now need a query parameter. To read it, add_query is a query that returns nothing (Caqti_request.exec) and accepts a parameter of type string (Caqti_type.string). The placeholder for the given parameters is denoted by question marks (?) in the query string. Note that we now pass the parameter through the executing function (add’).

remove is also similar: it accepts a parameter id of type int and returns unit.

For clear, it accepts unit and returns unit.

We now have all the functions implemented! Let’s try it out interactively via utop. Assuming that the previous session is still open (if not, invoke jbuilder utop lib and open Lib;; again):

utop # Todos.get_all ();;
- : (Todos.todo list, Todos.error) result = Ok []
utop # Todos.add "Learn OCaml";;
- : (unit, Todos.error) result = Ok ()
utop # Todos.add "Build the next big thing";;
- : (unit, Todos.error) result = Ok ()
utop # Todos.get_all ();;
- : (Todos.todo list, Todos.error) result =
Ok
[{Lib.Todos.id = 2; content = "Build the next big thing"};
{Lib.Todos.id = 1; content = "Learn OCaml"}]
utop # Todos.remove 2;;
- : (unit, Todos.error) result = Ok ()
utop # Todos.get_all ();;
- : (Todos.todo list, Todos.error) result =
Ok [{Lib.Todos.id = 1; content = "Learn OCaml"}]
utop # Todos.clear ();;
- : (unit, Todos.error) result = Ok ()
utop # Todos.get_all ();;
- : (Todos.todo list, Todos.error) result = Ok []

Phew! Seems like all is working well. Congratulations on building your own (very basic) To-Do List app with OCaml!

Building simple migration executables

As a bonus, let’s try making the migration and rollback functions available as executables. Create a bin/migrate.ml with the following contents:

Also create bin/rollback.ml with the following:

Last, replace the bin/jbuild file with the following contents:

Now we can run migrations and rollbacks with jbuilder exec, like this:

$ jbuilder exec rollback
Dropping todos table.
Done.
$ jbuilder exec migrate
Creating todos table.
Done.

Isn’t that neat?

In practice, I often see binary executables’ entry points are implemented as tiny wrappers of the library, effectively containing very little logic, which is exactly what we did here. To rename the executables, you can change the ones in public_names stanza, and it doesn’t have to be the same with the names in, uh, names.

We can actually have only a single entry point that supports arguments and parameters (which involves the use of command parsing library such as the builtin Arg module, cmdliner, or Jane Street Core’s Command module), and converting our current executables to that could be a nice exercise.

What’s next?

This concludes our journey with OCaml, Dune, and PostgreSQL.

I see Caqti as a neat library to make sure my types are correct on the edges of my query, but if that confuses you, you can perhaps take a look at my Ezpostgresql library which aims to be more beginner-friendly but less type-safe than other alternatives. Unfortunately Ezpostgresql is not on opam yet, so you would need to pin it to be able to use it in your projects.

In the next article of the series, I will be exploring on how to expose our app in as a JSON API through the use of the Opium, a Sinatra-like web toolkit for OCaml. I will update this section with a link to that post when it becomes available.


Blocked? Have questions? A few channels of communication you can try: discuss.ocaml.org official Discourse forum, the ReasonML Discord, /r/ocaml on Reddit, and #ocaml on Freenode. The community is full of friendly folks more than willing to help you out! You can also hit me on Twitter to ask anything.

So that’s it for today, folks! Thanks for reading, I hope you get something out of this post. Tell me in the comments if I can improve anything.