Migrating with user-defined functions

Rob Sutter
The document-relational database blog
Sep 10, 2021

Modern applications frequently change as you deliver features and fixes to customers. In this series, you learn how to implement migrations in Fauna, the data API for modern applications.

The first post in this series introduces a high-level strategy for planning and running migrations. In this post, you learn how to implement migration patterns using user-defined functions (UDFs) written in the Fauna Query Language (FQL).

All of the code in this series is available on GitHub in Fauna Labs.

Migration scenario

Imagine you use an application to manage computers and appliances that communicate on your company’s network. One of your domain objects is a firewall rule, which permits or denies inbound traffic to a resource on a given port from a specific range of IP addresses. The range of IP addresses is provided using CIDR notation and stored in your Fauna database as an FQL string.

For this post, you have an updated user requirement to permit or deny inbound traffic on a particular port from an arbitrary number of IP address ranges. To satisfy this requirement, you must migrate the ipRange field type to an FQL array of strings. This demonstrates a common migration, changing a singleton to a collection as requirements evolve.

Prerequisites

To follow along with this post you must have access to a Fauna account. You can register for a free Fauna account and benefit from Fauna’s free tier while you learn and build. You do not need to provide payment information until you upgrade your plan.

You do not need to install any additional software or tools. All examples in this post can be run in the web shell in the Fauna dashboard.

Create and populate your database

Create a new database in the Fauna dashboard. Do not select the “Pre-populate with demo data” checkbox.

Select the Shell tab (>_) to open the web shell. Copy and paste the following FQL into the editor window and choose Run Query to add three basic firewall rules.

Do(
  Create(
    Collection("firewall_rules"),
    {
      data: {
        action: "Allow",
        port: 80,
        ipRange: "0.0.0.0/0",
        description: "Universal HTTP access"
      }
    }
  ),
  Create(
    Collection("firewall_rules"),
    {
      data: {
        action: "Allow",
        port: 443,
        ipRange: "0.0.0.0/0",
        description: "Universal HTTPS access"
      }
    }
  ),
  Create(
    Collection("firewall_rules"),
    {
      data: {
        action: "Allow",
        port: 22,
        ipRange: "192.0.2.0/24",
        description: "Allow SSH from company headquarters"
      }
    }
  )
)

Choose the Collections tab and select the firewall_rules collection. You should see three documents, each describing one of the previous firewall rules.
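
The query above assumes that a collection named firewall_rules already exists in your database. If Fauna reports that the collection cannot be found, one way to create it, as a minimal sketch, is to run the following as its own query in the web shell and then run the previous query again.

```fql
CreateCollection({ name: "firewall_rules" })
```

Running it as a separate query avoids creating the collection and writing to it in a single transaction.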

Encapsulating your data

In the first post in this series, you learn always to access your data via user-defined functions (UDFs). Before migrating your database, create UDFs that provide create, retrieve, update, and delete functionality.

Note: Do not abstract your UDFs to generic functions that wrap all operations across all collections! This approach introduces additional complexity and tightly couples your UDFs. Creating one UDF for each data access pattern aligns your UDFs with your business rules, making them easier to manage and more likely to be correct. This is especially true if you access your database via GraphQL, as you learn in the next post.

Create

Select the Functions tab in the Fauna dashboard and choose New function. Enter create_firewall_rule as the Function Name, leave the default value for Role, and paste the following FQL as the Function Body. Choose Save to create the UDF in your database.

Query(
  Lambda(
    "new_rule",
    Create(
      Collection("firewall_rules"),
      Var("new_rule")
    )
  )
)

This creates a UDF that accepts one parameter object, new_rule, and stores its value as a new document in the collection firewall_rules. This mirrors the format of the FQL Create() primitive, which accepts a collection reference and a parameter object.

Passing a single object containing the required parameters is a best practice. Do not deconstruct the object into multiple parameters. Deconstructing can lead to incompatibilities, and future calls may fail if they do not provide the right parameters in the right order.
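
To illustrate the difference, here is a hypothetical variant that deconstructs the rule into positional parameters. The function name create_firewall_rule_positional and its parameter order are assumptions made for this sketch only; they are not part of the code in this series.

```fql
// Hypothetical alternative, shown only for contrast. Not recommended.
Query(
  Lambda(
    ["action", "port", "ipRange", "description"],
    Create(
      Collection("firewall_rules"),
      {
        data: {
          action: Var("action"),
          port: Var("port"),
          ipRange: Var("ipRange"),
          description: Var("description")
        }
      }
    )
  )
)

// Callers must now supply every value, in exactly this order.
Call(
  "create_firewall_rule_positional",
  ["Deny", 25, "0.0.0.0/0", "Deny SMTP"]
)
```

Adding or reordering a field in this variant changes the function's parameter list and can break every existing caller, whereas the single-object version keeps accepting the same call shape.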

Test your UDF by selecting the Shell tab (>_) and pasting the following FQL query into the code editor. Choose Run Query to call your function and create a new firewall rule.

Call(
  "create_firewall_rule",
  {
    data: {
      action: "Deny",
      port: 25,
      ipRange: "0.0.0.0/0",
      description: "Deny SMTP"
    }
  }
)

Return to the Collections tab and select the firewall_rules collection. You should now see four documents, including the new rule. Copy the id of the new rule for use in the next section.

Retrieve

Select the Functions tab in the Fauna dashboard and choose New function. This time, enter get_firewall_rule as the Function Name, leave the default value for Role, and paste the following FQL as the Function Body. Choose Save to create the UDF in your database.

Query(
  Lambda(
    ["id"],
    Get(
      Ref(Collection("firewall_rules"), Var("id"))
    )
  )
)

This creates a UDF that accepts one parameter, id, and retrieves the entire associated document. Test your new UDF by selecting the Shell tab (>_) and pasting the following FQL query into the code editor, replacing <some_id> with the id you copied in the previous section.

Call("get_firewall_rule", "<some_id>")

Choose Run Query to call your UDF. Your function should return the firewall rule you provided, along with its reference and timestamp.

Update

Return to the Functions tab and create another function named update_firewall_rule with the following FQL as the Function Body.

Query(
  Lambda(
    ["id", "new_rule"],
    Update(
      Ref(Collection("firewall_rules"), Var("id")),
      Var("new_rule")
    )
  )
)

This UDF differs slightly from the first two. It accepts two parameters, the id of the document to update, and a data object new_rule containing the fields to be updated.

Note: Updates in Fauna are not destructive. The fields of the provided document are merged with the existing document. To remove a field from a document, set the value of the field to null when updating the document.

You can prove this by testing your new UDF. In the web shell, paste the following FQL query, again replacing <some_id> with the id you copy when creating your firewall rule.

Call(
  "update_firewall_rule",
  [
    "<some_id>",
    { data: { action: "Allow", description: null } }
  ]
)

Choose Run Query to call your UDF. Your function should return the updated document, with the action field modified to “Allow”, the description field removed, and the port and ipRange fields unchanged.

Delete

Create another UDF named delete_firewall_rule from the Functions tab with the following FQL.

Query(
  Lambda(
    ["id"],
    Delete(
      Ref(Collection("firewall_rules"), Var("id"))
    )
  )
)

Test your UDF by pasting the following FQL in the web shell, replacing <some_id> with the id you copy in a previous step.

Call("delete_firewall_rule", "<some_id>")

Choose Run Query to call your UDF, and Fauna removes the document from your database. You should have three documents in your firewall_rules collection and four UDFs — one for each CRUD primitive.

At this point, you should modify your client code to call the UDFs you create and avoid all calls to the FQL primitives. You have not changed any functionality in your application, but you are now prepared to make changes safely and perform migrations!

Migrating in steps

Migrating in steps reduces the risk of each stage of your migration.

  1. Create a UDF that accepts a reference to a previous version of an object and updates it to the new format.
  2. Modify the four UDFs you create in the previous section to call the migration function.
  3. Create another UDF that verifies that your migration was successful. This is similar to a unit test of your migration UDF.
  4. Remove any fields that you have deprecated with your migration.

Populating new field values

In the previous post, you learn that the first step in a migration is creating a UDF that populates the value of your new field from existing data according to your business logic. In this example, the business rule is to convert the string stored in ipRange to an array with one element, the existing value.

Create a UDF named “migrate_firewall_rule” with the following FQL code as the body.

Query(
  Lambda(
    ["firewall_rule_ref"],
    Let(
      {
        doc: Get(Var("firewall_rule_ref")),
        ipRange: Select(["data", "ipRange"], Var("doc"))
      },
      If(
        IsArray(Var("ipRange")),
        Var("doc"),
        Update(
          Var("firewall_rule_ref"),
          { data: { ipRange: [Var("ipRange")] } }
        )
      )
    )
  )
)

The UDF accepts a reference, retrieves the specified document, and extracts the ipRange field. It determines whether the ipRange field is already an array. If so, it returns the unmodified document. If not, it updates the value of the ipRange field in the document to an array consisting of one element, the current value of ipRange. The IsArray() check makes this function idempotent. You always receive the same result no matter how many times you apply it to a document.
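
A quick way to observe the idempotence described above is to apply the migration twice to the same document and compare the results. This is only a sketch for the web shell: <some_id> is a placeholder for the id of one of your documents, and the binding names first and second are arbitrary.

```fql
Let(
  {
    ref: Ref(Collection("firewall_rules"), "<some_id>"),
    // The first call converts a string ipRange to an array, if needed.
    first: Call("migrate_firewall_rule", Var("ref")),
    // The second call should find an array and leave the document alone.
    second: Call("migrate_firewall_rule", Var("ref"))
  },
  Equals(
    Select(["data", "ipRange"], Var("first")),
    Select(["data", "ipRange"], Var("second"))
  )
)
```

If the function is idempotent as intended, the query should evaluate to true whether the document started with a string or an array.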

Note that this function accepts a Fauna reference as a parameter, not a string id. This function is invoked inside Fauna from other UDFs, so using the Fauna data type simplifies both writing and calling the function.

Updating your UDFs

After you save your migration UDF, you must update the UDFs you create in the previous section. The order of operations in each function differs based on the nature of the operation.
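
The sections that follow replace each function body in the dashboard. If you prefer to stay in the web shell, a UDF body can also be replaced with Update() on the function reference. The sketch below uses the get_firewall_rule body from the Retrieve section later in this post; treat it as one possible workflow rather than a required step.

```fql
Update(
  Function("get_firewall_rule"),
  {
    body: Query(
      Lambda(
        ["id"],
        Call(
          "migrate_firewall_rule",
          Ref(Collection("firewall_rules"), Var("id"))
        )
      )
    )
  }
)
```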

Create

You have two options when modifying your create_firewall_rule UDF. You can check the input format, make the value of ipRange an array if it is a string, and then create a document with the new info. However, this is awkward for large documents with nested objects, and duplicates the functionality you create with migrate_firewall_rule.
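
For comparison, the first option might look roughly like the following sketch. It is not the approach used in this series, and it assumes that ipRange is always present at the top level of data.

```fql
// Sketch of the "check the input first" option, shown only for contrast.
Query(
  Lambda(
    ["new_rule"],
    Let(
      {
        data: Select(["data"], Var("new_rule")),
        ipRange: Select(["ipRange"], Var("data"))
      },
      Create(
        Collection("firewall_rules"),
        If(
          IsArray(Var("ipRange")),
          Var("new_rule"),
          // Rebuild the input with ipRange wrapped in an array.
          Merge(
            Var("new_rule"),
            { data: Merge(Var("data"), { ipRange: [Var("ipRange")] }) }
          )
        )
      )
    )
  )
)
```

Even for this small document, the conversion logic now lives in two places: here and in migrate_firewall_rule.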

A simpler approach is to write the object as is and then apply the migration. Because Fauna is schemaless, you have the flexibility to perform both tasks in a single transaction, without restrictions on field types.

Replace the body of your create_firewall_rule UDF with the following FQL.

Query(
  Lambda(
    ["new_rule"],
    Let(
      {
        doc: Create(
          Collection("firewall_rules"),
          Var("new_rule")
        )
      },
      Call(
        "migrate_firewall_rule",
        Select(["ref"], Var("doc"))
      )
    )
  )
)

This function works even when the provided document is in the new format because migrate_firewall_rule is idempotent. You can verify this by calling your UDF from the web shell.

First, invoke your function passing a string for ipRange.

Call(
  "create_firewall_rule",
  {
    data: {
      action: "Deny",
      port: 25,
      ipRange: "0.0.0.0/0",
      description: "Unencrypted SMTP"
    }
  }
)

Note that the returned value is the updated version of your document with an array as the value for ipRange, even though you provided a string. You can verify that the document is stored this way on the Collections tab.

Return to the Shell (>_) tab and invoke your function again, this time with an array for ipRange:

Call(
  "create_firewall_rule",
  {
    data: {
      action: "Allow",
      port: 21,
      ipRange: ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"],
      description: "Allow FTP within private subnets"
    }
  }
)

Your function succeeds, and stores your document exactly as provided.

Retrieve

Two characteristics of your migrate_firewall_rule UDF make adding the migration to your get_firewall_rule UDF simple.

  1. The migrate_firewall_rule UDF is idempotent, so you can apply it to a document any number of times and get the same result.
  2. The migrate_firewall_rule UDF returns the document in the updated format, and all UDFs return the value of the last statement in the function.

Taken together, this means that your get_firewall_rule UDF can be written as a thin wrapper around your migrate_firewall_rule UDF, replacing the application-friendly id field with its associated Fauna-native reference.

Replace the body of your get_firewall_rule UDF with the following FQL.

Query(
  Lambda(
    ["id"],
    Call(
      "migrate_firewall_rule",
      Ref(Collection("firewall_rules"), Var("id"))
    )
  )
)

Select the Collections tab and copy the id fields from two documents: one with a string value for ipRange and one with an array value for ipRange. Return to the web shell and invoke your get_firewall_rule twice, replacing <string_id> and <array_id> with their respective values.

Call("get_firewall_rule", "<string_id>")
Call("get_firewall_rule", "<array_id>")

Note again that each function call returns a document with an array value for ipRange. Return to the Collections tab and verify that both documents now have array values for ipRange.

Update

Updates in migrations work similarly to creates. First, make the provided changes to your document, then call the migration function to ensure the document is in the correct format.

Replace the body of your update_firewall_rule UDF with the following FQL.

Query(
  Lambda(
    ["id", "new_rule"],
    Let(
      {
        ref: Ref(Collection("firewall_rules"), Var("id")),
        doc: Update(
          Var("ref"),
          Var("new_rule")
        )
      },
      Call("migrate_firewall_rule", Var("ref"))
    )
  )
)

Select the Collections tab and copy the id from a document with a string value for ipRange. Return to the web shell and invoke your update_firewall_rule function, replacing <some_id> with the value you copied.

Call(
  "update_firewall_rule",
  [
    "<some_id>",
    { data: { description: null } }
  ]
)

This query removes the description field from the document while the migration UDF updates the value of ipRange to be an array.

Delete

Deletes are different from the other three operations. In order to migrate the data in a document during a delete, you must first apply the migration and then delete the document.

Replace the body of your delete_firewall_rule UDF with the following FQL.

Query(
  Lambda(
    ["id"],
    Let(
      {
        ref: Ref(Collection("firewall_rules"), Var("id")),
        doc: Call("migrate_firewall_rule", Var("ref"))
      },
      Delete(Var("ref"))
    )
  )
)

Select the Collections tab and copy the id fields from two documents: one with a string value for ipRange and one with an array value for ipRange. Return to the web shell and invoke your delete_firewall_rule twice, replacing <string_id> and <array_id> with their respective values.

Call("delete_firewall_rule", "<string_id>")
Call("delete_firewall_rule", "<array_id>")

Verify that both function calls return the document as it existed at the time of deletion with array values for ipRange.

Why should you migrate a document if you’re deleting it anyway? If you do not access your database via GraphQL, a migration is strictly optional. However, it is considered a best practice to modify before delete since it provides a consistent return object shape and better supports temporality. Migrating delete functionality is required, however, if you access your database via GraphQL. The next post in this series addresses the reason for this requirement.

Confirming zero defects

In this example, you overwrite the existing field of a document with a new value rather than create a new field. How do you compare the new value to the previous value and ensure correctness in your migration?

Fauna provides temporality features that enable you to compare the history of a document at different points in time. Fauna retains thirty days of history data on your collections by default. This value can be increased or decreased both when a collection is created and at a later point in time.
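
The retention period is stored in the history_days field of the collection's own document. As a minimal sketch, the following changes it for the existing firewall_rules collection; the value 60 is an arbitrary example, not a recommendation.

```fql
Update(
  Collection("firewall_rules"),
  { history_days: 60 }
)
```

The same field can be supplied in the parameter object passed to CreateCollection() when a collection is first created.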

The following FQL uses temporality to compare the values of ipRange in the same document at two points in time: the time of the last update, and one microsecond prior. Fauna timestamps count microseconds, so subtracting 1 from ts steps back by one microsecond. It checks that the current value is an array, the previous value is a string, and the current value is equal to an array with one element, the previous value.

Note that this particular function only handles the happy path of a simple migration. It assumes that the tested document exists and was migrated in the last update without a change to the value of ipRange, only its format.

Create a UDF named “validate_migration” with the following FQL code as the body.

Query(
  Lambda(
    ["ref"],
    Let(
      {
        new_doc: Get(Var("ref")),
        new_ts: Select(["ts"], Var("new_doc")),
        old_ts: Subtract(Var("new_ts"), 1),
        old_doc: At(Var("old_ts"), Get(Var("ref"))),
        new_ipRange: Select(["data", "ipRange"], Var("new_doc")),
        old_ipRange: Select(["data", "ipRange"], Var("old_doc"))
      },
      And(
        IsArray(Var("new_ipRange")),
        IsString(Var("old_ipRange")),
        Equals(
          Var("new_ipRange"),
          [Var("old_ipRange")]
        )
      )
    )
  )
)
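
Because this function only handles the happy path, a caller may want to confirm that the document exists now, and that an earlier version existed, before invoking it. The following web shell query is one possible guard, sketched under the assumption that returning false is acceptable for documents without a previous version; <some_id> is a placeholder and previous_ts is an arbitrary binding name.

```fql
Let(
  { ref: Ref(Collection("firewall_rules"), "<some_id>") },
  If(
    Exists(Var("ref")),
    Let(
      // One timestamp unit before the document's latest change.
      { previous_ts: Subtract(Select(["ts"], Get(Var("ref"))), 1) },
      If(
        // Only validate when a version existed at that earlier time.
        Exists(Var("ref"), Var("previous_ts")),
        Call("validate_migration", Var("ref")),
        false
      )
    ),
    false
  )
)
```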

Select the Collections tab and copy the id fields from two documents: one that you migrated from a string ipRange and one you created with an array ipRange. Return to the web shell and invoke your validate_migration twice, replacing <string_id> and <array_id> with their respective values.

Call(
  "validate_migration",
  Ref(Collection("firewall_rules"), "<string_id>")
)
Call(
  "validate_migration",
  Ref(Collection("firewall_rules"), "<array_id>")
)

Verify that calling with <string_id> returns true and calling with <array_id> returns false. You can also manually modify the value of a document and note its impact on the result of the validate_migration function.
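
To see which documents have not yet been migrated, you could list the references whose ipRange is still a string. This sketch reads documents directly with Get() rather than through get_firewall_rule, because calling the UDF would migrate them as a side effect. The page size of 100 is an assumption; larger collections need additional pages.

```fql
Filter(
  Paginate(Documents(Collection("firewall_rules")), { size: 100 }),
  Lambda(
    "ref",
    // The null default keeps Select from failing when ipRange is absent.
    IsString(Select(["data", "ipRange"], Get(Var("ref")), null))
  )
)
```

An empty data array in the result suggests that every document on that page already stores ipRange in the new format.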

Removing deprecated fields

Instead of migrating the value of ipRange, you could choose to create a new field ipRangeList and leave the existing field unmodified. In this case, the next step is to update every document that has a value for ipRange and set that field to null, which removes it.

This doesn’t apply to the current migration because you overwrite the existing field. More information on this topic is provided in the final post in this series.
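
For the hypothetical ipRangeList variant only, removing the deprecated field might look like the following sketch. It assumes every document already has a correct ipRangeList value, and it processes a single page of up to 100 documents. Do not run it against the collection in this post, where ipRange is the migrated field.

```fql
Map(
  Paginate(Documents(Collection("firewall_rules")), { size: 100 }),
  Lambda(
    "ref",
    // Setting a field to null removes it from the document.
    Update(Var("ref"), { data: { ipRange: null } })
  )
)
```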

Security

Note that without using the migrate_firewall_rule UDF your retrieve function would need permission to write to your collection. This violates the separation of concerns and least privilege principles. By offloading migration responsibilities to a separate UDF, your business logic UDFs only need permission to call the migrate_firewall_rule.
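
As a rough sketch of that arrangement, a role could grant only the call privilege on the migration function and then be assigned to a business logic UDF. The role name call_migrate_firewall_rule is a placeholder chosen for this example, and the two statements are meant to be run as separate queries.

```fql
// Hypothetical role that may only call the migration UDF.
CreateRole({
  name: "call_migrate_firewall_rule",
  privileges: [
    {
      resource: Function("migrate_firewall_rule"),
      actions: { call: true }
    }
  ]
})

// Run separately: have get_firewall_rule execute with that role.
Update(
  Function("get_firewall_rule"),
  { role: Role("call_migrate_firewall_rule") }
)
```

For this to work, migrate_firewall_rule itself would need its own role with read and write privileges on the firewall_rules collection.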

Conclusion

UDFs are a powerful tool for decoupling your business logic from your migration logic. UDFs enable a number of additional techniques, including fine-grained access to resources and comparing different versions of documents at call time.

The next post in this series shows you how to perform migrations when you access Fauna via the Fauna GraphQL API. If you are not a GraphQL user, the final post in the series provides code and considerations for migrating your data and indexes.

Originally published at https://fauna.com.
