Data modelling: comparing Realm and SQLite

Kenneth Geisshirt
Realm Blog

--

Introduction

In this blog post, we will compare data modelling of two popular databases used for mobile applications: SQLite and Realm.

SQLite is preinstalled on any Android and iOS device, while Realm requires a deliberate decision by the app developer to include it. We will use SQLite in its raw form i.e., without any programming language other than SQL. It means that the object-relational impedance mismatch is not considered. For SQLite, many Object-Relational Mappers exist with ROOM (Android) and Core Data (iOS) as highly popular examples. For Realm, the JavaScript SDK is selected, as it will make it possible for you to run the examples on your computer without going through setting up a mobile app.

We will model a group of people and their life as they get married, have children, are divorced and die. The source code for both SQLite and Realm is found at https://github.com/kneth/data-model-compare.

The group of people we model are called Parent objects (or rows). They can be married to another Parent object, and they can have zero, one or more children. The children are modelled as a Child object (or row).

Two data models

In SQL, both the Parent and Child can be modelled using tables which are straightforward.

CREATE TABLE Parent (
name VARCHAR PRIMARY KEY NOT NULL,
city VARCHAR NOT NULL,
marriedTo VARCHAR REFERENCES Parent(name) NULL
);

CREATE TABLE Child (
name VARCHAR PRIMARY KEY NOT NULL,
age INTEGER NOT NULL
);

The most advanced part is the self-referential column in Parent. The tricky part is the modelling of the relationship between Parent and Child. We will need to introduce an auxiliary table, Parent_Child, to capture who is the parent to which child. Such a join-table is common in SQL, and by using foreign keys we can link the two tables together.

CREATE TABLE Parent_Child (
parent_name VARCHAR NOT NULL,
child_name VARCHAR NOT NULL,
CONSTRAINT fk_parent
FOREIGN KEY (parent_name)
REFERENCES Parent(name)
ON DELETE CASCADE,
CONSTRAINT fk_child
FOREIGN KEY (child_name)
REFERENCES Child(name)
ON DELETE CASCADE,
UNIQUE (parent_name, child_name)
);

On the other hand, Realm is an object database and objects can be tied together by links or references. Moreover, it is possible to have a list of links which we can use to model a Parent having multiple Child objects.

const Parent = {
name: "Parent",
primaryKey: "name",
properties: {
name: "string",
city: "string",
children: "Child[]",
marriedTo: "Parent",
}
};

const Child = {
name: "Child",
properties: {
name: "string",
age: "int",
parents: {
type: "linkingObjects",
objectType: "Parent",
property: "children"
},
}
};

Links in Realm are directional, and you can think of your Realm database as a set of directed graphs where the nodes are objects. In order to be able to query a link in the opposite direction, Realm introduces the notion of linkingObjects. As you can see, an auxiliary table is not required as Realm’s native storage is based on objects. It is important to note that linkingObjects properties are updated automatically by the Realm database.

A short-coming of both data models is that it is technically possible for a child to have more than two parents. To ensure that a child only has one or two parents will be left to the application’s logic.

Getting married

A database is only interesting if it contains some data. Initially, we will add four persons and they form two married couples. First, we have to create the rows or objects and thereafter update who is married to whom. Neither data model can handle the constraint that marriage is a symmetric property, and we leave it to the application developer to ensure that data is consistent.

For SQL, we insert new rows and update the relevant column.

BEGIN TRANSACTION;

/* Alice and Bob are married */
INSERT INTO Parent ("name", "city") VALUES ("Alice", "Stockholm");
INSERT INTO Parent ("name", "city") VALUES ("Bob", "Stockholm");
UPDATE Parent SET marriedTo = "Bob" WHERE name = "Alice";
UPDATE Parent SET marriedTo = "Alice" WHERE name = "Bob";

/* Charlie and Dorothy are married */
INSERT INTO Parent ("name", "city") VALUES ("Charlie", "Helsinki");
INSERT INTO Parent ("name", "city") VALUES ("Dorothy", "Helsinki");
UPDATE Parent SET marriedTo = "Dorothy" WHERE name = "Charlie";
UPDATE Parent SET marriedTo = "Charlie" WHERE name = "Dorothy";

COMMIT TRANSACTION;

Similar operations are needed for Realm.

realm.write(() => {
// Alice and Bob are married
const alice = realm.create(Parent.name, { name: "Alice", city: "Stockholm" });
const bob = realm.create(Parent.name, { name: "Bob", city: "Stockholm "});
alice.marriedTo = bob;
bob.marriedTo = alice;

// Charlie and Dorothy are married
const charlie = realm.create(Parent.name, { name: "Charlie", city: "Helsinki" });
const dorothy = realm.create(Parent.name, { name: "Dorothy", city: "Helsinki "});
charlie.marriedTo = dorothy;
dorothy.marriedTo = charlie;
});

One major difference between SQL and Realm is that Realm operates with managed objects. For example, the object alice is a representation of data in the database, and the object behaves as a JavaScript object (if you are using another Realm SDK, the managed object will behave like a native object for the specific programming language). When you change the object, the database is changed. Later we will see that the implication is that you do not have to populate JavaScript objects from query results.

Kids arrive

The two couples continue their lives, and children are born. In database terminology, birth means that rows or objects are created. While creating new rows is almost trivial, keeping track of parenthood is not trivial in SQL. We need to update the join-table in order to record the relationship between parents and children. Until we insert rows into the join-table, we have orphaned children (the atomicity of the transaction will prevent it from happening).

BEGIN TRANSACTION;

/* Alice and Bob */
INSERT INTO Child (name, age) VALUES ("Alma", 14);
INSERT INTO Child (name, age) VALUES ("Bill", 12);
INSERT INTO Parent_Child (parent_name, child_name) VALUES ("Alice", "Alma");
INSERT INTO Parent_Child (parent_name, child_name) VALUES ("Alice", "Bill");
INSERT INTO Parent_Child (parent_name, child_name) VALUES ("Bob", "Alma");
INSERT INTO Parent_Child (parent_name, child_name) VALUES ("Bob", "Bill");

/* Charlie and Dorothy */
INSERT INTO Child (name, age) VALUES ("Charlotte", 5);
INSERT INTO Child (name, age) VALUES ("Dillon", 8);
INSERT INTO Child (name, age) VALUES ("Chuck", 11);
INSERT INTO Parent_Child (parent_name, child_name) VALUES ("Charlotte", "Charlie");
INSERT INTO Parent_Child (parent_name, child_name) VALUES ("Dillon", "Charlie");
INSERT INTO Parent_Child (parent_name, child_name) VALUES ("Chuck", "Charlie");
INSERT INTO Parent_Child (parent_name, child_name) VALUES ("Charlotte", "Dorothy");
INSERT INTO Parent_Child (parent_name, child_name) VALUES ("Dillon", "Dorothy");
INSERT INTO Parent_Child (parent_name, child_name) VALUES ("Chuck", "Dorothy");

COMMIT TRANSACTION;

For Realm, the children are modelled as a list of links (or an array of references). With managed objects, we can simply append (in JavaScript, Array#push() appends a new object to an array and Realm tries to be as idiomatic as possible) references to the children property. The managed objects for the parents are found in the database by a look up by primary key (Realm#objectForPrimaryKey()). Realm will automatically update the links in the opposite direction i.e., the parents property does not need to be handled by the application.

realm.write(() => {
// Alice and Bob
const alice = realm.objectForPrimaryKey(Parent.name, "Alice");
const bob = realm.objectForPrimaryKey(Parent.name, "Bob");

const alma = realm.create(Child.name, { name: "Alma", age: 14 });
const bill = realm.create(Child.name, { name: "Bill", age: 12 });
[alma, bill].forEach(c => {
alice.children.push(c);
bob.children.push(c);
});

// Charlie and Dorothy
const charlie = realm.objectForPrimaryKey(Parent.name, "Charlie");
const dorothy = realm.objectForPrimaryKey(Parent.name, "Dorothy");

const charlotte = realm.create(Child.name, { name: "Charlotte", age: 5 });
const dillon = realm.create(Child.name, { name: "Dillon", age: 8 });
const chuck = realm.create(Child.name, { name: "Chuck", age: 11 });
[charlotte, dillon, chuck].forEach(c => {
charlie.children.push(c);
dorothy.children.push(c);
});
});

Splitting up

In the history of our two families, Alice and Bob decided to split up and divorce. For both SQL and Realm, this means that the marriedTo column/property has to be updated.

BEGIN TRANSACTION;

UPDATE Parent SET marriedTo = NULL WHERE name = "Alice";
UPDATE Parent SET marriedTo = NULL WHERE name = "Bob";

COMMIT TRANSACTION;

For Realm, we first find the object, and then update it.

realm.write(() => {
const alice = realm.objectForPrimaryKey(Parent.name, "Alice");
const bob = realm.objectForPrimaryKey(Parent.name, "Bob");

alice.marriedTo = null;
bob.marriedTo = null;
});

Meeting a new partner

Eventually, Alice will meet a new partner, Eric, and get married. At the same time, Alice moves to Eric’s hometown. Moreover, together they have a child. In SQL, the marriage and the move can be done in a single operation, UPDATE.

BEGIN TRANSACTION;

INSERT INTO Parent (name, city, marriedTo) VALUES ("Eric", "Oslo", "Alice");
UPDATE Parent SET city = "Oslo", marriedTo = "Eric" WHERE name = "Alice";

INSERT INTO Child (name, age) VALUES ("Ellen", 1);
INSERT INTO Parent_Child (parent_name, child_name) VALUES ("Alice", "Ellen");
INSERT INTO Parent_Child (parent_name, child_name) VALUES ("Eric", "Ellen");

COMMIT TRANSACTION;

Alice’s move requires two updates in Realm, but otherwise it is not much different from SQL.

realm.write(() => {
const alice = realm.objectForPrimaryKey(Parent.name, "Alice");

const eric = realm.create(Parent.name, { name: "Eric", city: "Oslo", marriedTo: alice });
alice.marriedTo = eric;
alice.city = "Oslo";

const ellen = realm.create(Child.name, { name: "Ellen", age: 1 });
alice.children.push(ellen);
eric.children.push(ellen);
});

As time goes by

In our data model we track the age of the children. As time goes by, we need to update it. Adding 15 years to the age is simple in SQL.

BEGIN TRANSACTION;

UPDATE Child SET age = age + 15;

COMMIT TRANSACTION;

With Realm, we will query the database for all Child objects, and iterate through them. Fortunately, Realm is lazy-loaded so we will only load the objects we access (in this case we access them all).

realm.write(() => {
realm.objects(Child.name).forEach(c => c.age += 15 );
});

We don’t live forever

In our data model, death is modeled by deleting rows or objects. In our family history, a sad moment is when Alice dies. In SQL, our model does not guarantee integrity of the database, and we need to update Eric is no longer married.

BEGIN TRANSACTION;

DELETE FROM Parent WHERE name = "Alice";
UPDATE Parent SET marriedTo = NULL WHERE marriedTo = "Alice";
DELETE FROM Parent_Child WHERE parent_name = "Alice"; /* enforce foreign key constraints is off by default */

COMMIT TRANSACTION;

Deleting an object in Realm will lead to an update of links where the object is associated with. The implication is that we only have to delete the object representing Alice, and the marriedTo property of the object representing Eric will be set to null.

realm.write(() => {
realm.delete(realm.objectForPrimaryKey(Parent.name, "Alice"));
});

Children with one parent

To end our story, we would like to know who has only one living parent. It turns out that it is complicated to query SQL. Our way is to combine a subselect and a join to find the three children (Alma, Bill, and Ellen).

SELECT Child.name, age
FROM Child
JOIN Parent_Child
ON Child.name = child_name
WHERE parent_name IN
(SELECT parent_name
FROM Parent_Child
GROUP BY child_name
HAVING COUNT(parent_name) = 1);

On the other hand, Realm can directly query it using the linkingObjects property in the data model.

realm.objects(Child.name)
.filtered("parents.@count = 1");
.forEach(c => console.log(`${c.name} (${c.age})`));

Conclusion

All databases have advantages and disadvantages. SQL’s major advantage is that it is governed by an ISO standard, and you can leverage your knowledge from one application to the next. If you have used Microsoft SQL Server, you can apply many of the same tricks to your mobile app’s SQLite database. What makes SQL hard to work with is that you need to normalize your database in order to take full advantage of it.

In the example above, we have not considered how you populate JavaScript objects from SQL queries. You can automate it by using an ORM, but many ORMs will require that you specify SQL queries or annotate classes to access the links.

Realm is an object database, and the major advantage is that it is closer to your programming language. Modern software development — also for mobile applications — is based on object-oriented programming, and working with objects makes it easier to get started.

Objects in Realm are lazy-loaded, and they are direct representations of data in the database. It often gives you lower memory consumption and faster access. Furthermore, the Realm Query Language is optimized for querying objects, as the last example clearly shows.

--

--

Kenneth Geisshirt
Realm Blog

Chemist by education, geek by Nature, lead engineer at MongoDB