SQLite: Keeping Your Tables Organised
Keeping your data organised is like keeping a tidy workplace. It is easier to know where things are and frees more space either for storage or for work.
Currently, the database world is divided into SQL database libraries and NoSQL libraries with their own advantages and disadvantages. We’ll be focusing on the SQL counterpart of databases in this story.
We are going to see how to create tables using the SQLite library and how to minimise the insertion of wrong data by providing additional information at the moment of constructing a table and after it is created. Before getting into it, let me try and give you an overview of what this SQLite library is.
SQLite
SQLite is a relational database based on the SQL language. Being relational means that within the database there are tables that may relate to each other through common attributes. These tables are two dimensional arrays of information having rows or records and columns or attributes.
You can see below an example table with three records and three attributes containing different types of data. This demonstration is facilitated by DB Browser for SQLite application, from which I have no affiliation.
SQL is a standardised language [1]. However, it is up to the developers to comply or not to the standards to a certain extent depending on how it is meant to be applied. SQLite is one such implementation and, unlike MySQL or PostgreSQL, is a simpler library created to be used locally and without the need of external servers [2].
SQLite’s Serverless feature enables the existence of other features like Self-contained meaning that it doesn’t need external resources making it usable in any environment and Zero-configuration because there is no server to configurate [3].
SQLite is also Transactional implying that all changes either take place completely or don’t [3]. Transactions must abide by a set of rules called ACID.
A stands for Atomic meaning that changes happen as a whole and cannot be separated, C stands for Consistent suggesting that the database state must remain stable after the transactions are commited or rolled back, I stands for Isolated indicating that transactions are only visible in the current session and D stands for Durable meaning that changes must be permanent [3].
I am going to let my nerd self take over and create a database to containing race and subrace information from Dungeons and Dragons since it is somewhat different from traditional examples and is something I have fun playing with my friends.
Let’s start by creating the database using these SQLite commands in the command prompt by changing the directory into the file we want the database to be and then create the database by activating the sqlite3 library and choosing a name for it:
cd C:\Projects\SQLite
C:\Projects\SQLite>sqlite3 DnD.db
SQLite is flexible with the extensions you can use for your database file. You can use .db
, .sqlite
or .sqlite3
without complaints. We can now create our first table.
Creating a table
Tables are the structures that hold on to the data that is inserted into SQL-like databases. Bellow is a generic representaion of the code to create a table:
We start by typing the command CREATE TABLE
followed by the name we choose for the table in table_name
. Optional code is enclosed in square brackets. The IF NOT EXISTS
command is used so that SQLite may check for any existing table with the name provided before creating a new one.
Inside the round brackets we provide attribute or column names, their data types and constraints at an attribute or table level. Despite being optional, data types and constraints should be used to help the user understand what kind of information is stored in each attribute.
We’ll get into what data types and constraints are later. For now understand that they exist and they are defined at the moment of creating a table.
The semi-colon tells SQLite that our piece of code is complete and it can be processed. Let’s create a new table that will hold information for different races called races.
To give you some context about D&D, race_name is the name of the a given race like Elf, Gnome or Dwarf (I’ll be using slightly more exotic examples here), ability_score_increase contains the ability scores which a race is naturally better at, for instance, an Aasimar is more charismatic, so its Charisma will go up by two points.
The attribute size is how big a race is on average, generally they are either small or medium sized. speed is the maximum distance a race can run in a round of combat, languages are the idioms a race speaks and age is the average age a race lives.
race_id is the number I decide to attribute to the race as an integer identity, this will be important later when we talk about constraints.
I know it is a lot to take in for someone that has never heard or played this tabletop game, but you don’t really need to understand what the attributes mean, what is important are their data types and constraints. To populate the table let’s write an INSERT
statement with three records:
You can see we pass the name of the table as well as its attributes where we are inserting values followed by the actual values we wish to insert. Here is the data stored in races:
Writing this much text in a command prompt can be annoying because if we make a typo the command prompt might not let us go back and correct it. SQLite offers the possibility of using a text file that you provide to it using the .read
command.
We can write the code to create the table, insert the values in the text file and check for any typos. If everything is correct, we save the file with any name we’d like and type it into the command prompt like .read filename
. Imagine we have the CREATE TABLE
and INSERT
statements above separated by a blank line in a text file named “races.txt”:
sqlite>.read races.txt
will automatically create the table and populate it.
Knowing how to create a table and insert values in it, we can go ahead and understand what data types and constraints are and how they help us identify and mantain what information is in a table. Let’s start with data types.
Data types
SQLite supports five main data types, which are defined at the moment of creating a new table being TEXT
, INTEGER
, REAL
, GLOB
and NULL
. You can see above that the races table was created to contain TEXT
and INTEGER
data types.
Generally speaking, TEXT
represents text-like information inveloped in quotes or strings. In the races table, race_name attribute is set to store TEXT
data and the strings “Aasimar”, “Kalashtar” and “Genasi” populate that attribute.
INTEGER
represents whole numbers either positive or negative with sizes from 1 to 8 bytes. speed is a good example of where to use integers because the number of feet a character can run in a combat round can be represented as an integer number.
REAL
is a data type that contains numbers that are either more than 8 bytes long or decimal and exponential numbers, for instance 1.5 or 2⁴.
We use NULL
to cover information that is missing or is unkown. Let’s say I couldn’t find which languages a Genasi speeks, I can keep that value as NULL for the time being.
Lastly, there is BLOB
. It stands for ‘Binary Large Object’ and can store any kind of data. Neither REAL
nor BLOB
data types are used in this dataset.
Datetime data isn’t supported by SQLite, but its information could still be stored as TEXT
, REAL
or INTEGER
provided the correct transformations. For instance, timestamp data can be stored as INTEGER
.
Beware that SQLite, unlike other SQL libraries, accepts different data to be stored in attributes whose data type is predefined. For instance, defining an attribute as containing TEXT
data, won’t prevent SQLite from accepting integers or any other type of data.
Even so, defining a data type gives the user a strong hint in what kind of data should be stored in each attribute allowing it to keep tables organised.
Constraints
Constraints help set boundaries in attributes, default values or relationships between tables. Starting with the latter, there are two constraints that play an important role in what is called normalisation. Their names are PRIMARY KEY
and FOREIGN KEY
.
Before getting into more detail about these two constraints, let’s create a new table that contains a relationship to races called subraces to understand what normalisation is.
As you can see, subraces contains a race_id attribute that accepts INTEGER
values and is referenced to the race_id column in the table races through line FOREIGN KEY (race_id) REFERENCES races (race_id)
.
By creating this reference, we can store subgroup information while repeating information as little as possible within the database. Let’s take a look at the values in the subraces table:
Avoiding the normalisation, we would have to create the table like races_subraces with the following code:
And the following values:
It may not look like much, after all there aren’t too many attributes and records in it, but you can see that there is a reasonable amount of values that repeat themselves, namely in the age, languages, speed, size and race_name attributes. Imagine the amount of space saved for big databases with thousands of records and attributes that are interconnected.
So, in a scheme, the relationship between races and subraces is:
But we still haven’t talked about what ON DELETE CASCADE
and ON UPDATE NO ACTION
mean.
Basically the first means that if I delete the “Genasi” race (of race_id = 3) from races, for instance, the subraces with race_id of 3 in subraces will automatically be deleted by SQLite.
The latter is necessary because if I changed any record in races that was connected to a foreign key in the subraces table, for instance, updating the languages attribute for “Genasi”, subraces would become completely empty.
It is also possible to define various attributes as PRIMARY KEY
to create more complex relationships between more than two tables:
Another constraint that you may have noticed is NOT NULL
. This constraint forces attributes to not accept NULL values. Identity columns are a good example of when to use NOT NULL
. However, it can be used in every attribute whose values the user decides shouldn’t be unknown nor missing.
However, just because an attribute of a record doesn’t have a value, does’t mean NULL should fill that gap.
Note the string “None” in subrace_name column of races_subraces where race_name is “Kalashtar”. NULL wouldn’t make sense here because I know Kalashtars don’t have subraces. It is neither missing nor unknown information.
If the user decides that an attribute should contain distinct values for every record, it should use the constraint UNIQUE
. This way, SQLite will raise an error when inserting a new row with a duplicate value in a column with the UNIQUE
constraint.
Note that the races_name column in the races table has this constraint, which makes sense given that different races should have unique names for better distinction.
UNIQUE
can be defined at a column level (on the top) or at a table level (on the bottom):
Moving on to the next constraint, CHECK
allows the user to define an expression that mustn’t be violated by an inserting value.
Picking the races table, I want to make sure that the values inserted into the speed attribute are always two digits long since there is no race with a speed greater or smaller. Let’s drop the existing races table and create a new one with the CHECK
constraint:
Now let’s try to insert a row whose speed attribute is 300:
An error will be raised. That’s because the CHECK
constraint was violated and that new row of values wasn’t inserted into the table.
Similar to UNIQUE
, CHECK
can be defined at a column (top) or table (bottom) level:
There is one last constraint: DEFAULT
. There is one language that every race understands, which is “Common”. We can add the constraint DEFAULT
, set it to “Common” and insert a row without specifying the languages attribute.
You can see that there is no value passed for the language attribute. However, by defining “Common” as a default language, we get the following result:
These are the constraints accepted by SQLite. As you can see, they have the potential to help the user to create interactions between simpler, easy to read tables and to control to a certain extent the data before being inserted into the tables. Let’s now go outside the tables and see how we can use the CREATE TRIGGER
command to create boundaries.
Triggers
Triggers execute user defined actions automatically whenever a row is inserted, deleted or updated on the table that is associated with it. They enable the user to perform more complex control of the values that go in and out of tables. The general syntax to create a trigger is as follows:
First, we create the trigger and provide a name for it. Then, we decide when is the TRIGGER
going to fire. It may be before or after either of the three events INSERT
, UPDATE
or DELETE
.
We need to specify a table on which this trigger will act followed by an optional trigger condition after the WHEN
clause. Finally we specify the trigger logic between BEGIN
and END
.
Depending on the event, accessing an attribute can be done using the OLD
and NEW
references in the form of OLD.column_name
and NEW.column_name
.
For the event of INSERT
, we use NEW
beause a new row is being inserted. For UPDATE
, we use both NEW
and OLD
because we update an old row or value by a new one. For the DELETE
action, we use OLD
because we delete an old row or value.
Going back to the subraces table, we can see in the ability_score_increase attibute that its values are strings that can be emulated by “___ +1” with three underscore wildcards.
Wildcards are replacing caracters. There are two wildcards: the underscore “_” which replaces only one character and the percent “%” which replaces any number of characters. For example, if we are looking for tables using the .table
command, when typing .tables “_aces”
, SQLite will find us the table races, but typing .table “%aces”
will find us the tables races, subraces and races_subraces.
There are six ability scores: Strength, Dexterity, Constitution, Inteligence, Wisdom and Charisma. Three physical stats and three psicological stats that help differentiate characters between races and subraces. These scores can be shortened to Str, Dex, Con, Int, Wis and Cha.
The trigger we are going to create will make use of the string with the three underscore wildcards to focus on the shortened names. So, if we were to pass “Charisma +1” instead of “Cha +1" the trigger will abort and raise an error saying “Ability score is not shortened.”.
If we don’t need the trigger anymore, there are two ways to delete it. Either by dropping all the tables it is associated with or by dropping it directly:
With triggers, It becomes easier to minimise mistakes. You can get creative and find other ways of using them for other to control the flux of data. However, nothing is fail proof and there is yet another way of dealing with greater mistakes at a database level.
Backing up databases
Creating a backup should be encouraged everytime one manipulates a database because with the wrong statements one can easily delete entire rows or replace entire attributes that would take a great deal of time to correct.
To safeguard any unwanted changes in our database we have the option to backup and restore it using .backup backupfilename
and .restore backupfilename
. backupfilename
can have any name other than your database name.
Closing thoughts
We discussed about what SQLite is, how to create tables and how to keep a correct storage of data.
We started by giving some background mentioning SQL as a standard. We also talked about the benefits of not needing to configure a server to use SQLite and the rules it is based on when making a transaction.
We proceeded to talk about the creation of tables, how to define the attributes and set data types and constraints to them.
We also discussed about the general data types SQLite uses to store information. We saw that there isn’t a way for SQLite to store datetime data but that can be easily overcome using the general data types.
Then we moved on to constraints. They add functionality to tables either by normalisation with keys that interconnect tables, setting default values, checking inserting values with an expression or not allowing NULL values.
We also discussed how triggers help mantaining a table by allowing a deeper auditing of what is to be changed in it. They can be used before or after the moment of inserting, updating and deleting whatever rows or values we want to.
Finally we briefly mentioned how to back up and restore a database.
If you are interested in knowing more, I recommend checking this website for a more thorough understanding of SQLite.
Thanks for reading and I hope you learned something with this story!
References
[1] https://en.wikipedia.org/wiki/SQL