15 SQL Terms That Every Data Analyst Should know

Hassan Faheem
ILLUMINATION
Published in
7 min readNov 30, 2022
Image made in Canva by Author — Image created behind a digital wall; © the author assumes responsibility for the provenance and copyright.

SQL, or Structured Query Language, is a database management programming language. It’s a language that’s been around since the 70s, and it’s the most common way to interact with databases like MySQL and Postgres.

While knowing how to code in SQL is not required, it can be beneficial if you are interested in data analysis and want to understand better how databases work.

This article post will look at some basic terms and concepts used when working with SQL databases.

1. Database

A database is a collection of data organized for easy retrieval, storage, and analysis. Databases can be structured or unstructured, relational or non-relational. Databases store and retrieve data from the internet or any other source. The term database has become synonymous with the concept of a computer file, especially since databases are now often stored on computers that are separate from the data they contain.

2. SQL

SQL is a programming language used to interact with databases. It’s very powerful and can perform complex tasks like querying data, performing calculations and analytics, designing tables and views, managing metadata, and much more.

SQL is especially useful for data analysts who need to manipulate large amounts of data to draw conclusions or make predictions.

3. Relational database

A relational database is a database that uses the relational model to organize data. Relational databases are the most commonly used and often found in enterprise applications and business-critical systems.

The relational model defines how data is stored and accessed and divides all information into tables. Each table contains rows of data related to a common field, such as name or address. In addition, each row has a unique primary key field that identifies each record uniquely within a table. The primary key can be based on several columns: integer, character string, date/time stamp, or binary image.

4. Database Management System (DBMS)

A database management system is a software program that allows users to create, store, update, and retrieve data from a database. It is the software that manages data in a database. It organizes and stores information in tables with rows and columns so authorized users, such as administrators or application developers, can quickly access it.

The DBMS also provides an interface for users to access the data stored within these tables via queries or requests.

5. Table

A table is a structured collection of data. A table consists of rows and columns. Each row represents an occurrence or record in the table, and each column has a unique name and stores different data types. Tables are the basic storage units in a database, so they should be defined before you can use them with any other SQL command.

A table contains data about one or more entities. The entities can be people or other living beings, organizations, inanimate objects, concepts, etc. The table also contains information about their relationships (for example, who is married to whom). This information is called metadata.

6. Column

A column is a grouping of data that has the same values. The values are also known as fields. A table can have many columns, and each column has a name. The data in all of the columns in a table must be related to each other. For example, if you wanted to store information about your customers, you could put their names, email addresses, phone numbers, and orders into four columns (rows).

7. Row

A row is a single record in a database table. It contains a collection of columns and can store information about one or more entities. For example, a single row of data would include the name, age, address, and other information about an individual.

8. Query

A Query is a simple statement that retrieves data from a database. It’s the most basic and standard operation in SQL and can be used for any number of things. For example, queries can be used to find information about one or more specific things like a person’s name or phone number, or they can be used to discover how many people reside in a particular city.

Query syntax consists of three parts: a SELECT statement, FROM clause, and a WHERE clause. The SELECT statement specifies what data you want to retrieve from your database; the FROM clause tells which table or tables you will use for this query, and the WHERE clause specifies what requirements must be completed to get the desired results.

9. Primary key

A primary key is a term for a column or group of columns that allows each row in a database table to be uniquely identified. A primary key can be either an integer or a character string. Every table needs one primary key to identify all of its rows uniquely.

Primary keys are often used as foreign keys in other tables, so having one is crucial for managing referential integrity. You can also use them for indexing purposes to improve the performance of queries and other operations on the table.

10. Foreign key

A foreign key is a term for a column or set of columns in a database table that references a unique identifier from another table. The referenced unique identifier is the primary key used to enforce referential integrity. That means you can’t enter data into one database table without it coming from another — a constraint that prevents data from being entered more than once, which leads to duplicate records (and incorrect data). Foreign keys can be either single-column or multiple-column; they must contain exactly as many columns as defined in the referenced primary key.

11. NULL value

A NULL value is a missing value in a database. If you want to know the variable’s value, you can’t use a variable that has a NULL value because it doesn’t have a value. You’ll need to use the ISNULL() function instead.

You can get a NULL value in two ways: if you don’t include it when creating your table or through an error. The most common error is using something other than an integer when creating a column or an error during the process of trying to insert data into your table.

12. Literal values

In SQL, a literal value is a value that has been “hard-coded” into the query. A literal value can be a string, number, or Boolean value. There are three types of literals in SQL: character strings, numeric values, and Boolean values.

Character strings: A character string is a set of characters wrapped by single or double quotation marks. The most common example of a character string is a name, address, or any other text you might want to store.

Numeric values: A numeric value is a number without decimals. They can be either positive or negative and include numbers with decimal points (e.g., 3.14159). You can also use scientific notation in your literal values by writing them as exponents (e.g., 1e-5).

Boolean values: A Boolean value can be either TRUE or FALSE. To create a boolean literal, write TRUE or FALSE without any quotation marks around it.

13. Indexes

An index is a separate structure that is stored along with the data. It contains a list of values and their locations in the table’s rows. An index can be used to select data from a table or to sort it.

A unique index ensures no duplicates exist in the column(s) it covers. This can be helpful when there are many rows in your table, and you want to find records based on some criteria (for example, “find me all customers who live in New York”). A non-unique index allows duplicates: if two or more rows have the same value for one or more indexed columns, they will appear as candidates when you try to retrieve them through an SQL query.

An index can be created on one or more columns within your SQL database tables; this makes it possible for these columns’ values not just as search criteria but also as sorting methods. It does this by storing pointers in each row’s location instead of having their actual values take up space on disk drives like most other pieces of information (like text).

14. Expressions

An expression is a set of values, variables, operators, and functions that can be evaluated to produce a value.

Expressions are used in SQL queries to perform calculations or comparisons of data. Expressions can be built up using the following elements:

  • Values: Numbers and strings.
  • Variables: Symbols representing column names.
  • Operators: Mathematical operators such as + (addition), — (subtraction), / (division), * (multiplication), etc.
  • Functions: Built-in functions such as AVG() and SUM().

15. Aliases

An alias is a name you can use to refer to a column or table. Aliases are useful for making your queries more readable, especially when working with long, repetitive column names.

To create an alias, type it into the SELECT statement after the column name. For example:

SELECT [column name] AS [alias name] FROM [table name];

Conclusion

Learning SQL is a great way to boost your career as a data analyst, and it’s also fun to spend your free time. But, of course, the more you practice, the better you will get. And once you’ve got the basics down, so many other things can be done with SQL.

In this guide, we’ve covered some of the most important terms related to SQL: what they mean and how they’re used in practice. I hope this guide helps you on your journey toward becoming an SQL expert!

--

--

Hassan Faheem
ILLUMINATION

Data Scientist in the making | Masters Degree in Data Science from Heriot Watt University