Beginner’s Guide to Database Management Systems: Introduction
I am a Computer Engineering undergrad, currently in my 5th semester, and I’ve always been fascinated by DBMS. In this article, I will try to share what I have learned in the simplest way I can :)
This article is the first of many. Over the coming weeks, I will cover Database Management Systems as thoroughly and exhaustively as possible.
Let us begin with the definitions.
A database is simply an organized collection of related data, structured so that the data can be stored and retrieved efficiently and reliably. For instance, a college database will store information related to students, teachers, courses, etc.
A database management system (DBMS) is software that allows the user to manage, modify, and maintain the data stored within the database. The purpose of a DBMS is to provide users with an abstract view of the data they are dealing with. Hence, details such as where the data is stored, or how it is stored and accessed, are hidden from the user. MySQL and Oracle Database are two well-known examples of a DBMS.
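To make the idea of an abstract view concrete, here is a minimal sketch using SQLite through Python's built-in `sqlite3` module. SQLite, the `course` table, the sample course, and the `college.db` file name are all my own assumptions for illustration. The same function runs unchanged whether the data lives in memory or in a file on disk, because the DBMS hides that detail.

```python
import os
import sqlite3
import tempfile


def count_courses(conn):
    """Store one sample course and return how many courses exist.

    Nothing in this function knows where the data is physically stored.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS course (code TEXT PRIMARY KEY, title TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO course VALUES (?, ?)",
        ("CS301", "Database Systems"),  # hypothetical sample data
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM course").fetchone()[0]


# One database held entirely in memory, one stored in a file on disk.
in_memory = sqlite3.connect(":memory:")
on_disk_path = os.path.join(tempfile.mkdtemp(), "college.db")
on_disk = sqlite3.connect(on_disk_path)

memory_count = count_courses(in_memory)
disk_count = count_courses(on_disk)
print(memory_count, disk_count)

in_memory.close()
on_disk.close()
```

The calling code never opens, seeks, or parses a file itself; it only describes the data it wants.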
Types of DBMS:
We can divide the database management systems into three major categories:
- File System
- Relational Database Management Systems
- NoSQL, or Not Only SQL.
However, these three are not the only types of databases. There are others, like Network Databases, Object-Oriented Databases, etc. In practice, clients and companies often choose or build a database that is tailored to their own needs.
A File System (or file management system) is the simplest type of DBMS: it allows access to a single file or table at a time. Information is stored directly in a set of files, which are usually kept on the hard disk, and users create, update, and delete the data themselves. The File System is the oldest example of a DBMS, and hence comes with its fair share of issues:
- Difficult Access: To access a particular piece of data, a user must know in which file the data is stored. Additionally, they must also know the exact location of the file.
- Redundancy: Redundancy simply means duplication of data. If the same data is present in several files, then that data is said to be redundant. Redundancy increases the size of our database, and a user has to repeat every change in every single file that holds a copy, which is a long and tedious process. (Redundancy can sometimes improve look-up times, but that is a topic for another article.)
- Inconsistency: With redundancy comes inconsistency. The data is said to be inconsistent when copies of it do not match each other. For example, if a student changes their phone number in a University DBMS and one of the files is not updated, then inconsistency is introduced.
- Lack of concurrency: A file-based DBMS has no built-in way to coordinate several users, so typically only one user can safely work with a file at a time. This leads to conflicts and long wait times in large-scale systems.
These problems, along with weak security and the lack of built-in backups, motivated a shift to more advanced and secure database management systems.
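The redundancy and inconsistency problems described above can be sketched in a few lines of Python. This is only an illustration: the two CSV files (`admissions.csv` and `library.csv`), the helper functions, and the phone numbers are hypothetical names and sample data I made up for the example.

```python
import csv
import tempfile
from pathlib import Path

FIELDS = ["roll_no", "name", "phone"]


def write_rows(path, rows):
    """Overwrite a CSV file with the given student rows."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


def read_phone(path, roll_no):
    """Return the phone number stored for a student in one file."""
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            if row["roll_no"] == str(roll_no):
                return row["phone"]
    return None


def update_phone(path, roll_no, new_phone):
    """Change a student's phone number in ONE file only."""
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    for row in rows:
        if row["roll_no"] == str(roll_no):
            row["phone"] = new_phone
    write_rows(path, rows)


workdir = Path(tempfile.mkdtemp())
admissions = workdir / "admissions.csv"
library = workdir / "library.csv"

# Redundancy: the same student record is stored in two separate files.
student = {"roll_no": "3", "name": "Harry Potter", "phone": "555-0100"}
write_rows(admissions, [student])
write_rows(library, [student])

# The number is updated in one file, but the other copy is forgotten.
update_phone(admissions, 3, "555-0199")

admissions_phone = read_phone(admissions, 3)
library_phone = read_phone(library, 3)
print(admissions_phone, library_phone)  # 555-0199 555-0100
```

The two files now disagree about the same student, and nothing in the file system notices or prevents it.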
In a relational database, the data is stored and accessed in the form of relations, or tables. For example, consider the student table below:
| Roll No. | Name | Age | Grade |
| --- | --- | --- | --- |
| 1 | Bruce Wayne | 26 | A |
| 2 | Gandalf the Grey | 24,000 | A+ |
| 3 | Harry Potter | 17 | A+ |
| 4 | James T. Kirk | 49 | A |
The student entity here has four attributes: Roll No., Name, Age, and Grade. A relational DBMS also provides us with a query language to create, access, modify, and delete this data. It is called the Structured Query Language (SQL), and it is implemented by many systems, like MySQL, Microsoft SQL Server, PostgreSQL, etc. Although SQL has a standard, these implementations vary slightly in syntax. So we might encounter queries that work in MySQL but not in SQL Server.
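Here is a minimal sketch of those four operations on the student table. I am assuming SQLite (through Python's built-in `sqlite3` module) simply because it needs no setup; the column names `roll_no`, `name`, `age`, and `grade` are my own choice, and other systems may need slightly different syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Create: define the table and its four attributes.
conn.execute(
    """
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        age     INTEGER,
        grade   TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO student (roll_no, name, age, grade) VALUES (?, ?, ?, ?)",
    [
        (1, "Bruce Wayne", 26, "A"),
        (2, "Gandalf the Grey", 24000, "A+"),
        (3, "Harry Potter", 17, "A+"),
        (4, "James T. Kirk", 49, "A"),
    ],
)

# Access: find every student with an A+ grade.
top = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM student WHERE grade = ? ORDER BY roll_no", ("A+",)
    )
]
print(top)  # ['Gandalf the Grey', 'Harry Potter']

# Modify: change one student's grade.
conn.execute("UPDATE student SET grade = ? WHERE roll_no = ?", ("A+", 1))

# Delete: remove one student.
conn.execute("DELETE FROM student WHERE roll_no = ?", (4,))

remaining = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(remaining)  # 3

conn.close()
```

Notice that each statement says what data we want, not how to find it on disk.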
A NoSQL database is a non-relational DBMS that does not require a fixed schema/structure, generally avoids joins, and is easy to scale. NoSQL databases are mainly used for distributed data stores with extremely large data storage needs. Some common implementations include:
- Document-based: MongoDB, Cloud Firestore
- Key-value stores: Amazon SimpleDB, Redis
- Column Family stores: Cassandra, Hypertable
- Graph: OrientDB, FlockDB.
NoSQL databases offer simplicity of design, simpler scaling, and finer control over availability. They are often used in real-time web applications and big data. The reasons for the rising popularity of NoSQL databases are:
- Easy handling of different forms of data with different attributes and structures.
- Easy scalability.
- Often a faster pace of development compared to relational databases.
- The ability to handle large volumes of data at high speed.
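The first point, handling data with different attributes and structures, can be sketched with plain Python dictionaries. This is not the API of MongoDB or any other real NoSQL database; the `put` and `find` helpers and the sample fields are hypothetical, and the goal is only to show records of different shapes living side by side without a fixed schema.

```python
import json

# A toy document store: each key maps to one JSON-like document.
students = {}


def put(doc_id, document):
    """Store a JSON-compatible copy of the document under the given key."""
    students[doc_id] = json.loads(json.dumps(document))


def find(field, value):
    """Return the ids of all documents whose field equals the value."""
    return [doc_id for doc_id, doc in students.items() if doc.get(field) == value]


# No fixed schema: every document carries its own set of attributes.
put("1", {"name": "Bruce Wayne", "age": 26, "grade": "A"})
put("2", {"name": "Gandalf the Grey", "grade": "A+", "clubs": ["Chess"]})
put("3", {"name": "Harry Potter", "age": 17, "address": {"city": "London"}})

a_plus = find("grade", "A+")
with_age = sorted(doc_id for doc_id, doc in students.items() if "age" in doc)

print(a_plus)    # ['2']
print(with_age)  # ['1', '3']
```

In a relational table, the `clubs` list and the nested `address` would each need new columns or new tables before they could be stored.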
Companies like Google, Amazon, and Twitter collect terabytes of user data every single day. At that scale, an RDBMS that is harder to scale across many machines may not be the optimal choice for them.
To summarize, a database management system is a collection of related data together with a set of programs to access that data. The purpose of a DBMS is to provide users with an abstract view of the data. Hence, the system hides certain details of how the information is stored and maintained.
The types covered here are only a few, but they illustrate the fundamental concepts. As mentioned earlier, clients tend to choose or build databases that suit their own needs.
Thank you for reading! I will see you in the next article :)