A2 SSD — Datebase Concepts

Francis
Revision Notes
Published in
3 min readMay 10, 2016

Using files instead of databases

  • To share data between applications or across organisations we may need to create programs to merge files which store data in a different format.
  • Changing the file format will require changes to the program that reads and writes the files.
  • Data may be duplicated which is a cause of inconsistency which affects data integrity.
  • If two or more users access the same files simultaneously / concurrently it is possible that the access will interfere with one another.

Data integrity is the validity and consistency of the data.

The Database Approach

A database has a single repository of data which reduces data duplication/redundancy and hence storage space is minimised and the risk of inconsistency is reduced.

Data sharing is easier which allows more users to derive meaningful information via access to more data.

Database is self-describing because it stores information about the structure of the data. Allows for a degree of data independence as minor modifications can be made without having to change the connecting applications.

Database is designed to meed the needs of an organisation.

A Database Management System (DBMS) contains programs to:

  • Define the database using a data definition language. Definition of database is stored separately from the actual data.
  • Insert, Update, Delete and Retrieve data using the ‘Data Manipulation Language’.
  • Users and programs may interface with the database via the DBMS, improving the productivity of users and programmers by removing the need to write complex code.

DBMS manages and controls access to data:

  • Security — Groups, usernames and passwords can restrict access.
  • Data Integrity — Erroneous data will now affect any users that share that data. Constraints (rules the database can’t violate) maintain the validity and consistency of data.
  • Concurrency — DBMS can handle concurrency.
  • Backup and Recovery — DBMS enables backups.

Duplicated data may be redundant. Redundant data can be deduced from other data in the database, it can be deleted without information being lost.

Data is sometimes duplicated to form relationships, this isn’t redundant as it provides information about the relationship and so shouldn’t be deleted.

Data Models

Fixed format and fixed number of fields.

  • Hierarchical model
  • Network Model
  • Relational Model

Relational Model — Made up of tuples that correspond to rows of a table. Each tuple is distinct (No duplicate values). Each column of the table corresponds to an attribute.

Each cell contains exactly one atomic value; the table must be in at least first normal form.

A relation is an unordered collection-the order of columns / attributes is irrelevant.

To uniquely identify each tuple, an attribute or set of attributes are used as a primary key.

No attribute of a primary key can be null.

If more than one attribute is used the primary key is a composite primary key.

Relations

Tables/relations are linked using one-to-many relationships by posting a copy of the primary key of the parent table on one side of the relationship as a foreign key into the child table at the many side of the relationship.

Referential Integrity — applies to these foreign keys. Value of the foreign key must match a primary key value of some row in the parent table.

Hierarchical Model — The data in this model is organised into a tree. At the top/root of the tree there is only one record type. Each record type can be the parent of many different record types.

Every child has one parent. Only one parent is ever allowed.

Very restrictive and only suits data that can be modelled in a hierarchical fashion (from one to many). Many to many relationships cannot be implemented without compromising the model.

Records are often linked in relationships using pointers.

Network Model — Hierarchical model is like a restricted network model. In Network Model each record can have many owners of a different record type.

Network model is more flexible. Allows more connections between records. Easier to support many to many relationships after decomposition, like the relational model.

Records can be linked using pointers.

--

--