Database transactions use the ACID property for data integrity and consistency.

The ACID Property For Database Transactions

Vincent T.
0xCODE
Published in
8 min readJul 6, 2020

--

Data integrity in a database is important when it comes to transactions. The data must be verified correct and should persist after being stored. When designing a database it is important to take the ACID methodology into consideration. The acronym, coined by Andreas Reuter and Theo Härder, stands for 4 properties developers should follow: Atomicity, Consistency, Isolation and Durability. This allows a way to validate a transaction from the time of data input to the final output, which is stored in a database. This meets the requirements for error handling in the event of unforeseen circumstances (e.g. power failures, hardware malfunction, etc.) and bugs in code that could affect the results of a transaction.

Atomicity

Multiple statements make up a transaction. These statements can be broken down into the parts of the transaction like the input, data processing, checking, storage and output. In order for a transaction to complete, all the steps in the process must be completed. It is like marking each step in the process with a checkmark if this were a list, indicating it has been completed. If a step in the process fails, then the transaction is considered a failure and should be rolled back or reset.

Atomicity makes sure that before a transaction is committed, it must have completed successfully. Otherwise it must fail completely so that nothing in the process is left incomplete. Take for example a transaction that requires the input of the number of items a person wants to order. If that process completes, but fails to compute the amount, then the transaction should not complete or else the person will not be charged for the items they ordered. There must be steps to verify and validate each step in the process.

Assume 3 Processes A, B and C. For a transaction t to complete, all processes must be validated and have completed.Therefore, it must follow an AND condition:t = ABC = 1The transaction cannot be an OR condition:t = A+B+C = 0

The reason a transaction must be validated is in order to keep the database in a consistent state. Any partial saving of data can lead to inconsistencies that can severely affect the accuracy of information. This also affects data integrity, so that is very important in enterprise business operations which relies on data accuracy for accounting purposes used for financial records and reporting.

Consistency

A transaction must be one completed operation in order for data to be consistent. Validation rules must be put in place to ensure data consistency. If there is no consistency during the validation process, then the entire transaction should be nullified and reset. This allows the transaction to return to known state which allows initialization back to the first step in the process.

Take for example a database field called SSNUMBER(10). It can only accept a 10 digit numeric value (0 to 9), which must be enforced by a routine that rejects anything lower than 10 digits and cannot be greater than 10 digits. When used in an online application form, a routine must be included to check the validity of the data input. It will be very problematic down the road if field accepted more than 10 digits and allowed alphanumeric characters when it should only be numeric.

Let A + B = 1000, where A and B are any integer that when added with each other results in 1000.A or B cannot be > 1000, but can be = to 1000 or 0.e.g. 
A = 0, B = 1000
A + B = 1000 (Accept)
A = 1000, B = 1000
A + B = 2000 (Reject)
I can create a simple checker to make sure that the input data for A or B can only be 0 to 1000, and when added together it cannot be > 1000.

When updating databases, routines called validation rules or check rules are implemented to make sure that the data is correct based on the conditions in the program. Like in the example of A + B = 1000, when committing data in a database it must follow the same property. If the results returned A + B = 2000, the transaction must be canceled and the system must be rolled back to a pre-transaction state.

Isolation

In order to prevent conflicts in the database which could result in inconsistencies, there must be process isolation. This ensures that process t1 and t2 don’t attempt to execute at the same time and write data to the database. This is the purpose for queuing a process in an execution pipeline, which is performed in hardware by the CPU. However, in the application or program there must also be control method to make sure that processes do not interfere with one another.

It may not be apparent, but the need for controlling transactions can affect business operations in finance and accounting applications. Let us say that we have a transaction t1 that adds $500 to Bob’s account from Alice’s account. Then there is another transaction t2 in which Bob transfers $1000 to Carol’s account. The following example shows the steps that need to be processed.

t1:
Subtract $500 from Alice's account
Add $500 to Bob's account
t2:
Subtract $1000 from Bob's account
Add $1000 to Carol's account

If both transactions attempt to run at the same time, it can lead to an error. That is because if one attempt to overwrite each other, it can affect the account balances of the users. If in t1 Bob only has $800 in his account prior to the amount he received from Alice, and t2 was processed first, then Bob would not be able to send $1000 to Carol because he does not have enough funds (-$200). If both transactions arrived at the same time, it is like subtracting and adding from Bob’s account at the same time.

In our example, Bob only has a balance b = $800 in his account prior to t1 and t2. When you subtract and add at the same time it results in the following:

t1:
b (balance) = $800
a (amount received) = $500
t1 = b + a = $1300t2:
b (balance) = $800
a (amount subtracted) = $1000
t2 = b - a = -$200

According to t1, Bob’s balance is $1300. In t2 the balance is -$200, meaning Bob did not have enough funds to transfer $1000 to Carol’s account. How do you reconcile two different balances that result from the transactions running at the same time? Bob’s balance will not be settled automatically when this occurs, so this is what leads to bigger problems.

This is why the transactions must be isolated properly based on timestamps and record locks. Obviously Bob will not be able to transfer $1000 prior to receiving $500 in his account from Alice. The $500 added to Bob’s current balance will be enough to allow him to transfer $1000 to Carol. Therefore, t1 must be processed before t2. Isolating the two transactions based on the proper order by timestamp and locking each record to prevent simultaneous updates effectively maintains consistency in data.

Durability

Persistent data storage and backup tests a database system’s effectivity. This is part of the durability property of databases. Database management systems no longer reside in silos of information that reside in office server rooms. They are now on the network, either in local data centers or on the cloud. Making sure that the data is properly secured and managed by an administrator is the most important part of database administration.

Administrators must be able to commit data from buffer to persistent storage and perform proper backup schedules. This is also a part of business continuity planning procedures in the event of a system failure. When the server crashes or when the network if offline, the data must be backed up so it can be recovered later or accessed from another system. Servers should also have backup UPS (Uninterruptible Power Supply) to allow administrators time to gracefully shutdown a system in the event of a power failure to prevent data corruption caused by servers suddenly losing power and not committing data properly.

Information is an important part of a business, and when it is lost it can be gone forever. Backup systems could malfunction, network connections could fail and host of other problems that administrators have to deal with. Transactions once recorded should persist in a non-volatile memory storage device with a backup (e.g. 3–2–1 Rule For Data Backup). The cost of lost data is much more than recovering data. An example here is losing a database of thousands of customers due to no backups. With a backup system the data can be recovered in a few hours at cost to an administrator or contractor. Without a backup system, the company will need to gather customer information again to create a new database which takes more time and higher costs. Modern databases can be recovered today using transaction logs, but a backup is a better guarantee for any type of data recovery.

Implementation

Here is a sample code (from Microsoft) of a SQL transaction with comments that describe how ACID properties are applied. This example shows how to roll back only the modifications made by a stored procedure if an active transaction is started before the stored procedure is executed.

USE AdventureWorks2012;  
GO
IF EXISTS (SELECT name FROM sys.objects
WHERE name = N'SaveTranExample')
DROP PROCEDURE SaveTranExample;
GO
CREATE PROCEDURE SaveTranExample
@InputCandidateID INT
AS
-- Detect whether the procedure was called
-- from an active transaction and save
-- that for later use.
-- In the procedure, @TranCounter = 0
-- means there was no active transaction
-- and the procedure started one.
-- @TranCounter > 0 means an active
-- transaction was started before the
-- procedure was called.
DECLARE @TranCounter INT;
SET @TranCounter = @@TRANCOUNT;
IF @TranCounter > 0
-- Procedure called when there is
-- an active transaction.
-- Create a savepoint to be able
-- to roll back only the work done
-- in the procedure if there is an
-- error.
SAVE TRANSACTION ProcedureSave;
ELSE
-- Procedure must start its own
-- transaction.
BEGIN TRANSACTION;
-- Modify database.
BEGIN TRY
DELETE HumanResources.JobCandidate
WHERE JobCandidateID = @InputCandidateID;
-- Get here if no errors; must commit
-- any transaction started in the
-- procedure, but not commit a transaction
-- started before the transaction was called.
IF @TranCounter = 0
-- @TranCounter = 0 means no transaction was
-- started before the procedure was called.
-- The procedure must commit the transaction
-- it started.
COMMIT TRANSACTION;
END TRY
BEGIN CATCH
-- An error occurred; must determine
-- which type of rollback will roll
-- back only the work done in the
-- procedure.
IF @TranCounter = 0
-- Transaction started in procedure.
-- Roll back complete transaction.
ROLLBACK TRANSACTION;
ELSE
-- Transaction started before procedure
-- called, do not roll back modifications
-- made before the procedure was called.
IF XACT_STATE() <> -1
-- If the transaction is still valid, just
-- roll back to the savepoint set at the
-- start of the stored procedure.
ROLLBACK TRANSACTION ProcedureSave;
-- If the transaction is uncommitable, a
-- rollback to the savepoint is not allowed
-- because the savepoint rollback writes to
-- the log. Just return to the caller, which
-- should roll back the outer transaction.

-- After the appropriate rollback, echo error
-- information to the caller.
DECLARE @ErrorMessage NVARCHAR(4000);
DECLARE @ErrorSeverity INT;
DECLARE @ErrorState INT;

SELECT @ErrorMessage = ERROR_MESSAGE();
SELECT @ErrorSeverity = ERROR_SEVERITY();
SELECT @ErrorState = ERROR_STATE();

RAISERROR (@ErrorMessage, -- Message text.
@ErrorSeverity, -- Severity.
@ErrorState -- State.
);
END CATCH
GO

Synopsis

Database management systems require data consistency for presenting accurate information in reports. This is important in today’s world of online transactions that use distributed databases. A system like ACID helps to provide a way to validate data for integrity in order to prevent inconsistencies in transactions.

--

--

Vincent T.
0xCODE

Blockchain, AI, DevOps, Cybersecurity, Software Development, Engineering, Photography, Technology