How a tug of war on a Database deadlock led me to seek answers to my questions?
A decade ago, I made a short film about Deadlock, over the years I have seen many instances of Deadlock in the Java layer, investigated it and resolved it.
Fortunately or unfortunately, I had never seen a Deadlock happening while using an SQL server, may be the projects that I worked on had world class DBAs or had less intensive concurrent write operation or may it be the best service layer’s transaction code or no one noticed it or its just a magic.
Contents
- Why this now?
- Tug of war
- Starting over
- Where are these data rows stored?
- Page
- Extents
- Table
- HoBt
- Partitions
- File
- File group
- Database - Taking a second look
- Locks
- Who takes care of these lock mechanism?
- When does it acquire lock?
- Why lock on different objects?
- What are the different mode of locks acquired on objects? - The third look
Why this now?
Last week, we hit with a SQL Deadlock, the SQL Server started throwing error 1205 and rolled back our transaction.
That’s when, I realized I have almost forgotten the concept of Database deadlock which was part of my 2nd year database class in my Bachelors.
The curious developer in me started bugging me up to brush up my memory and check see if there is anything left in my memory.
You know what the result would have been after performing a seek operation into the nook and corner of my memory,
I do get it, its been well more than a decade.
Tug of war
Keeping the human memory management thing aside, this happened.
Me: Let me see how to replicate it, so I can fix it.
Curious developer me: You might not know what’s happening behind the scene.
Me: I know what a deadlock is, I can handle it.
Curious developer me: You would do it wrong.
Me: I know what I am doing.
Curious developer me: No, you don’t.
Me: Yes, I do.
Me: Well…, What do you want me to do now?
Curious developer me: Start over.
If you are just as me skip to the actual modes or if you are just one another curious developer then start over.
Starting over
So, I tried replicating a small deadlock use case,
Me: It looks ok, but there are lot of unknowns in here
Curious developer me: Yeah! I know. Wouldn’t it be nice if you actual know what these are.
Me: May be, ah! ah!
Ok.
Let me give it a try.
There is very good documentation on internals and architecture at learn at Microsoft. I will just go over what I wanted to know here.
Where are these data rows stored?
Well for starters,
- indexed row data are stored in B+ tree
- non indexed row data is stored in a heap.
Curious developer me: Oh! you did know that?
Me: Hey! I know how to write to a table and read from a table, do a join, procedures, functions and what not. Never had to know the data structure behind it, I didn’t care.
Curious developer me: Well, now you know.
So, they are stored on heap and B+ tree, but what does the actual node contain?
Yes, it is like a Christmas tree with a lot of cards on it, but they call it Pages.
So, each node in the heap or a B+ tree is a Page.
Page
A Page can be any of the following type,
Index page
Like a physical book having an index page pointing to the content,
the database engine maintains the indexes pointing to other index page or a data page.
The above diagram represent, root and intermediate page pointing to other pages, and leaf nodes contains the actual data pages.
Data page
A typical data page contains the actual data. They have the following information,
— Page header row contains,
- object_id — page owned by which object. eg.dbo.Persons Table’s Object ID
- index_id — ID of the index
- Heap id is 0
- Clustered id is 1
- Non-Clustered id is > 2 - partition_id — if the table is not explicitly partitioned, database engine will assign a partition on it own.— Data row contains varchar, int, text column data
Each data page can hold up to 8kb of data. So, if the row’s column value size gets updated and is larger, then it is moved to different page dynamically.
Me: Now, I get the real reason behind, why select LargeEmailBodyMessage from EmailTable where EmailId=5 took a little longer than select EmailFrom from EmailTable where EmailId=5.
Rather, if we feel that the column is not searchable we can make the column as binary for faster retrieval.
Finally in the diagram, the row offset table helps to seek the rows faster as it contains a memory pointer to the actual row calculated based on bytes starting from the first byte of the page.
System page
It stores metadata information.
The page may also contain other page type such as
- Global Allocation Map — for tracking what extents are allocated
- Page Free Space — for tracking free space
- Index Allocation Map — for listing the extents used by a table or index used by allocation unit, has three types
- IN_ROW_DATA — holds a partition of heap or index
- ROW_OVERFLOW_DATA — holds data that exceeds the 8kb page limit
- LOB_DATA — holds large object like varchar(max) - Bulk/Differential Changed Map — for tracking modification in extents
Curious developer you: If you are interested in getting deep into IAM this post by Liwei Yin might be helpful.
The following picture shows a typical db with multiple pages, as we see the page_id changes and the slot_id starts again from 0.
Check the following post for more info on the query
https://medium.com/p/74cb5ae86e0c/edit#1ec1
Extents
And 8 of these consecutive pages form extents like a chapter in a book, they could have the data from the same table or a different table.
Here is the post explaining the query executed,
https://medium.com/p/74cb5ae86e0c/edit#9c8b
Table
Unlike our table that we see in here in SSMS,
The typical table data could be spanned over multiple pages.
HobT
Pronounced as Hobbit is a collection of data and index pages within a single Partition. They are in a 1–1 relation with Partition.
Partitions
Every table or view is composed of at least one partition whether or not we explicitly specify the table partitions.
The following are the partition created by SQL server, where each partition_id is unique in the database.
Tables could be split into partitions and could be stored across file group. Each partition has either a heap or a B-tree as in the above picture represented by hobt_id.
The query shows how to get the partition_id of a object(table)
https://medium.com/p/74cb5ae86e0c/edit#ce6a
Database file
These pages/extents are stored in the database file like a book, which is stored on the disk.
Here,
- the primary data file contains startup information and pointer to other files (.mdf) and also contains object and data.
- the secondary data file contains more data (.ndf) in case of complex tables.
- the transaction log file contains log transactions (.ldf)
A typical table would at least have a Primary Data file and Transaction log. A complex table might have a Primary Data file and multiple additional Secondary Data file along with Transaction log.
Filegroup
They are the collection of primary files and secondary files grouped together. Where multiple secondary files can be in different disk drive to enhance performance.
Say, when a table is created within the filegroup, the queries to the table are spread across multiple disk.
- MyDB_Prm.mdf is the Primary file where the tables that are created without specifying any Filegroup will be stored.
- Say, when us use the following Query to create the table,
CREATE TABLE MyTable
( cola int PRIMARY KEY,
colb char(8) )
ON MyDB_FG1;
then the table will be created in MyDB_FG1 filegroup.
Database
The following will give you the DatabaseId of the current database,
SELECT DB_ID() AS [Current Database];
Now we have all the required details to map a page to a partition.
The following query returns page information from the database given (DatabaseId, FileId, PageId, Mode — DEFAULT or DEATAILED)
SELECT *
FROM sys.dm_db_page_info (8, 1, 392, DEFAULT);
The above function replaces need to use DBCC PAGE
Curious developer me: Now you know the basic building blocks that you need to know.
Me: Yes, I am getting there.
Taking a second look
The same diagram, a second look now,
So, the rectangle boxes are the resources. And by the process id we could infer that the oval ones are the concurrent processes involved in the deadlock.
Me: I infer some of it.
Curious developer me: Keep going! You are doing a good job.
Locks
The database engine could acquire different lock mode say Shared or Update on different objects like a row, page, table during a transaction to make sure it abides ACID properties.
Who takes care of these lock mechanism?
The lock manager takes care of it. The application do not have to request the locks directly.
When does it acquire lock?
The database engine does it when a concurrent read or write happens.
Why lock on different objects?
Different objects are locked at different levels to minimize the cost of locking and to enhance concurrency.
Say, if the where condition has filtered match on only a few rows in a page of a table without a indexed column, it would only need a lock on those rows not the page or not the entire db, so it would acquire a RID lock.
Thus paving way for other process to access data from the Page or Extent.
What are the different mode of locks acquired on objects?
So the read or write determines what lock mode it is acquired.
The third look
Now taking a third look at the diagram
Me: Things are much better now.
Curious developer me: What did you understand from the above graph
Me: To say, I am not a database expert, from what I have read and I could see from the graph is Process 69 is the owner of the lock on the associatedObjectId, also I see the the Process 78 is the owner of the same associatedObjectId.
Wait what? How could two exclusive locks be acquired on the same object?
Curious developer me: Well, now is the time for executing few deadlock cases and understand it and try to make connections.
The following article covers the exercise to understand at least the RID Lock and Key Lock which seems to be like the most common ones.
Hope you would also have had a similar conversation as a Developer!
End of the day, we can’t learn everything, its a ocean. It is just our curiosity trying to understand at least what’s around the path we navigate thru.
Happy learn-ing!