My First (Automated) Dashboard! The SQL [Part 1]

Tyler Trice
7 min read · Nov 29, 2022


The (almost) finished dashboard!

In case you missed the first blog post, here is a link to The Overview

I’m using SQL Server Management Studio 18 (SSMS), but this section is going to be all SQL and no server. Instead, the database will be created and housed on my local C drive. In addition to SSMS, I also downloaded SqlLocalDB.msi for the local database setup and MsSqlCmdLnUtils.msi for BCP, and I have Microsoft Excel on hand to get started.
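Once SSMS is connected to the default LocalDB instance, (localdb)\MSSQLLocalDB, a quick query confirms you’re pointed at the right place (just a sanity check; the instance name assumes the default LocalDB setup):

SELECT @@SERVERNAME [ServerName], @@VERSION [SqlVersion];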

First things first, I need to create a database to store my spending data. I used the SQL script below to create the PersonalFinanceV2 database. The original version can be found on mssqltips, the link for which is posted in the script. Let’s take a closer look at it:

-- Helpful Links
-- https://www.mssqltips.com/sqlservertip/7304/create-database-sql-server/
-- https://www.red-gate.com/simple-talk/databases/sql-server/database-administration-sql-server/sql-server-database-growth-and-autogrowth-settings/
USE master
GO
/*
Create database syntax with options for size & growth,
create log file with options for size & growth (Ideal to put log file on a different drive)
*/
CREATE DATABASE [PersonalFinanceV2]
ON
(NAME = N'PersonalFinanceV2', -- Logical name of the Data file
FILENAME = N'C:\Users\tyler\Databases\PersonalFinanceV2.mdf', -- The operating system file name
SIZE = 100MB, -- The size of the file
MAXSIZE = 5GB, -- The maximum size to which the file can grow, the default is MB
FILEGROWTH = 100MB) -- The automatic growth increment of the file, the default is MB
LOG ON
(NAME = N'PersonalFinanceV2_log', -- Logical name of the Log file
FILENAME = N'C:\Users\tyler\Databases\PersonalFinanceV2_Log.ldf', -- The operating system file name
SIZE = 5MB, -- The size of the file
MAXSIZE = 1GB, -- The maximum size to which the file can grow, the default is MB
FILEGROWTH = 5MB); -- The automatic growth increment of the file, the default is MB
GO

A database consists of two main parts: one to store the data and another to store the transaction log, which records changes made to the database. When creating the database, we specify these two parts as the primary data file (.mdf) and the transaction log file (.ldf).
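If you want to confirm the files landed where you expected, the sys.master_files catalog view will show you (note that size, max_size, & growth are reported in 8KB pages here, not MB):

SELECT name, physical_name, size, max_size, growth
FROM sys.master_files
WHERE database_id = DB_ID('PersonalFinanceV2');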

Okay, database created! Next up, I need a table to store my raw credit card data. I mentioned previously that I downloaded this in the form of CSV files. I need to take a look at one of the files and the data fields to get an idea of what the table needs to look like. Here’s what I’m working with:

Fields > Data Type:

  • TransactionDate > DATE
  • PostedDate > DATE
  • Card No. > SMALLINT
  • Description > NVARCHAR(*)
  • Category > NVARCHAR(*)
  • Debit > DECIMAL(20,2)
  • Credit > DECIMAL(20,2)

To determine the lengths for the NVARCHAR fields, I used an Excel formula (LEN) to get the length of each string and then sorted the formula column to find the largest values.
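(For what it’s worth, once the raw table below exists and the data is loaded, the same check can be done in T-SQL instead of Excel:)

SELECT
MAX(LEN(Description)) [MaxDescriptionLength],
MAX(LEN(Category)) [MaxCategoryLength]
FROM
CreditCardTransactionsRaw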

After determining the lengths for the Description & Category fields, I created the table CreditCardTransactionsRaw with the script below.

USE PersonalFinanceV2
GO

CREATE TABLE CreditCardTransactionsRaw (
TransactionDate DATE,
PostedDate DATE,
CardNumber SMALLINT,
Description NVARCHAR(50),
Category NVARCHAR(32),
TransactionOut DECIMAL(20,2),
TransactionIn DECIMAL(20,2)
);
GO

Now I have a database and a table to store my data, but it’s important to think ahead here about potential future problems. In the raw table’s current state there is no way to easily find specific records, so we need an ID field to identify and query unique records. Additionally, I prefer that my non-raw data tables have CreateDate & CreateUser fields as a way to better track how the data gets into the table.

With those things in mind, I created another table, CreditCardTransactions, with the script below. (I realized after the fact that I named my ID field ‘PurchaseID’ when not every record is technically a purchase, but oh well.)

USE PersonalFinanceV2
GO

CREATE TABLE CreditCardTransactions (
PurchaseID INT IDENTITY(1,1),
TransactionDate DATE,
PostedDate DATE,
CardNumber SMALLINT,
Description NVARCHAR(50),
Category NVARCHAR(32),
TransactionOut DECIMAL(20,2),
TransactionIn DECIMAL(20,2),
CreateUser NVARCHAR(32),
CreateDate DATETIME,
PRIMARY KEY (PurchaseID)
);
GO

This looks pretty similar to the last script, but there are two key differences other than the new fields:

  1. IDENTITY(1,1) > This creates a number that automatically increments when new records are added to the table. The number starts with the first parameter and increments by the second parameter.
  2. PRIMARY KEY (PurchaseID) > This sets the PurchaseID field to be the primary key of the table. This means that the values in the field must be unique as they will be used as the primary way to identify records.
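For example, if I had wanted the IDs to start at 1000 and count up by 5 instead (purely hypothetical; I stuck with 1 and 1), the field definition would look like this:

PurchaseID INT IDENTITY(1000,5), -- the first row gets 1000, the next gets 1005, and so on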

Now I have two tables! But how do I get the data from one into the other? Let me introduce you to triggers! Triggers are basically stored procedures that execute after a particular action has occurred in the database.

Okay, but how does a trigger help? Well, I want my data to write to the CreditCardTransactions table after I have imported it into CreditCardTransactionsRaw table. Using the script below I created a trigger that did just that.

USE PersonalFinanceV2
GO

CREATE TRIGGER dbo.InsertRawData
ON dbo.CreditCardTransactionsRaw
AFTER INSERT

AS
BEGIN
SET NOCOUNT ON;

DECLARE @CreateDate DATETIME
SET @CreateDate = GETDATE()

DECLARE @CreateUser NVARCHAR(20)
SET @CreateUser = 'InsertRawDataTRIGGER'

INSERT INTO dbo.CreditCardTransactions (TransactionDate, PostedDate, CardNumber, Description, Category,
TransactionOut, TransactionIn, CreateUser, CreateDate)

SELECT
INSERTED.TransactionDate
, INSERTED.PostedDate
, INSERTED.CardNumber
, INSERTED.Description
, INSERTED.Category
, INSERTED.TransactionOut
, INSERTED.TransactionIn
, @CreateUser
, @CreateDate

FROM
INSERTED -- "INSERTED" is a pseudo-table holding the new rows from the triggering INSERT

END

Notice how there is no data inserted into the PurchaseID field? That’s because the IDENTITY is going to generate the value for me when the record is inserted into the table.
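A quick way to see the whole chain working is to insert a made-up row into the raw table and check the destination table (the values below are fake; remember to delete any test rows afterwards):

INSERT INTO dbo.CreditCardTransactionsRaw (TransactionDate, PostedDate, CardNumber, Description, Category, TransactionOut, TransactionIn)
VALUES ('2022-11-01', '2022-11-02', 1234, N'TEST COFFEE SHOP', N'Dining', 4.50, NULL);

-- The trigger should have copied the row over with PurchaseID, CreateUser, & CreateDate filled in
SELECT TOP 1 PurchaseID, Description, CreateUser, CreateDate
FROM dbo.CreditCardTransactions
ORDER BY PurchaseID DESC;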

So, the trigger gets the data from one table into the other, but how do I get the data from the CSV file into the raw table in the first place? I need a stored procedure for that! But before I get to the stored procedure, let’s take a look at the CSV files again, this time in Notepad rather than Excel.

There are two important things to look out for here: the delimiter and the line endings. I can see from the screenshot above that the data is delimited with commas. As for the line endings, that little Unix(LF) at the bottom is telling me that the file has Unix row terminators. Why does this matter? Because it directly affects the import script and whether the data will be read properly on import. Below is the script I used to import the data.

USE PersonalFinanceV2
GO

CREATE PROCEDURE dbo.CreditCardRawImport(@Path NVARCHAR(256), @FileName NVARCHAR(256))
AS
BEGIN

DECLARE @FullFilePath NVARCHAR(512)
DECLARE @ErrorFilePath NVARCHAR(512)

SET @FullFilePath = CONCAT(@Path,
'\',
@FileName);

SET @ErrorFilePath = CONCAT(@Path, '\raw_insert_bad_data.txt')

DECLARE @SqlCommand NVARCHAR(MAX)
SET @SqlCommand =
'
BULK INSERT dbo.CreditCardTransactionsRaw
FROM ''' + @FullFilePath + '''
WITH (
FIELDTERMINATOR = '',''
, ROWTERMINATOR = ''0x0a''
, FIRSTROW = 2
, FIRE_TRIGGERS
, ERRORFILE = ''' + @ErrorFilePath + '''
)
'
EXEC (@SqlCommand)

END

Don’t worry, it’s not as scary as it looks! There are a few things I want to cover in this script:

  • Field Terminator > This is my file’s data delimiter.
  • Row Terminator > This is my file’s line endings; as I mentioned above, the file is Unix(LF). But what if it was Windows(CRLF)? In that case I would use \n rather than 0x0a. (This will come back up in a future post, so just take a mental note of it.)
  • First Row > This determines the first row where data will be read from the file. I use this to skip the file’s header row; it works, but I don’t think it’s technically meant for that.
  • Fire Triggers > This needs to be specified so that the trigger I created will actually fire. The default setting for bulk insert does not set off triggers.
  • Error File > This will create a file with any errors from the import process. It’s not necessary but it can be very helpful for identifying issues with the imported data.
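Calling the procedure then looks something like this (the folder and file name here are just placeholders for wherever the CSV files live):

EXEC dbo.CreditCardRawImport
@Path = N'C:\Users\tyler\Downloads',
@FileName = N'creditcard_transactions.csv';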

With the stored procedure created, I had everything I needed to import my data into the database! Here is a look at all the pieces in this post put together:

With my data now in a structured table I can perform some basic data analysis! Let’s take a look at some things like the range, min & max values, and totals by category for this credit card data.

USE PersonalFinanceV2
GO

SELECT
TOP 1
Description,
TransactionOut [BiggestPurchase]
FROM
CreditCardTransactions
ORDER BY
TransactionOut DESC

-----------------------------------

SELECT
TOP 1
Description,
TransactionOut [SmallestPurchase]
FROM
CreditCardTransactions
WHERE
TransactionOut IS NOT NULL
ORDER BY
TransactionOut ASC

---------------------------------

SELECT
max(TransactionOut) [BiggestPurchase],
min(TransactionOut) [SmallestPurchase],
min(TransactionDate) [EarliestPurchaseDate],
max(TransactionDate) [LatestPurchaseDate],
DATEDIFF(day,min(TransactionDate),max(TransactionDate)) [NumberOfDays],
avg(TransactionOut) [MeanPurchase],
sum(TransactionIn) - sum(TransactionOut) [NetEarnings]
FROM
CreditCardTransactions

---------------------------------


;WITH CategoryTotals AS (

SELECT
Category,
ISNULL(sum(TransactionOut),0) [TotalSpent],
ISNULL(sum(TransactionIn),0) [TotalCredited]
FROM
CreditCardTransactions
GROUP BY
Category


UNION

SELECT
'Total',
sum(TransactionOut),
sum(TransactionIn)
FROM
CreditCardTransactions
)
SELECT
*
FROM
CategoryTotals
ORDER BY
Category ASC
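One small quirk with that last query: ORDER BY Category sorts the ‘Total’ row alphabetically in with the real categories. If you’d rather pin it to the bottom, a CASE expression in the ORDER BY does the trick (a minor tweak to the final SELECT above, not what I originally ran):

SELECT
*
FROM
CategoryTotals
ORDER BY
CASE WHEN Category = 'Total' THEN 1 ELSE 0 END,
Category ASC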

That’s going to wrap up Part 1! I honestly thought I would be able to detail all the SQL in this blog post, but it’s getting pretty long, so I’ll stop here. In my next post I’ll cover exporting the data from SSMS, using PowerShell to convert it to an Excel file, and using PowerShell to format the data into a table for my dashboard.

Thanks for reading! Feel free to post any questions or feedback!


Tyler Trice

IT guy who always has Excel and SSMS open. I spend my time writing SQL, Python, & PowerShell scripts and QCing reports. I also enjoy soaking up the Florida sun.