I was recently asked about visualizing data stored in SQL Server using the ELK Stack. While the client eventually opted for a local developer, I decided to throw a quick article together.

I grabbed some sample CSV data from the SpatialKey website (https://support.spatialkey.com/spatialkey-sample-csv-data/), specifically the Sales Transactions sample, and loaded it into SQL Server 2016. While I’ll be the first to admit that this is a contrived example, the potential client called for the data to be sourced from SQL Server.

SQL Server — create a table:

Create Table [SalesTransactions2009].dbo.[Sales]
(
    Transaction_date DATETIME,
    Product NVARCHAR(50),
    Price DECIMAL(12,2),
    Payment_Type NVARCHAR(50),
    [Name] NVARCHAR(50),
    City NVARCHAR(50),
    State NVARCHAR(2),
    Country NVARCHAR(50),
    Account_Created DATETIME,
    Last_Login DATETIME,
    Latitude DECIMAL(12,6),
    Longitude DECIMAL(12,6)
);
GO

and import the data using BULK INSERT:

BULK INSERT [SalesTransactions2009].dbo.[Sales]
FROM 'C:\SalesJan2009\SalesJan2009.csv'
WITH (
    FIRSTROW = 2,             -- skip the header row
    FIELDTERMINATOR = ',',    -- CSV field delimiter
    ROWTERMINATOR = '\n',     -- row delimiter
    ERRORFILE = 'C:\SalesJan2009\ErrorRows.csv',
    TABLOCK
);
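
A quick sanity check on the import might be a simple row count against the table created above:

-- Confirm the CSV rows were imported
SELECT COUNT(*) AS ImportedRows
FROM [SalesTransactions2009].dbo.[Sales];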

Preparing and exporting data from SQL Server for ingestion into the ELK stack:

There are three options (at least, the ones I can think of off-hand):

  1. LogStash
  2. JSON into ElasticSearch
  3. FileBeat

In the interest of brevity, I’m only showing the LogStash option here and will leave the others as an exercise for the reader.

Export from SQL Server:

Create two stored procedures: one to export to CSV (for LogStash) and the other to export to JSON (for direct ElasticSearch ingestion). There are myriad examples on the web. If you have a SQL Server version older than 2016, you can structure an XML export to resemble JSON.
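
A minimal sketch of the JSON side, using the FOR JSON support added in SQL Server 2016 (the procedure name and output shape here are placeholders, not the client’s actual code):

CREATE PROCEDURE dbo.usp_ExportSalesAsJson
AS
BEGIN
    SET NOCOUNT ON;
    -- FOR JSON PATH renders the result set as a single JSON array
    SELECT Transaction_date, Product, Price, Payment_Type, [Name],
           City, State, Country, Account_Created, Last_Login,
           Latitude, Longitude
    FROM [SalesTransactions2009].dbo.[Sales]
    FOR JSON PATH;
END
GO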

Create a scheduled export (a number of ways exist) and let it write a CSV file. I realize I could just have used the original CSV file, but then SQL Server would not be part of the picture at all.
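
For the CSV side, one option is a bcp queryout call wrapped in a SQL Agent job or a Windows scheduled task. A sketch, assuming a trusted connection to a local default instance (server name and paths are assumptions):

rem Export the Sales table to CSV for LogStash to pick up
bcp "SELECT Transaction_date, Product, Price, Payment_Type, [Name], City, State, Country, Account_Created, Last_Login, Latitude, Longitude FROM [SalesTransactions2009].dbo.[Sales]" queryout "C:\sales_jan2009\SalesJan2009.csv" -c -t, -S localhost -T

Whatever mechanism you pick, keep the date formatting of the export in line with the date format the ElasticSearch mapping below expects.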

Create the ElasticSearch index:

{
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "type": {
      "properties": {
        "location": {
          "type": "geo_point"
        },
        "Transaction_date": { "type": "date", "format": "MM/dd/YY HH:mm" },
        "Account_Created": { "type": "date", "format": "MM/dd/YY HH:mm" },
        "Last_Login": { "type": "date", "format": "MM/dd/YY HH:mm" }
      }
    }
  }
}
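
To apply this, PUT it when creating the index. A sketch, assuming ElasticSearch is listening on localhost:9200, the index is to be named sales, and the mapping above is saved as sales_index.json (all three are assumptions):

curl -XPUT "http://localhost:9200/sales" -H "Content-Type: application/json" -d @sales_index.json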

Then create the logstash.conf file:

input {
  file {
    path => "C:/sales_jan2009/SalesJan2009.csv"
    start_position => "beginning"
    sincedb_path => "C:/sales_jan2009/sinceDb"
  }
}
filter {
  csv {
    separator => ","
    columns => ["Transaction_date","Product","Price","Payment_Type","Name","City","State","Country","Account_Created","Last_Login","Latitude","Longitude"]
    remove_field => ["message"]
  }
  mutate {
    convert => { "Longitude" => "float" }
    convert => { "Latitude" => "float" }
    convert => { "Price" => "float" }
  }
  mutate { . . .
(Remainder left as an exercise…)

With the file input tailing the data file for changes (the sincedb file records how far LogStash has read, so restarts do not re-ingest everything), one can then start ElasticSearch, Kibana and LogStash.
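
Starting LogStash against this config might look like the following (install and config paths are assumptions; adjust for your environment). Raising the log level to debug is what makes the per-event pipeline output below visible:

cd C:\logstash
bin\logstash.bat -f C:\sales_jan2009\logstash.conf --log.level debug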

If one is lucky, the LogStash log will have entries such as this:

[2017-06-06T15:21:18,342][DEBUG][logstash.pipeline ] output received { 
"event"=>{"Product"=>"Product1", "Latitude"=>47.60639, "City"=>"Seattle ",
"Account_Created"=>"1/8/09 18:59", "Payment_Type"=>"Visa",
"Longitude"=>-122.33083, "Name"=>"Reto", "
path"=>"C:/sales_jan2009/SalesJan2009.csv",
"@timestamp"=>2017-06-06T20:21:15.247Z, "Price"=>1200.0,
"Last_Login"=>"2/18/09 15:51", "State"=>"WA", "@version"=>"1",
"host"=>"SP01", "Country"=>"United States",
"location"=>"47.60639,-122.33083",
"Transaction_date"=>"1/9/09 15:03"}}

Switching to Kibana, one can then create individual visualizations and place them on a dashboard.

Feel free to ping me with commercial requests. I work on Windows, Linux and occasionally other platforms. My forte is Enterprise Search (ElasticSearch, SOLR, SharePoint), Big Data, and SharePoint.

Steph van Schalkwyk step@remcam.net
