Elasticsearch and Logstash: A Dynamic Duo for MSSQL Server Data Search

Sachidanand Upadhyay
Simform Engineering
7 min read · Nov 23, 2023

Efficiently index, search, and visualize data with this winning combination.

Elasticsearch, an open-source search and analytics engine, is a powerful tool often used for full-text search and real-time data analysis. In this blog post, we will explore how Elasticsearch can be combined with Logstash to search and query data from Microsoft SQL Server. Before we dive into the setup, let’s briefly understand what Elasticsearch is and why it’s useful.

What is Elasticsearch?

Elasticsearch is a powerful and versatile open-source search and analytics engine. It is often used for full-text search and real-time data analysis.

It offers several key features:

  • Search Engine: Elasticsearch uses a distributed, multi-node architecture to search through massive datasets rapidly.
  • Full-Text Search: It can analyze and index text data, making it easy to search for specific words or phrases within documents.
  • Real-Time: Elasticsearch provides real-time search capabilities, which means it can index and retrieve data almost instantly as it’s added or updated. This makes it suitable for applications where up-to-date information is crucial, like e-commerce product catalogs or log analysis.
  • Scalability: Elasticsearch can handle large amounts of data across multiple nodes. It can easily expand to accommodate growing datasets or increased search traffic.
  • Schema-less: It’s schema-less, meaning you don’t need to define a rigid structure for your data before indexing it. This flexibility is particularly useful for semi-structured or unstructured data (the short sketch after this list shows this in practice).
  • Multi-Purpose: Beyond search, Elasticsearch serves various other purposes, including log and event data analysis, business intelligence, and monitoring.
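To make these features concrete, here is a minimal sketch of indexing and searching a document with the NEST client (the same .NET client this post uses later). The Article type, the articles index name, and the localhost URL are illustrative assumptions, not part of the project we build below:

using System;
using Nest;

// Illustrative document type. Note there is no upfront schema definition on the
// Elasticsearch side; field mappings are inferred when the document is indexed.
public class Article
{
    public int Id { get; set; }
    public string? Title { get; set; }
}

public static class ElasticQuickStart
{
    public static void Run()
    {
        // Assumes a local single-node cluster on the default port.
        var settings = new ConnectionSettings(new Uri("http://localhost:9200")).DefaultIndex("articles");
        var client = new ElasticClient(settings);

        // Index a document; it becomes searchable almost immediately.
        client.IndexDocument(new Article { Id = 1, Title = "Getting started with Elasticsearch" });

        // Full-text search over the title field.
        var response = client.Search<Article>(s => s
            .Query(q => q.Match(m => m.Field(f => f.Title).Query("elasticsearch"))));
        Console.WriteLine($"Hits: {response.Documents.Count}");
    }
}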

Prerequisites

Before setting up Elasticsearch and Logstash for MSSQL Server data search, you’ll need the following prerequisites:

  • Docker Desktop: Install Docker Desktop from the official website; it is available for any major operating system.
  • SQL Server JDBC Driver: Download the JDBC driver from Microsoft’s official website or the GitHub link provided.
  • VS Code Editor: You can choose your preferred code editor, but Visual Studio Code is a popular choice.

Setting up the project

Let’s create a POC for Elasticsearch with MSSQL Server and Logstash.

Project Structure

  1. Create a folder structure that organizes your project. In this example, we have three folders: logstash, mssql, and ElasticSearchRestApi.

2. Inside the logstash folder, create a config folder and add a logstash.config file with the following configuration for connecting to MSSQL Server and Elasticsearch:

input {
  jdbc {
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    jdbc_connection_string => "jdbc:sqlserver://sql1:1433;encrypt=false;databaseName=AdventureWorks2017;user=sa;password=qweQWE123"
    jdbc_user => "sa"
    jdbc_password => "qweQWE123"
    jdbc_paging_enabled => true
    # Six-field cron expression (the first field is seconds).
    schedule => "*/59 * * * * *"
    statement => "SELECT [ProductID] AS id, [Name], [ProductNumber], [MakeFlag], [FinishedGoodsFlag], [Color], [SafetyStockLevel], [ReorderPoint], [StandardCost], [ListPrice], [Size], [SizeUnitMeasureCode], [WeightUnitMeasureCode], [Weight], [DaysToManufacture], [ProductLine], [Class], [Style], [ProductSubcategoryID], [ProductModelID], [SellStartDate], [SellEndDate], [DiscontinuedDate], [rowguid], [ModifiedDate] FROM [Production].[Product]"
    # The SELECT aliases ProductID to id, and lowercase_column_names is enabled,
    # so the tracking column must be the lowercased alias.
    tracking_column => "id"
    tracking_column_type => "numeric"
    use_column_value => true
    lowercase_column_names => true
  }
}

output {
  # Echo each event as JSON for debugging.
  stdout { codec => json_lines }
  elasticsearch {
    hosts => ["elasticsearch:9200"]
    index => "inx_product"
    # Upsert keyed on the id field, so re-running the query updates documents instead of duplicating them.
    doc_as_upsert => true
    document_id => "%{id}"
  }
}

3. Create a Dockerfile in the root of the logstash folder to build the Logstash container.

FROM docker.elastic.co/logstash/logstash:7.13.2

RUN rm -f /usr/share/logstash/pipeline/logstash.conf

USER root

COPY mssql-jdbc-11.2.3.jre8.jar /usr/share/logstash/logstash-core/lib/jars/mssql-jdbc-11.2.3.jre8.jar

4. Ensure the mssql-jdbc-11.2.3.jre8.jar file is placed in the logstash root folder alongside the Dockerfile and the config folder.

MSSQL Configuration

  1. Create a Dockerfile in the root of the mssql folder to build the MSSQL Server container:

FROM mcr.microsoft.com/mssql/server:2019-GA-ubuntu-16.04

ENV SA_PASSWORD "qweQWE123"
ENV ACCEPT_EULA "Y"

USER root

COPY AdventureWorks2019.bak /var/opt/mssql/backup/AdventureWorks2019.bak

RUN chown -R mssql:root /var/opt/mssql/backup/AdventureWorks2019.bak

USER mssql

# Start SQL Server in the background during the build, wait for it to boot,
# then restore the AdventureWorks backup into the AdventureWorks2017 database.
RUN /opt/mssql/bin/sqlservr --accept-eula & (echo "awaiting mssql bootup for 15 seconds" && sleep 15 && echo "restoring.." && /opt/mssql-tools/bin/sqlcmd -S localhost -U SA -P 'qweQWE123' -Q "RESTORE DATABASE AdventureWorks2017 FROM DISK = '/var/opt/mssql/backup/AdventureWorks2019.bak' WITH MOVE 'AdventureWorks2017' TO '/var/opt/mssql/data/AdventureWorks2019.mdf', MOVE 'AdventureWorks2017_Log' TO '/var/opt/mssql/data/AdventureWorks2019.ldf'")

Note: This example uses the AdventureWorks database for demonstration purposes, restoring it from a .bak backup inside the container.

ElasticSearchAPI Configuration

  1. Create the ElasticSearchAPI REST API project:

dotnet new webapi --name ElasticSearchAPI

2. Install the NEST NuGet package:

dotnet add package NEST --version 7.17.5

3. Configure the Program.cs file to register the Nest client:

using ElasticSearchPOC;

var builder = WebApplication.CreateBuilder(args);

// Register the Elasticsearch client (the AddElasticSearch extension is defined in the next step).
builder.Services.AddElasticSearch(builder.Configuration);

// Add services to the container.
builder.Services.AddControllers();
// Learn more about configuring Swagger/OpenAPI at https://aka.ms/aspnetcore/swashbuckle
builder.Services.AddEndpointsApiExplorer();
builder.Services.AddSwaggerGen();

var app = builder.Build();

// Configure the HTTP request pipeline.
if (app.Environment.IsDevelopment())
{
    app.UseSwagger();
    app.UseSwaggerUI();
}

app.UseHttpsRedirection();

app.UseAuthorization();

app.MapControllers();

app.Run();

4. Create a static class file, ElasticSearchExtension.cs, for the Elasticsearch connection and configuration:

using ElasticSearchPOC.POCO;
using Nest;

namespace ElasticSearchPOC
{
    public static class ElasticSearchExtension
    {
        public static void AddElasticSearch(this IServiceCollection services, IConfiguration configuration)
        {
            var baseUrl = configuration["ElasticSettings:baseUrl"];
            var index = configuration["ElasticSettings:defaultIndex"];

            // Replace the fingerprint and credentials with the values from your own cluster.
            var settings = new ConnectionSettings(new Uri(baseUrl ?? ""))
                .PrettyJson()
                .CertificateFingerprint("6b6a8c2ad2bc7b291a7363f7bb96a120b8de326914980c868c1c0bc6b3dc41fd")
                .BasicAuthentication("elastic", "JbNb_unwrJy3W0OaZ07n")
                .DefaultIndex(index);
            settings.EnableApiVersioningHeader();
            //AddDefaultMappings(settings);

            var client = new ElasticClient(settings);
            services.AddSingleton<IElasticClient>(client);
            CreateIndex(client, index);
        }

        private static void AddDefaultMappings(ConnectionSettings settings)
        {
            // Exclude fields that should not be sent to the index.
            settings.DefaultMappingFor<Product>(m => m.Ignore(p => p.Size).Ignore(p => p.Weight));
        }

        private static void CreateIndex(IElasticClient client, string indexName)
        {
            // Auto-map the Product type; the call returns an error response if the index already exists.
            var createIndexResponse = client.Indices.Create(indexName, index => index.Map<Product>(x => x.AutoMap()));
        }
    }
}

5. Add the Elasticsearch settings to the appsettings.json file:

  "ElasticSettings": {
    "baseUrl": "http://localhost:9200/",
    "defaultIndex": "inx_product"
  }

Note: The Elasticsearch base URL also needs to be configured in the logstash.config file (there it is the hosts entry, which uses the container hostname elasticsearch:9200).

6. Add a controller and a POCO class.

This example places the controller in the Controllers folder.

using ElasticSearchPOC.POCO;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Nest;

namespace ElasticSearchPOC.Controllers
{
    [Route("api/[controller]")]
    [ApiController]
    public class SearchController : ControllerBase
    {
        private readonly IElasticClient _elasticClient;
        private readonly ILogger<SearchController> _logger;

        public SearchController(ILogger<SearchController> logger, IElasticClient elasticClient)
        {
            _logger = logger;
            _elasticClient = elasticClient;
        }

        [HttpGet]
        public async Task<IList<Product>> GetSearch(string keyword)
        {
            // Wildcard query_string search across all indexed fields, capped at 100 hits.
            var result = await _elasticClient.SearchAsync<Product>(s => s
                .Query(q => q.QueryString(d => d.Query("*" + keyword + "*")))
                .Size(100));
            return result.Documents.ToList();
        }
    }
}
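Once the containers and the API are running, this endpoint can be exercised with any HTTP client. Here is a minimal sketch using HttpClient; the port 7042 is an illustrative assumption, so check launchSettings.json for the port your project actually uses:

using System;
using System.Net.Http;

// Calls GET /api/search?keyword=... and prints the raw JSON response.
using var http = new HttpClient();
var json = await http.GetStringAsync("https://localhost:7042/api/search?keyword=bike");
Console.WriteLine(json);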

Create a POCO class (Product.cs) or model class for mapping the response:

using Nest;

namespace ElasticSearchPOC.POCO
{
    public class Product
    {
        // Lowercase "id" matches the column alias emitted by the Logstash pipeline
        // (lowercase_column_names is enabled there).
        public int id { get; set; }
        public string? Name { get; set; }
        public string? ProductNumber { get; set; }
        public string? Color { get; set; }
        public decimal? StandardCost { get; set; }
        public decimal? ListPrice { get; set; }
        public string? Size { get; set; }
        public decimal? Weight { get; set; }
    }
}


Docker-compose

1. Create a docker-compose.yaml file in the root folder to build and run the containers for Logstash, MSSQL, Elasticsearch, and Kibana:

version: '2.2'
services:
  mssql:
    build: ./mssql
    container_name: sql1
    restart: always
    ports:
      - 1433:1433
    networks:
      - elastic
    volumes:
      - mssql:/var/opt/mssql
  elasticsearch:
    image: elasticsearch:7.13.2
    container_name: elasticsearch
    environment:
      - cluster.name=docker-cluster
      - bootstrap.memory_lock=true
      - discovery.type=single-node
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - data:/usr/share/elasticsearch/data
    restart: always
    depends_on:
      - mssql
    ports:
      - 9200:9200
    networks:
      - elastic
  logstash:
    build: ./logstash
    environment:
      LS_JAVA_OPTS: "-Xmx256m -Xms256m"
    ports:
      - 5001:5001
    container_name: logstash
    restart: always
    networks:
      - elastic
    depends_on:
      - elasticsearch
    volumes:
      - ./logstash/config:/usr/share/logstash/pipeline
  kibana:
    image: docker.elastic.co/kibana/kibana:7.13.2
    environment:
      SERVER_HOST: 0.0.0.0
      ELASTICSEARCH_HOSTS: http://elasticsearch:9200
    container_name: kibana
    depends_on:
      - elasticsearch
    ports:
      - 5601:5601
    networks:
      - elastic

volumes:
  data:
    driver: local
  mssql:
    driver: local

networks:
  elastic:
    driver: bridge


Building and Running the Project

Build the Docker images and deploy the containers:

docker-compose up

Run the ASP.NET Core project:

dotnet run

Verification

Verify that the Docker containers are up and running as expected.

You can also test the API to ensure Elasticsearch is functioning correctly.

Check the Logstash logs to confirm the dataset is being refreshed on schedule.
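A quick way to confirm the pipeline end to end is to ask Elasticsearch how many documents the index holds. A small sketch, assuming the port and index name from the configuration above:

using System;
using System.Net.Http;

// The _count endpoint reports how many documents Logstash has pushed into inx_product.
using var http = new HttpClient();
var count = await http.GetStringAsync("http://localhost:9200/inx_product/_count");
Console.WriteLine(count); // prints the document count once the Production.Product table has synced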

You can refer to the entire GitHub code here.

For more such technical insights and development trends, follow Simform Engineering.

