Data Analytics at NSoft

Filip Ćorluka
Published in NSoft
May 29, 2019

Introducing NSoft’s BAS and NDS

The formation of the BAS (Business Analytics Specialists) team began two years ago, when a small BI team arrived at NSoft to face new challenges in a completely new industry.

Given our business background and experience in data warehouse (DWH) development and maintenance, reporting, and data provisioning, as well as almost a decade of prior cooperation and teamwork, we were always well aware of the importance of data and its impact on business performance. What we encountered at NSoft only solidified our view that data provisioning alone is a thing of the past: today it is performed by automated systems without human interaction. That was the starting point of something that evolved into a fully internally developed Data Analytics Solution, backed by the BAS team and the data engineers of the NDS (NSoft Data Service) team. Implemented with best practices in mind, we currently operate a DWH capable of near real-time processing, ingesting millions of data points per second that feed into our analytics layer. This left us with a clear new goal to strive for: Data Analytics.

What is a Data Analyst?

A data analyst, by definition, is someone who can successfully translate large quantities and varieties of data into plain language understandable to others. But the role of the data analyst isn’t strictly tied to data “translation” and understanding; it also involves generating clear, advisory actions that have an immediate business impact. Within the context of NSoft, after an initial observation of the entire business process, we soon realized what would be required of a data analyst and how one could contribute: familiarization with vast quantities of data from numerous and varied sources, then combining, preparing, and finally analyzing it to extract its value. What looks like a list of cursory terms is actually an enormous effort spanning the preparation, analysis, development, and use of a system that enables BAS to govern valuable data and generate value for NSoft and its partners alike.

Importance of Turnkey

Striving towards common growth is the simplest definition of what Turnkey is. Coming to NSoft and adopting the Turnkey concept made us completely change our mindset regarding the use of product and user-generated data. For NSoft, turnkey product solutions represent a symbiosis with our partners, with the common goal of mutual growth. Taking all of this into consideration, we have realized over time that our mission statement has shifted: “We will not be providing data, we will provide the value!”

Data Analytics as a product aid

One of the ways data analytics provides value is by improving the existing product portfolio wherever possible. By providing statistical analysis, testing product logic, and identifying user behavior as well as positive user feedback, we aim to turn our findings and insights into improvements to each of our products.

Based on our analyses and observations, numerous improvements, tweaks, and upgrades have been made to our products, which has led to greater product recognizability and far better player retention.

All products from NSoft’s rich portfolio receive the same treatment from our Analytics team in order to provide the best possible product to our partners and the best gaming experience to our players.

All of that combined adds up to a sentence that best describes our aspirations:

  • Through data analysis, we aim to improve existing products and help create new ones, to provide our partners with better business results, give players the best possible playing experience, and maintain the highest standards our company has set and insists on.

Data Analytics as a partner aid

Everything previously mentioned has resulted in the development of workflows capable of consuming and digesting this enormous amount of information, which enables us to provide our partners with up-to-the-moment information on anything of interest.

Additionally, by identifying the reasons for user churn, we can provide our partners with actionable insights. An actionable insight is more than information with context; it is a prescription for a clear course of action. The final goal of those actions is an increase in our partners’ revenue, as well as greater confidence in the information we provide.

Our team has streamlined dashboard generation into an assortment of standardized report packages offered to all of our partners and to in-house teams and individuals alike.

Today, NSoft’s BAS team aims to improve existing products and help create new ones, to provide our partners with better business results and give users the best possible playing experience.

The technology that made it all possible

The technology stack was designed to be robust, easily scalable, adaptive, and highly available, and building it required an enormous effort.

In our DWH implementation, we’ve chosen a streaming solution instead of ETL processes. As a starting point, we should mention the in-house operational data stores used as streaming sources:

Among NoSQL databases we currently have MongoDB as a source, and among SQL databases we use MySQL and PostgreSQL. Additionally, due to our partners’ specifics and requirements, we support various CSV, Google Sheets, SQLite, and MS Excel data enrichments, which can be blended in seamlessly at the source or anywhere in the streaming pipeline.
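As a rough illustration of what such a blend can look like, here is a minimal Python sketch that joins a batch of streamed records with a partner-supplied CSV enrichment. The record shape, column names, and CSV contents are placeholders, not our actual schema:

```python
import io
import pandas as pd

# A batch of records as they might arrive from an operational store
records = pd.DataFrame([
    {"ticket_id": 1, "shop_id": "A1", "payin": 10.0},
    {"ticket_id": 2, "shop_id": "B2", "payin": 25.0},
])

# A partner-supplied CSV enrichment (inlined here so the sketch runs as-is;
# in practice this could be a file, a Google Sheet, an SQLite table, etc.)
shops_csv = io.StringIO("shop_id,region,currency\nA1,EU,EUR\nB2,LATAM,USD\n")
shops = pd.read_csv(shops_csv)

# Blend the enrichment into the batch before it moves downstream
enriched = records.merge(shops, on="shop_id", how="left")
print(enriched)
```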

For data streaming, Apache Kafka is used as the backbone, accompanied by ZooKeeper, Kafka Connect, Kafka Streams, Kafka Schema Registry (as we use Avro as our primary streaming format), and Confluent KSQL.
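For illustration, a minimal sketch of a consumer reading Avro-encoded events from such a setup could look like the following, using Confluent’s Python client together with the Schema Registry. The broker and registry addresses, topic name, and group id are placeholders:

```python
from confluent_kafka import DeserializingConsumer
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroDeserializer

# Schema Registry client; the deserializer fetches writer schemas from it
registry = SchemaRegistryClient({"url": "http://localhost:8081"})

consumer = DeserializingConsumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "analytics-sketch",
    "auto.offset.reset": "earliest",
    "value.deserializer": AvroDeserializer(registry),
})
consumer.subscribe(["tickets"])  # placeholder topic name

while True:
    msg = consumer.poll(1.0)  # wait up to one second for an event
    if msg is None:
        continue
    print(msg.value())  # the Avro payload, decoded into a Python dict
```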

Throughout the streaming pipeline, data is enriched and transformed to ensure the compliance and quality of the resulting data and to minimize data-cleansing overhead.
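A sketch of one such in-pipeline transformation step might look like this: raw field names and types are coerced into a consistent shape before the event reaches the DWH loader. The event fields and target shape are purely illustrative:

```python
from datetime import datetime, timezone

def transform(raw: dict) -> dict:
    """Coerce a raw event into the (illustrative) form a DWH loader expects."""
    return {
        "ticket_id": int(raw["ticketId"]),                 # enforce integer ids
        "payin": round(float(raw.get("payin", 0)), 2),     # normalize amounts
        "created_at": datetime.fromtimestamp(              # epoch millis -> ISO 8601 UTC
            raw["createdAt"] / 1000, tz=timezone.utc
        ).isoformat(),
    }

print(transform({"ticketId": "42", "payin": "9.99", "createdAt": 1559126400000}))
```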

Due to the high level of data processing in the pipeline, we’ve also decided to implement a caching layer using Redis.
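A minimal read-through caching sketch with Redis could look like this; the host, key layout, TTL, and the load_shop_from_source helper are all hypothetical stand-ins:

```python
import json
import redis

cache = redis.Redis(host="localhost", port=6379)

def load_shop_from_source(shop_id: str) -> dict:
    # Hypothetical slow lookup against an operational store
    return {"shop_id": shop_id, "region": "EU"}

def get_shop_profile(shop_id: str) -> dict:
    key = f"shop:{shop_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)               # cache hit: skip the slow path
    profile = load_shop_from_source(shop_id)    # cache miss: fetch from source
    cache.setex(key, 300, json.dumps(profile))  # keep it for five minutes
    return profile
```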

For the resulting DWH solution we’ve opted for Vertica, a leading enterprise SQL column-store database, as our long-term store and analytics back end, and for Elasticsearch for our near real-time data insights.
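To illustrate the analytics side, a minimal sketch of querying such a Vertica back end from Python with the vertica-python client might look like this; the connection details and the table and column names are placeholders:

```python
import vertica_python

conn_info = {
    "host": "vertica.example.internal",  # placeholder host
    "port": 5433,
    "user": "analyst",
    "password": "...",
    "database": "dwh",
}

with vertica_python.connect(**conn_info) as conn:
    cur = conn.cursor()
    # Illustrative aggregate over a hypothetical fact table
    cur.execute(
        "SELECT shop_id, SUM(payin) AS total_payin "
        "FROM fact_tickets GROUP BY shop_id "
        "ORDER BY total_payin DESC LIMIT 10"
    )
    for row in cur.fetchall():
        print(row)
```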

Tableau is our primary information presentation driver, coupled with Superset, Metabase, and Google Data Studio. Tableau was versatile enough to meet all the requirements we had for an information distribution platform. Grafana and Kibana are our primary choices for visualizing near real-time data.

The programming languages in use are Java as the primary and Python as the secondary, but, as usually happens, you need to expand your palette to deliver a good product, so Bash, NodeJS, and PHP are regular visitors to our team.

Monitoring needs are effectively covered by Sensu for the systems part and Prometheus for service monitoring, ranging from availability checks to performance metrics.
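As an example of the Prometheus side, a minimal Python sketch exposing service metrics for scraping could look like this; the metric names, port, and simulated workload are illustrative:

```python
import random
import time
from prometheus_client import Counter, Histogram, start_http_server

EVENTS = Counter("pipeline_events_total", "Events processed by the pipeline")
LATENCY = Histogram("pipeline_process_seconds", "Per-event processing time")

start_http_server(8000)  # metrics served at http://localhost:8000/metrics

while True:
    with LATENCY.time():                    # record how long one event takes
        time.sleep(random.random() / 100)   # stand-in for real processing work
    EVENTS.inc()                            # count the processed event
```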

Most of our stack is powered by Docker, and given the stack’s size it was impossible to keep things running smoothly without an orchestration solution, so we decided to use Kubernetes, with Rancher on top for easier management.

The current CI/CD pipeline is a custom implementation using Helm-based Jenkins and Kubernetes publishing. It’s also worth mentioning that our K8s setup is extended with the Istio service mesh, MetalLB, fluentd, Rook, and many other improvements over a standard Kubernetes setup.

The entire setup currently runs on bare-metal dedicated servers and is horizontally and vertically scalable, as well as highly available. Configuring such a stack has led to high proficiency in our DevOps department and presented us with some interesting challenges.

If you wish to be a part of our company and learn more about data science, find out more about us and apply for a job.
