ETL (Extract, Transform, Load) is a process that extracts data from a source such as a file or database, transforms it, and then loads it into a target database or set of files.
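
To make the three stages concrete, here is a minimal sketch using only the Python standard library. The sample CSV text, the `payments` table, and the function names are assumptions made for illustration; they are not taken from the solution described in the article.

```python
import csv
import io
import sqlite3

# Hypothetical sample input; a real source could be a file or another database.
RAW_CSV = "name,amount\nalice, 10.5\nbob,3\n"


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: tidy up names and convert amounts to floats."""
    return [(row["name"].strip().title(), float(row["amount"])) for row in rows]


def load(records, conn):
    """Load: persist the transformed records into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS payments (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO payments VALUES (?, ?)", records)
    conn.commit()


def run_etl(text, conn):
    """Run the three stages in order and return what ended up in the table."""
    load(transform(extract(text)), conn)
    return conn.execute("SELECT name, amount FROM payments ORDER BY name").fetchall()


# An in-memory SQLite database stands in for the target database.
conn = sqlite3.connect(":memory:")
result = run_etl(RAW_CSV, conn)
print(result)
```

Keeping each stage in its own function makes it possible to swap the source or the target without touching the transformation logic.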

Designing an ETL solution is not an easy task; it has many minor nuances that we need to take care of. The idea here is to come up with a serverless ETL solution where we pay only for the resources we use during processing.

Our requirement was to load data from thousands of different CSV and XML files and persist it into database tables on a monthly/quarterly schedule, along with the…


Blockchain is being widely adopted across different domains because it solves the problems of mistrust, control, and authenticity by providing:

  • Trust
  • Decentralization
  • Immutability

It is being used to solve these problems in a wide variety of use cases such as healthcare, supply chain management, insurance, payments, review systems, and many more. …


Logging, alerting, and monitoring are key components of the software life cycle. An effective alerting and monitoring tool improves system performance and productivity and helps you reduce (or even eliminate) downtime. It can help you identify and fix issues faster, minimizing the impact on your customers and business. It also improves visibility into your system and helps you plan time and resources for future projects better.

One of my blog posts about streaming CloudWatch logs to ELK has received more than 20k views, and many people have reached out to me asking various…


To compete with the best in the world, every organization aims to improve its developer productivity and wants its teams to achieve high quality and high velocity of work.

As an organization, we too assure our clients of high quality and high velocity. Data empowers us to identify gaps and arrive at root causes so we can improve our future performance relative to our past performance.

Therefore, we wanted to come up with a set of metrics that can be monitored to measure our productivity and identify the areas that can be improved.

How do we come up with such…


When I started my career, I was asked to write test cases for my code. Like most developers, I wondered why I needed to write test cases if I had already tested my code. With time and experience, I got the hang of writing test cases but still had a few questions unanswered.

Fast forward a few years, and I had read a few books such as Clean Architecture, Building Microservices, and The DevOps Handbook. One thing they had in common was that they all spoke about writing test cases.

At Nggawe Nirman, I had time to set up…


While trying to set up Nginx as a reverse proxy with gRPC, I had to spend a few hours going through the gRPC and Nginx tutorials to figure out the process and make it work.

With this article, I would like to help anyone set it up quickly.

Figure: Nginx acting as a reverse proxy (Source: Nginx)

In this setup, where Nginx acts as a reverse proxy, the client makes a call to Nginx at localhost:1449, which then routes the request to the server running at localhost:1338.

Generate the certificates to establish a secure connection (SSL)

openssl req -newkey rsa:2048 -nodes -keyout server.key …


For one of the projects at Nggawe Nirman, we had multiple services in a microservices architecture, and we were using API Gateway and Lambda for our services.

Our services lived in a monorepo, and deployment was done by automatically zipping the artifacts and pushing them to S3. But as we started using different features of API Gateway, Lambda, and other services, the setup became manual and started taking too much time. That is when we decided to use the Serverless Framework for our deployments.

As our projects were in a monorepo, our project structure looked something like this.


What is Availability?

Availability is the ability of a system to be available for use after a fault occurs.

or

Availability is the capability of a system to repair failures so that the cumulative service outage period does not exceed a given time.

How do we measure Availability?

Availability = MTBF / (MTBF + MTTR)

MTBF = Mean time between failures

MTTR = Mean time to recover
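
As a worked example of the formula, the sketch below computes availability and the downtime it implies over a year. The input figures (999 hours between failures, 1 hour to recover) and the function names are made up for illustration.

```python
def availability(mtbf_hours, mttr_hours):
    """Availability = MTBF / (MTBF + MTTR), returned as a fraction of 1."""
    if mtbf_hours < 0 or mttr_hours < 0 or mtbf_hours + mttr_hours == 0:
        raise ValueError("MTBF and MTTR must be non-negative and not both zero")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def yearly_downtime_hours(avail, hours_per_year=8760):
    """Expected downtime per year implied by an availability figure."""
    return (1 - avail) * hours_per_year


# Hypothetical system: fails on average every 999 hours, takes 1 hour to recover.
a = availability(999.0, 1.0)
downtime = yearly_downtime_hours(a)
print(f"availability: {a:.3%}, downtime per year: {downtime:.2f} h")
```

With these inputs the availability is 99.9%, which corresponds to roughly 8.76 hours of downtime per year. Halving MTTR improves availability just as raising MTBF does, which is why fast recovery is treated as a tactic in its own right.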

What is a fault in a highly available system?

In any highly available system, a fault can be defined as a crash, incorrect timing, an incorrect response, or an omission.

Tactics that can be applied to achieve a highly available system:

  • Fault detection
  • Fault prevention
  • Recovery from Fault

Fault Detection

Ping: a sync/async message pair exchanged between nodes, detects if…


Tracing a request from start to end is critical for diagnosing issues quickly. This becomes difficult in a highly concurrent system because logs from different requests are interleaved. It becomes even harder in a microservices architecture, where a request may travel through multiple services before completing.

To diagnose issues in such an environment, we need a unique identifier that ties together all the logs across different services. Casualty helps you do exactly that. Through Casualty, requests can be stitched together from start to end across multiple services. You …
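
The general idea of tagging every log line with a per-request identifier can be sketched with the Python standard library alone. This is not Casualty's API; the logger name, the `handle_request` function, and the log format are assumptions chosen for illustration.

```python
import contextvars
import io
import logging
import uuid

# Holds the identifier of the request currently being handled.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class CorrelationIdFilter(logging.Filter):
    """Copy the current correlation id onto every log record."""

    def filter(self, record):
        record.correlation_id = correlation_id.get()
        return True


def build_logger(stream):
    """Create a logger whose output lines start with the correlation id."""
    logger = logging.getLogger("orders")  # hypothetical service name
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(correlation_id)s %(message)s"))
    handler.addFilter(CorrelationIdFilter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def handle_request(logger, incoming_id=None):
    """Reuse the caller's id if one was passed along, otherwise create one."""
    token = correlation_id.set(incoming_id or uuid.uuid4().hex)
    try:
        logger.info("request received")
        logger.info("request completed")
        return correlation_id.get()
    finally:
        # Restore the previous value so ids never leak between requests.
        correlation_id.reset(token)


stream = io.StringIO()
log = build_logger(stream)
used_id = handle_request(log, incoming_id="abc123")
lines = stream.getvalue().splitlines()
print(lines)
```

In a multi-service setup, the same identifier would be forwarded to downstream services (for example in a request header) so that each service logs it too.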


Use a .dockerignore file

To improve build performance, you can exclude files and directories from the build context by adding a .dockerignore file.

Minimize the number of layers / Consolidate instructions

Each instruction in the Dockerfile can add an extra layer to the Docker image (in current Docker versions, only RUN, COPY, and ADD create layers that increase image size).
Keep the number of instructions and layers to a minimum, as this ultimately affects build performance and time.

Use COPY command instead of ADD

Avoid installing unnecessary packages

Take advantage of docker cache to reduce the build time

Docker creates a layer on top of the existing layer for each instruction in the Dockerfile and caches it. When you re-run the docker build command, it searches for the layer in the cache. If it is there, it uses the cached layer; otherwise the cache is invalidated and all the layers after that…
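
The caching behaviour described above can be illustrated with a small simulation. This is a simplified model, not Docker's actual implementation: each build step is represented as an (instruction, input fingerprint) pair, and the instructions and fingerprints are hypothetical.

```python
def plan_build(steps, cached_steps):
    """Mark each step 'cached' or 'rebuilt'.

    The first step that differs from the previous build invalidates the
    cache for itself and for every step after it.
    """
    plan = []
    cache_valid = True
    for index, step in enumerate(steps):
        hit = cache_valid and index < len(cached_steps) and cached_steps[index] == step
        if not hit:
            cache_valid = False
        plan.append((step[0], "cached" if hit else "rebuilt"))
    return plan


# Dependencies are installed before the source is copied (hypothetical Dockerfile).
good_before = [
    ("FROM python:3.12-slim", ""),
    ("COPY requirements.txt .", "req-v1"),
    ("RUN pip install -r requirements.txt", ""),
    ("COPY . .", "src-v1"),
]
good_after = good_before[:3] + [("COPY . .", "src-v2")]  # only source code changed

# Source is copied before dependencies are installed.
bad_before = [
    ("FROM python:3.12-slim", ""),
    ("COPY . .", "src-v1"),
    ("RUN pip install -r requirements.txt", ""),
]
bad_after = [bad_before[0], ("COPY . .", "src-v2"), bad_before[2]]

good_plan = plan_build(good_after, good_before)
bad_plan = plan_build(bad_after, bad_before)
for name, plan in (("good order", good_plan), ("bad order", bad_plan)):
    print(name, [status for _, status in plan])
```

In the first ordering a source change rebuilds only the final step, while in the second it also forces the dependency installation to run again. Placing the instructions that change least often near the top keeps more of the cache usable.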

Sohit kumar

Swiss Knife, solves problems, building tech platforms. Follow me for interesting tech articles. https://codeshots.in
