ETL (Extract, Transform, Load) is a process that extracts data from a source such as a file or a database, transforms it, and then loads it into a target database or files.
Designing an ETL solution is not an easy task; it has many minor nuisances that we need to take care of. The idea here is to come up with a serverless ETL solution where we pay only for the resources we use during processing.
Our requirement was to load data from thousands of different CSV and XML files and persist it into database tables on a monthly/quarterly schedule, along with the…
Blockchain is being adopted widely in different domains, and the reason is that it solves the problems of mistrust, control, and authenticity by providing…
It is being used to solve these problems in a wide variety of use cases like healthcare, supply chain management, insurance, payments, review systems, and much more. …
Logging, alerting, and monitoring are key components of the software life cycle. Having an effective alerting and monitoring tool improves system performance and productivity and helps you reduce (or even eliminate) downtime. It can help you identify and fix issues faster, minimizing the impact on your customers and business. It also improves visibility into your system and helps you plan time and resources for future projects better.
One of my blog posts about streaming CloudWatch logs to ELK has received more than 20k views, and many people have reached out to me asking various…
To compete with the best in the world, every organization aims at improving its developer productivity and wants its team to achieve high quality and high velocity of work.
As an organization, we too assure our clients of high quality and high velocity. Data empowers us to identify gaps and arrive at root causes so that we can improve our future performance relative to our past performance.
Therefore, we wanted to come up with a set of metrics that can be monitored to measure our productivity and identify the areas that can be improved.
How do we come up with such…
When I started my career, I was asked to write test cases for my code. Like everyone, or at least like most developers, I also thought: why do I need to write test cases if I have already tested my code? As time passed and I gained experience, I got the hang of writing test cases but still had a few questions unanswered.
Fast forward a few years, and I had read a few books like Clean Architecture, Building Microservices, and The DevOps Handbook. One thing they had in common was that they all spoke about writing test cases.
At Nggawe Nirman, I had time to set up…
While trying to set up Nginx as a reverse proxy for gRPC, I had to spend a few hours going through the gRPC and Nginx tutorials to figure out the process and make it work.
With this article, I would like to help anyone set it up quickly.
In this setup, where Nginx acts as a reverse proxy, the client makes a call to Nginx at localhost:1449, which then routes the request to the server running at localhost:1338.
Generate the certificates to establish a secure connection (SSL)
openssl req -newkey rsa:2048 -nodes -keyout server.key …
For one of the projects at Nggawe Nirman, we had multiple services in a microservices architecture, and we were using API Gateway and Lambda for our services.
Our services lived in a monorepo, and deployment was done by automatically zipping the artifacts and pushing them to S3. But as we started using different features of API Gateway, Lambda, and other services, the setup became manual and started taking too much time. That is when we decided to use the Serverless Framework for our deployments.
As our projects were in a monorepo, our project structure looked something like this.
What is Availability?
Availability is the ability of a system to be available for use after a fault occurs.
Availability is the capability of a system to repair failures so that the cumulative service outage period does not exceed a given time.
How do we measure Availability?
Availability = MTBF / (MTBF + MTTR)
MTBF = Mean time between failures
MTTR = Mean time to recover
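As a quick illustration of the formula above, here is a minimal Python sketch. The function name `availability` and the sample numbers are assumptions made for this example only; they do not come from a real system.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Return availability as a fraction between 0 and 1.

    mtbf_hours: mean time between failures
    mttr_hours: mean time to recover
    """
    if mtbf_hours < 0 or mttr_hours < 0:
        raise ValueError("MTBF and MTTR must be non-negative")
    total = mtbf_hours + mttr_hours
    if total == 0:
        raise ValueError("MTBF and MTTR cannot both be zero")
    return mtbf_hours / total


# Hypothetical numbers: one failure every 999 hours on average,
# and one hour to recover from each failure.
a = availability(mtbf_hours=999, mttr_hours=1)
print(f"{a:.3%}")  # prints 99.900%
```

Note that the ratio improves either by failing less often (a higher MTBF) or by recovering faster (a lower MTTR).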
What is a fault in a highly available system?
In any highly available system, a fault can be defined as a crash, incorrect timing, an incorrect response, or an omission.
Ping: a sync/async message pair exchanged between nodes; detects if…
Tracing a request from start to end is critical for diagnosing issues quickly. This becomes difficult in a highly concurrent system because logs are interspersed with each other. It becomes even harder in a microservices architecture, where a request may travel through multiple services before completing.
In order to diagnose issues in such an environment, we need a unique identifier to tie together all the logs across the different services. Casualty helps you do exactly that. Through Casualty, requests can be stitched together from start to end across multiple services. You …
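The general idea of a unique identifier that ties logs together can be sketched with Python's standard library. This is not Casualty's API. The header name `X-Correlation-ID`, the logger name `orders`, and the `handle_request` function are all assumptions made for illustration.

```python
import contextvars
import logging
import uuid

# Holds the correlation ID of the request currently being handled.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class CorrelationIdFilter(logging.Filter):
    """Copy the current correlation ID onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.correlation_id = correlation_id.get()
        return True


def build_logger(name: str) -> logging.Logger:
    """Create a logger whose output lines start with the correlation ID."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter("%(correlation_id)s %(name)s %(message)s")
    )
    handler.addFilter(CorrelationIdFilter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


log = build_logger("orders")  # hypothetical service name


def handle_request(headers: dict) -> dict:
    """Reuse the caller's ID if present, otherwise start a new one."""
    cid = headers.get("X-Correlation-ID") or str(uuid.uuid4())
    token = correlation_id.set(cid)
    try:
        log.info("request received")
        # Forward the same ID so the next service logs under it too.
        return {"X-Correlation-ID": cid}
    finally:
        # Restore the previous value so IDs never leak between requests.
        correlation_id.reset(token)


outgoing = handle_request({"X-Correlation-ID": "abc-123"})
```

Every service that receives the forwarded header logs under the same ID, so searching the aggregated logs for that one value returns the whole request path.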
To improve build performance, you can exclude files and directories by adding a .dockerignore file.
Each instruction in the Dockerfile adds an extra layer to the Docker image (in current Docker versions, only RUN, COPY, and ADD create layers that increase the image size).
The number of instructions and layers should be kept to a minimum as this ultimately affects build performance and time.
Docker creates a layer on top of the existing layer for each instruction in the Dockerfile and caches it. When you re-run the docker build command, it searches for the layer in the cache; if it is there, it uses the cached layer, otherwise the cache is invalidated and all the layers after that…
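The caching behaviour described above can be modelled roughly in a few lines of Python. This is a simplified toy model, not Docker's implementation: it keys each layer only on its parent layer and the instruction text, whereas the real builder also checksums the files used by COPY and ADD. The `build` function and the sample instructions are assumptions made for illustration.

```python
import hashlib


def build(instructions: list[str], cache: set[str]) -> list[tuple[str, str]]:
    """Simulate one build and report whether each layer was cached or built.

    `cache` is the set of layer keys left behind by earlier builds.
    """
    results = []
    parent_key = ""
    for instruction in instructions:
        # A layer's key depends on its parent's key, so changing one
        # instruction changes the key of every layer after it.
        raw = (parent_key + "\n" + instruction).encode()
        key = hashlib.sha256(raw).hexdigest()
        if key in cache:
            results.append((instruction, "cached"))
        else:
            cache.add(key)
            results.append((instruction, "built"))
        parent_key = key
    return results


cache: set[str] = set()
first = build(
    ["FROM python:3.12", "COPY requirements.txt .",
     "RUN pip install -r requirements.txt", "COPY . ."],
    cache,
)
# Second build: only the third instruction changed.
second = build(
    ["FROM python:3.12", "COPY requirements.txt .",
     "RUN pip install --no-cache-dir -r requirements.txt", "COPY . ."],
    cache,
)
for instruction, status in second:
    print(status, instruction)
```

In the second build, the first two layers are reused, while the changed instruction and the unchanged `COPY . .` after it are both rebuilt. This is why instructions that change rarely are usually placed near the top of a Dockerfile.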