Sagar Gangurde in Data Engineering | How to use Elasticsearch as Vector Database | Let’s set up a single-node Elasticsearch cluster on a local machine. | Mar 19
Sagar Gangurde in Data Engineering | Adding GCS support to EMR | Ever wondered how we can read from or write to Google Cloud Storage (GCS) from AWS EMR? | Mar 12, 2023
Sagar Gangurde in Data Engineering | Loading large size compressed files in BigQuery (BQ) | BQ load jobs have a size limit of 4 GB for a compressed CSV file. If we try to load compressed CSV files larger than 4 GB into BQ, we get the limit… | Mar 9, 2023
Sagar Gangurde in Data Engineering | Containerisation of Java applications | Containerisation is the process of creating a bundle of application code and all its dependencies. | Sep 15, 2022
Sagar Gangurde in Data Engineering | Gitlab CI/CD using specific runners on AWS EC2 | GitLab CI/CD can automatically build, test and deploy our applications. Runners are processes that pick up and execute CI/CD jobs for… | Sep 15, 2022
Sagar Gangurde in Data Engineering | Manage your secrets inside containers using AWS SSM | Let’s say we are working on a Python application that runs inside a Docker container and needs access to a database hosted on AWS… | Sep 11, 2022
Sagar Gangurde in Data Engineering | Handling different file formats with Pyspark | Spark supports many file formats. In this article we are going to cover the following file formats: | Mar 14, 2022