Siva Chaitanya
Manageable Airflow DAGs for Google Composer
Workflow Management Systems have been around for quite some time now. Apache Airflow is one of the most used and the reasons for its…
Dec 13, 2019

Siva Chaitanya
Numba — Making Numpy 10x Faster
Python is slow, as it is interpreted at run-time and not compiled to native code. People have tried to get a compiler for Python for quite…
Oct 14, 2018

Siva Chaitanya
AWS Lambda — Starting and Stopping RDS Instances (Python)
While the production databases need to be up and running 24x7, we had a few AWS RDS instances that were not used by teams during the night…
Jul 4, 2018

Siva Chaitanya
AWS Lambda — Starting and Stopping EC2 Instances (Python)
Costs, if not monitored, can escalate very quickly when using cloud architecture. We use many EC2 instances for non-production use-cases and…
Jul 3, 2018

Siva Chaitanya
Accessing AWS S3 from PySpark Standalone Cluster
Before you proceed, ensure that you have installed and configured PySpark and Hadoop correctly. To cross-check, you can visit this link…
May 22, 2018

Siva Chaitanya
Install Apache Spark (pyspark) — Standalone mode
While there is a lot of documentation around how to use Spark, I could not find a post that could help me install Apache Spark from…
May 18, 2018