Siva Chaitanya
Manageable Airflow DAGs for Google Composer
Workflow Management Systems have been around for quite some time now. Apache-Airflow is one of the most used and the reasons for its…
5 min read · Dec 13, 2019
Siva Chaitanya
Numba — Making Numpy 10x Faster
Python is slow, as it is interpreted at run-time and not compiled to native code. People have tried to get a compiler for Python for quite…
3 min read · Oct 14, 2018
Siva Chaitanya
AWS Lambda — Starting and Stopping RDS Instances (Python)
While the production databases need to be up and running 24x7, we had a few AWS RDS instances that were not used by teams during the night…
2 min read · Jul 4, 2018
Siva Chaitanya
AWS Lambda — Starting and Stopping EC2 Instances (Python)
Costs, if not monitored, can escalate very quickly when using cloud architecture. We use many EC2 instances for non-production use-cases and…
2 min read · Jul 3, 2018
Siva Chaitanya
Accessing AWS S3 from PySpark Standalone Cluster
Before you proceed, ensure that you have installed and configured PySpark and Hadoop correctly. To cross-check, you can visit this link…
2 min read · May 22, 2018
Siva Chaitanya
Install Apache Spark (pyspark) — Standalone mode
While there is a lot of documentation around how to use Spark, I could not find a post which could help me install Apache Spark from…
2 min read · May 18, 2018