Building a Serverless Data Lake on AWS with Terraform

No servers, no problems

Sergi Lehkyi
Geek Culture


Thanks to Zetong Li from Unsplash for the photo

It’s been a long time since my last article, but finally I have something to share.

The last two months were real hell for me. A short-term project that was supposed to be easy and quick ended up, as always, living up to the famous programmer’s credo: “We do these things not because they are easy, but because we thought they were going to be easy”.

The project was to build a data lake from scratch, with complete freedom to find the best solution. As it is 2021 and modern problems require modern solutions, a serverless data lake is a hell of a good idea.

Security

As we are dealing with data, our first concern should be security. To create a secure cloud environment, it is recommended to follow the Operational Best Practices for CIS AWS Foundations Benchmark v1.3 Level 2 and implement all of them (the perfect scenario). For a one-time evaluation of the implementation, the Prowler score is a good indicator (Prowler is a security tool for AWS security best practices assessments, audits, incident response, continuous monitoring, hardening and forensics readiness). Prowler is a really great open source tool and I recommend checking your cloud environment with it (or you can set up AWS…


Sergi Lehkyi
Geek Culture

Data and Cloud Developer, love technology in general, maybe too much humor and never too serious, based in amazing Barcelona