Data Engineering Internship at Lacework

Shivananda D
Aug 17, 2023

How could I not write a reflection article on my first job in the United States? It was a dream come true to work for a company located in Silicon Valley and experience corporate life all over again. In this blog post, I want to share the story of my Summer ’23 internship at Lacework, a data-driven cloud security company. I thoroughly enjoyed working at a growing company. Reflecting on the journey, I’ll talk about the overall working experience, my internship project (including the data engineering challenges I faced and what I achieved), and the invaluable lessons that I’ll carry forward from this internship.

Source: Company’s website

Overall Experience

Having previously worked at a large enterprise, I was glad to get the opportunity to work at Lacework, which belongs to the growing startup space. Here, I observed that collaboration is at the heart of everything, and this environment made it easy for me to work alongside colleagues from different teams during my internship. I also observed an emphasis on work-life balance, which I enjoyed the most. There were also plenty of benefits for interns in terms of employee well-being :)

The internship program was very well-structured and provided a solid framework for learning and growth. Within my data engineering team, I was fortunate to have a very supportive manager and mentor. Their guidance and feedback played a pivotal role in keeping me on track to achieve the desired objectives.

As Lacework is still in its growth phase, it has a relatively small employee count, and this extends to the data engineering team as well. This setup has two advantages. The first is that any meaningful contribution you make gains significant visibility across the organization. The second is that the data engineering team has a dual role: building data products while maintaining the data platform. This structure offers a holistic understanding of data engineering processes, from the foundational infrastructure to the creation of analytical solutions.

The Internship Project

Lacework offers cloud security solutions, all conveniently available through SaaS products. Customers can interact with the Lacework platform in multiple ways, and my project’s goal was to build core datasets that would help analyze why and how customers use the platform. The datasets would help in building insights on customer adoption and feature usage, ultimately serving as a guide for decisions on future product improvements. However, the project had its challenges, and these accurately represent the complexities that a data engineer deals with apart from the technical task of building data pipelines.

First, grasping the domain knowledge was paramount. This involved comprehending the Lacework platform’s architecture and how customers practically use and engage with it. Next, an observability tool was used to keep track of customer usage logs, and I had to learn that tool as well during the initial project phase. Investigating the tool revealed more challenges, such as limited documentation on source datasets, missing key attributes, and additional formatting needed to retrieve meaningful information from source datasets. Addressing these challenges required a lot of collaboration with the platform engineering (frontend and backend) and data science teams, to ensure that all the essential data was being captured at the right granularity for building core datasets.

Another challenge was sourcing data from the observability tool. The tool’s data query API had specific constraints, including the number of API calls allowed per minute and the amount of data that could be processed in a single API call. These limitations had implications for the project’s scalability. I tackled this programmatically, through the dynamic generation of API calls and advanced data querying techniques provided by the observability tool itself.
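To make the idea of dynamically generated API calls more concrete, here is a minimal sketch in Python. It is not the code from my project: the observability tool is not named here, so `fetch_window` is a hypothetical stand-in for whatever function queries the tool for one time range, and the window size and fixed spacing between calls are assumptions chosen for illustration. The sketch splits a long time range into smaller windows, so that each call stays under a per-call data limit, and spaces the calls out to respect a calls-per-minute limit.

```python
import time
from datetime import datetime, timedelta
from typing import Callable, Iterator, List, Tuple

Window = Tuple[datetime, datetime]


def split_into_windows(start: datetime, end: datetime, step: timedelta) -> Iterator[Window]:
    """Yield consecutive half-open [window_start, window_end) ranges covering [start, end)."""
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    cursor = start
    while cursor < end:
        window_end = min(cursor + step, end)  # the last window may be shorter
        yield cursor, window_end
        cursor = window_end


def fetch_all(
    start: datetime,
    end: datetime,
    step: timedelta,
    fetch_window: Callable[[datetime, datetime], List[dict]],  # hypothetical API wrapper
    max_calls_per_minute: int,
    sleep: Callable[[float], None] = time.sleep,
) -> List[dict]:
    """Call fetch_window once per window, pausing between calls to respect the rate limit."""
    min_interval = 60.0 / max_calls_per_minute  # seconds to wait between calls
    rows: List[dict] = []
    for i, (w_start, w_end) in enumerate(split_into_windows(start, end, step)):
        if i > 0:
            sleep(min_interval)  # simple fixed spacing, assumed for illustration
        rows.extend(fetch_window(w_start, w_end))
    return rows
```

A smaller `step` means less data per call but more calls overall, so in practice the window size has to be tuned against both limits at once.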

Overall, the project was completed through a well-thought-out process consisting of five phases: Requirements Analysis and Data Discovery, Data Modeling, Pipeline Design and Implementation, Testing, and Documentation. An Airflow pipeline was built and deployed to production in a cloud-native AWS environment to automatically refresh the dataset, keeping it up-to-date for analysis. Moreover, data quality checks were incorporated to validate data integrity and reliability, ensuring data completeness and accuracy for the end users.
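As an illustration of what data quality checks of this kind can look like, here is a small sketch in plain Python. It is not the actual implementation from the pipeline: the column names (`account_id`, `feature`, `event_date`) are hypothetical, and the two checks shown, missing values in required columns and duplicate keys, are just common examples of completeness and integrity checks.

```python
from typing import Iterable, List, Sequence


def check_completeness(rows: Sequence[dict], required: Iterable[str]) -> List[str]:
    """Return one message per required column that has missing (None) values."""
    failures = []
    for col in required:
        missing = sum(1 for r in rows if r.get(col) is None)
        if missing:
            failures.append(f"{col}: {missing} missing value(s)")
    return failures


def check_unique_key(rows: Sequence[dict], key: Sequence[str]) -> List[str]:
    """Return a message if any combination of the key columns appears more than once."""
    seen = set()
    duplicates = 0
    for r in rows:
        k = tuple(r.get(c) for c in key)
        if k in seen:
            duplicates += 1
        seen.add(k)
    return [f"key {list(key)}: {duplicates} duplicate row(s)"] if duplicates else []


def run_checks(rows: Sequence[dict], required: Iterable[str], key: Sequence[str]) -> None:
    """Raise ValueError listing every failed check; return None if all checks pass."""
    failures = check_completeness(rows, required) + check_unique_key(rows, key)
    if failures:
        raise ValueError("data quality checks failed: " + "; ".join(failures))


# Hypothetical sample rows for a feature-usage dataset.
sample = [
    {"account_id": 1, "feature": "alerts", "event_date": "2023-07-01"},
    {"account_id": 2, "feature": "reports", "event_date": "2023-07-01"},
]
run_checks(sample, required=["account_id", "feature"], key=["account_id", "feature", "event_date"])
```

Raising an exception on failure is a deliberate choice: inside a scheduled pipeline, a failed task stops the refresh instead of silently publishing bad data.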

Key Takeaways

My internship provided immense learning opportunities, allowing me to enhance my data engineering skills both technically and professionally. From a technical standpoint, I was able to improve my data modeling skills and gain practical experience with tools like Airflow and Snowflake within an AWS cloud-native environment.

Because an internship is constrained by time, I learned that prioritizing tasks is crucial for focusing on the things that matter and delivering the key objectives with the most impact. Moreover, I was reminded that data engineering is not just about technical work. Skills that help us deal with ambiguity in requirements and with data challenges are equally essential to succeeding as a data engineer.

Additionally, I recognized that the nature of data engineering work can vary based on the team’s dynamics. Unlike my previous experience as a data engineer, in this internship, collaboration and proficiency in data discovery proved to be vital for me. This emphasized the role of teamwork and adaptability in achieving data engineering goals.

This brings us to the end of this article. From dealing with various data engineering challenges to experiencing a vibrant work culture, this internship has been a profound learning experience, helping me develop new skills and gain a deeper understanding of data engineering.

Opinions are my own and the insights shared in this blog post are based on my personal experiences and observations.
