Deleting unwanted EFS files with a Lambda function

The Scenario

A student once wrote some amazing code. It was producing high-quality data, but it was leaving behind unwanted files on the attached EFS access point. The code was packaged into a Docker container image, hosted in ECR, and run as an AWS Lambda function. And we didn't quite have access to the code.

The Solution

Until we can fix the code, our solution is to trigger a new Lambda function that deletes all the unwanted files from EFS. Since our uneditable code runs every hour, we can trigger this new Lambda function to run immediately afterward. Here's a super simple walkthrough in the AWS Console. DevOps gurus can probably do this faster with Terraform or CloudFormation, and if you're one of those people, hit us up with that script!
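The trigger itself is outside the scope of this console walkthrough. One simple option is an hourly EventBridge schedule offset a few minutes behind the other job. Here is a minimal boto3 sketch, assuming the cleanup function is named cleanup-efs (as it will be below) and that a 15-minute offset is enough:

import boto3

events = boto3.client('events')
lambda_client = boto3.client('lambda')

# Assumed schedule: 15 minutes past every hour, shortly after the hourly job runs.
rule = events.put_rule(
    Name='cleanup-efs-hourly',
    ScheduleExpression='cron(15 * * * ? *)'
)

# Let EventBridge invoke the function, then point the rule at it.
function_arn = lambda_client.get_function(FunctionName='cleanup-efs')['Configuration']['FunctionArn']
lambda_client.add_permission(
    FunctionName='cleanup-efs',
    StatementId='eventbridge-cleanup-efs',
    Action='lambda:InvokeFunction',
    Principal='events.amazonaws.com',
    SourceArn=rule['RuleArn']
)
events.put_targets(
    Rule='cleanup-efs-hourly',
    Targets=[{'Id': 'cleanup-efs', 'Arn': function_arn}]
)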

Gather Necessary Information

To start, we'll need some information about our EFS access point to ensure a smooth configuration of our new Lambda function. In the AWS console, look up the following:

  1. The listing of File systems in the Elastic File System service.
  2. The Access point for my EFS file system.
  3. The Availability Zones, Subnet IDs, and Security groups for my EFS file system.
  4. The Security groups section of the AWS VPC service.
  5. The VPC ID associated with our security group IDs.
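If you'd rather collect all of this from a script than by clicking around, here is a minimal boto3 sketch (assuming your credentials and default region are already configured) that prints the same details:

import boto3

efs = boto3.client('efs')
ec2 = boto3.client('ec2')

# Walk every file system, its access points, and its mount targets.
for fs in efs.describe_file_systems()['FileSystems']:
    fs_id = fs['FileSystemId']
    print('File system:', fs_id, fs.get('Name', ''))

    for ap in efs.describe_access_points(FileSystemId=fs_id)['AccessPoints']:
        print('  Access point:', ap['AccessPointId'], ap['AccessPointArn'])

    # Mount targets carry the Availability Zone, subnet, and security groups.
    for mt in efs.describe_mount_targets(FileSystemId=fs_id)['MountTargets']:
        sg_ids = efs.describe_mount_target_security_groups(
            MountTargetId=mt['MountTargetId'])['SecurityGroups']
        print('  Mount target:', mt['AvailabilityZoneName'], mt['SubnetId'], sg_ids)

        # The security groups tell us which VPC the Lambda function will need.
        for sg in ec2.describe_security_groups(GroupIds=sg_ids)['SecurityGroups']:
            print('    VPC:', sg['VpcId'])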

Create a new Lambda function

In the AWS console, go to Lambda and click ‘Create Function’. For the parameters, I entered the following:

  1. Author from scratch
  2. Function name: cleanup-efs
  3. Runtime: Python 3.8
  4. Architecture: x86_64
  5. Execution role: Create a new role with basic Lambda permissions
  6. Click: Create function
Steps 1–4. Creating a new Lambda function
Step 5. Creating a new Lambda function — Create a new IAM role
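For reference, the same function can be created with boto3. This is only a sketch: the role ARN is a placeholder (the console creates the role for you in Step 5), and the code is a stub we'll replace later:

import io
import zipfile
import boto3

lambda_client = boto3.client('lambda')

# Stub handler so the function can be created; we'll paste the real code in later.
stub = b"def lambda_handler(event, context):\n    return {'statusCode': 200}\n"
buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('lambda_function.py', stub)

lambda_client.create_function(
    FunctionName='cleanup-efs',
    Runtime='python3.8',
    Architectures=['x86_64'],
    Handler='lambda_function.lambda_handler',
    # Placeholder: use the ARN of an execution role with basic Lambda permissions.
    Role='arn:aws:iam::123456789012:role/cleanup-efs-role-abc123',
    Code={'ZipFile': buf.getvalue()}
)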

Configure the Lambda function

Before we write the code, let's configure the Lambda function with the appropriate EFS access point, VPC, and IAM privileges to complete the job. Immediately after creating the function, click the Configuration tab.

The configuration tab for your new lambda function.
The role that was created from Step 5. Click this Role name to edit the permissions.

Configure the Lambda’s Execution Role

Clicking the Role name will take you to IAM > Roles where you can begin editing the permissions your Lambda function will need. Here you should already see one attached Permission Policy that starts with “AWSLambdaBasicExecutionRole-”.

Allow our Lambda to mount our EFS access point.

1. From the Role page, click Add permissions, and in the dropdown select Attach policies.

Searching for the AWS managed policy “AWSLambdaVPCAccessExecutionRole”
Selecting the ‘AWSLambdaVPCAccessExecutionRole’. Click ‘Attach policies’ to apply it to our role.
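If you're scripting the role instead, the same managed policy can be attached with boto3. A minimal sketch, with the auto-generated role name assumed (yours will differ):

import boto3

iam = boto3.client('iam')

# Assumed: the role name Lambda generated in Step 5 (yours will look different).
role_name = 'cleanup-efs-role-abc123'

# AWSLambdaVPCAccessExecutionRole is an AWS managed policy under the service-role path.
iam.attach_role_policy(
    RoleName=role_name,
    PolicyArn='arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole'
)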

Allow our Lambda read/write access to our EFS access point.

1. From the Role page, click Add permissions, and in the dropdown select Create inline policy.

The JSON tab from the Create policy page.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "elasticfilesystem:ClientRootAccess",
                "elasticfilesystem:ClientWrite",
                "elasticfilesystem:ClientMount"
            ],
            "Resource": "*"
        }
    ]
}
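The equivalent boto3 call, if you'd rather add the inline policy from a script (the role and policy names here are assumptions):

import json
import boto3

iam = boto3.client('iam')

# Assumed role and policy names; the policy document matches the JSON above.
role_name = 'cleanup-efs-role-abc123'
efs_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "elasticfilesystem:ClientRootAccess",
            "elasticfilesystem:ClientWrite",
            "elasticfilesystem:ClientMount"
        ],
        "Resource": "*"
    }]
}

iam.put_role_policy(
    RoleName=role_name,
    PolicyName='cleanup-efs-access',
    PolicyDocument=json.dumps(efs_policy)
)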

Configure the Lambda’s VPC

From your Lambda function's overview page, click the VPC section under the Configuration tab, then click Edit and choose the VPC, subnets, and security groups you gathered earlier.

The VPC configuration for our Lambda function.
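If you want the scripted equivalent, here is a minimal boto3 sketch that sets the VPC configuration using the subnet and security group IDs gathered earlier (the IDs below are placeholders):

import boto3

lambda_client = boto3.client('lambda')

# Subnet and security group IDs are placeholders; use the ones gathered earlier.
lambda_client.update_function_configuration(
    FunctionName='cleanup-efs',
    VpcConfig={
        'SubnetIds': ['subnet-0123456789abcdef0', 'subnet-0fedcba9876543210'],
        'SecurityGroupIds': ['sg-0123456789abcdef0']
    }
)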

Configure the Lambda’s File system

Now that our Lambda function has a VPC configured, we can provide it with the EFS access point. From your Lambda function's overview page, click the File system section under the Configuration tab.

  1. Click Add file system to set up the EFS access point.
  2. Select the EFS file system. Then, select the Access point containing the files to be removed.
  3. Finally, choose a local path to mount your EFS access point at; Lambda requires this path to start with /mnt/. For this example, I made my mount point /mnt/test-data, which we will use inside our code.
  4. Click Save.
Telling Lambda the path to mount the specified EFS file system access point to.
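The same file system attachment can be done with boto3; a minimal sketch, with a placeholder access point ARN (copy yours from the EFS console):

import boto3

lambda_client = boto3.client('lambda')

# Placeholder access point ARN; copy yours from the EFS console.
lambda_client.update_function_configuration(
    FunctionName='cleanup-efs',
    FileSystemConfigs=[{
        'Arn': 'arn:aws:elasticfilesystem:us-east-1:123456789012:access-point/fsap-0123456789abcdef0',
        'LocalMountPath': '/mnt/test-data'
    }]
)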

Write the Lambda function code

From your Lambda function’s overview page, click the Code tab.

The Code tab for your Lambda function.
import json
import logging
import os

logger = logging.getLogger()
logger.setLevel(logging.INFO)

print('Loading function')


def lambda_handler(event, context):
    # The EFS access point is mounted here (see the File system configuration above).
    basepath = '/mnt/test-data'
    removed_files = []
    directories = []

    for root, dirs, filenames in os.walk(basepath):
        for dirname in dirs:
            dirpath = os.path.join(root, dirname)
            logger.info("Found directory: " + dirpath)
            directories.append(dirpath)

        for name in filenames:
            filepath = os.path.join(root, name)
            os.remove(filepath)
            removed_files.append(filepath)
            logger.info("Removed file: " + filepath)

    # Delete the deepest directories first so each one is empty when rmdir runs.
    directories.reverse()
    for directory in directories:
        os.rmdir(directory)
        logger.info("Removed directory: " + directory)

    return {
        'statusCode': 200,
        'body': json.dumps(removed_files)
    }
Click Deploy.
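To confirm everything is wired up, you can run the function once from the Test tab, or invoke it with boto3; a minimal sketch that prints the list of files the handler removed:

import json
import boto3

lambda_client = boto3.client('lambda')

# Invoke synchronously and print the list of files the handler removed.
response = lambda_client.invoke(
    FunctionName='cleanup-efs',
    InvocationType='RequestResponse'
)
print(json.loads(response['Payload'].read()))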
