A common problem with Rails apps that process images is excessive memory consumption in the worker processes. When images are large enough, a worker consumes a big portion of the available memory and takes a long time to finish. If you are deploying on Heroku, you are probably paying for the cheapest worker, and R14 (memory quota exceeded) notification emails will be flooding your inbox.
A few weeks ago, we discovered another amazing AWS product: Lambda. From their documentation, “AWS Lambda is a compute service that runs your code in response to events and automatically manages the compute resources for you, making it easy to build applications that respond quickly to new information”. In addition, Amazon has a very nice and easy-to-follow post on how to use AWS Lambda for image processing.
In this post, we will use that post as our starting point to build a solution that works with our Rails app.
Processing images using S3 + Lambda (AWS post summary)
The basic flow for image processing goes as follows:
- A user uploads an image to an S3 source bucket.
- The bucket emits an event, which our AWS Lambda function listens for.
- The function will then process the image and store it in a target bucket.
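As a rough illustration of the second step, the sketch below pulls the bucket name and object key out of an S3 event notification. The field names (Records, s3.bucket.name, s3.object.key) follow the S3 event message structure. The sample payload, the bucket name and the extract_uploads helper are hypothetical, and the function in the AWS post is written in Node.js rather than Ruby.

```ruby
require 'json'
require 'uri'

# Hypothetical helper: returns [bucket, key] pairs for every record in an
# S3 event notification. Object keys arrive URL-encoded, so decode them.
def extract_uploads(event_json)
  event = JSON.parse(event_json)
  event.fetch('Records', []).map do |record|
    s3 = record.fetch('s3')
    bucket = s3.fetch('bucket').fetch('name')
    key = URI.decode_www_form_component(s3.fetch('object').fetch('key'))
    [bucket, key]
  end
end

# Made-up payload containing only the fields the helper reads.
SAMPLE_EVENT = {
  'Records' => [
    {
      'eventName' => 'ObjectCreated:Put',
      's3' => {
        'bucket' => { 'name' => 'my-app-images' },
        'object' => { 'key' => 'uploads/originals/summer+trip.jpg' }
      }
    }
  ]
}.to_json

extract_uploads(SAMPLE_EVENT).each { |bucket, key| puts "#{bucket}: #{key}" }
```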
Why is it using two different buckets?
AWS S3 buckets can be configured to trigger events when a file is created. Since the function creates new images from the original one, writing them to the same bucket would trigger the event again for each new image, and we do not want an event loop!
If you need further details on how to create an AWS Lambda function or any other part of the flow, go ahead and read the post. It is amazingly easy to follow!
What are we missing to make this work in a Rails app?
Wouldn’t it be perfect if instead of using two buckets we could use just one? Consider it done!
Additionally, if we want this process to be automated, the Rails app should be in charge of uploading the original image to the S3 bucket. This is an easy task if you are using CarrierWave!
Configure your S3 bucket
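Open your S3 console and click on the bucket you want to use. Now open the properties tab in the top right corner and click on events. Configure an event that fires when an object is created, sends the notification to your Lambda function, and uses uploads/originals/ as its prefix.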
The key part of this configuration is the prefix. This will allow us to trigger events only when an object is created in the uploads/originals/ folder. This is where we are going to upload our original images.
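To make the effect of the prefix concrete, here is a small sketch that describes the event as a Ruby hash and checks which object keys would match it. The hash is only shaped like an S3 notification configuration, the ARN is a placeholder, and triggers_event? is a hypothetical helper that imitates the prefix and suffix matching. It does not call AWS.

```ruby
# Hypothetical description of the bucket event, shaped like S3's
# notification configuration. The ARN is a placeholder.
NOTIFICATION = {
  lambda_function_arn: 'arn:aws:lambda:REGION:ACCOUNT_ID:function:FUNCTION_NAME',
  events: ['s3:ObjectCreated:*'],
  filter: { key: { filter_rules: [{ name: 'prefix', value: 'uploads/originals/' }] } }
}.freeze

# Returns true when an object key satisfies every prefix/suffix rule.
def triggers_event?(key, notification = NOTIFICATION)
  notification.dig(:filter, :key, :filter_rules).all? do |rule|
    case rule[:name]
    when 'prefix' then key.start_with?(rule[:value])
    when 'suffix' then key.end_with?(rule[:value])
    else false
    end
  end
end

puts triggers_event?('uploads/originals/cat.jpg') # an original upload
puts triggers_event?('uploads/resized/cat.jpg')   # a file written by the function
```

Only the first key matches, so files written under uploads/resized/ do not start another round of processing.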
Note that you will also need to change the AWS Lambda function proposed in the post. As written, it saves the processed files in a separate target bucket. We changed it so that it writes the resized image back to the same bucket, under uploads/resized/.
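The modified function itself is not reproduced here. The part that changes is how the destination key is derived, and a minimal sketch of that mapping could look like the following. It is written in Ruby for illustration only (the function in the AWS post is Node.js), and resized_key_for and both constants are hypothetical names.

```ruby
SOURCE_PREFIX = 'uploads/originals/'
TARGET_PREFIX = 'uploads/resized/'

# Returns the key for the resized copy in the same bucket, or nil when the
# key is outside the originals folder (a guard against reprocessing output).
def resized_key_for(source_key)
  return nil unless source_key.start_with?(SOURCE_PREFIX)

  TARGET_PREFIX + source_key.delete_prefix(SOURCE_PREFIX)
end

puts resized_key_for('uploads/originals/cat.jpg')
p resized_key_for('uploads/resized/cat.jpg')
```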
Rails app configuration
Add these two gems to your Gemfile:
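The gem list did not survive in this copy of the post. A common pairing for CarrierWave with S3 storage is carrierwave plus fog-aws (older setups used the full fog gem), so treat the names below as an assumption and pin versions as appropriate for your app.

```ruby
# Gemfile (assumed gems, not confirmed by the original post)
gem 'carrierwave'
gem 'fog-aws'
```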
Then, configure CarrierWave in the following way (place this file in config/initializers/carrierwave.rb):
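A minimal sketch of such an initializer, assuming fog-aws as the storage backend, might look like this. The environment variable names are hypothetical; use whatever your app already defines.

```ruby
# config/initializers/carrierwave.rb
# Sketch only: assumes the fog-aws gem and hypothetical ENV variable names.
CarrierWave.configure do |config|
  # CarrierWave 1.x also expects: config.fog_provider = 'fog/aws'
  config.fog_credentials = {
    provider: 'AWS',
    aws_access_key_id: ENV['AWS_ACCESS_KEY_ID'],
    aws_secret_access_key: ENV['AWS_SECRET_ACCESS_KEY'],
    region: ENV['AWS_REGION']
  }
  # The same bucket that has the Lambda event attached to it.
  config.fog_directory = ENV['S3_BUCKET']
end
```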
Now generate an uploader by running CarrierWave's uploader generator (rails generate uploader, followed by the uploader name) in the root of your project.
Change your uploader configuration so it looks like this one:
What are we doing here? Mainly two things:
- Telling CarrierWave to store original files in uploads/originals/ and the different versions in uploads/resized/. This allows us to avoid the event loop, as events are only triggered when an object is created under uploads/originals/.
- Telling CarrierWave to never generate the resized version. If you don’t do so, CarrierWave will automatically upload a new file to uploads/resized/… for each version you have declared. We don’t want that since we will be creating the versions with AWS Lambda.
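The uploader from the post is not reproduced here, so the following is only a sketch of one way to get both behaviours with CarrierWave's documented features: a per-version store_dir and a conditional version whose condition is always false. The class name ImageUploader (as generated by rails generate uploader Image), the process_locally? method and the flat directory layout are assumptions. A real app would likely add a per-record subfolder so file names do not collide.

```ruby
# app/uploaders/image_uploader.rb
# Sketch under the assumptions stated above, not the author's exact uploader.
class ImageUploader < CarrierWave::Uploader::Base
  storage :fog

  # Originals land in the folder that triggers the Lambda event.
  def store_dir
    'uploads/originals'
  end

  # Declared so that uploader.resized exists, but never processed locally.
  version :resized, if: :process_locally? do
    # Versions live in the folder that the Lambda function writes to.
    def store_dir
      'uploads/resized'
    end

    # Use the original file name instead of the default "resized_" prefix,
    # so the path matches a function that keeps the name unchanged.
    def full_filename(for_file)
      for_file
    end
  end

  # Always false: AWS Lambda creates the version, CarrierWave does not.
  def process_locally?(_file)
    false
  end
end
```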
That’s all there is to it!
When you upload a photo, it is automatically sent to the S3 bucket under uploads/originals/. Our Lambda function is notified and put to work. When the process finishes, the resulting file is placed in uploads/resized/.
To check whether processing has finished, check whether the version file exists by calling resized.file.exists? on the uploader.
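Because the Lambda function runs asynchronously, the version may not exist right after the upload. One option is to poll for it with a timeout, as in this sketch. wait_until is a hypothetical helper, and the block at the end stands in for a real check such as photo.image.resized.file.exists?, where photo and image are assumed names for your model and mounted uploader.

```ruby
# Hypothetical helper: calls the block every `interval` seconds until it
# returns a truthy value or `timeout` seconds have passed.
# Returns true on success and false on timeout.
def wait_until(timeout: 30, interval: 1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline

    sleep interval
  end
end

# Stand-in for the real check: pretends the file shows up on the third try.
attempts = 0
ready = wait_until(timeout: 2, interval: 0.1) { (attempts += 1) >= 3 }
puts "ready=#{ready} after #{attempts} checks"
```

In a Rails app this kind of check fits better in a background job or a client-side retry than in the request that performed the upload.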
Posted by Matías De Santi (firstname.lastname@example.org)