Uploading a CSV to AWS S3 in Rails and using the file

Recently we were dealing with an issue for a customer (we’ll call them customer X) that has been with us for several years. They have hundreds of thousands of records in our system and we are starting to deal with issues of scale.

A typical customer could simply go to a dashboard we had built, click a button, and a CSV report with a few hundred records would download straight from the browser. For customer X this simply wouldn't work: the backend calculations took too long and the request often timed out. It was a simple problem to solve with a background job, which is exactly what we did a few years ago.

We implemented a background job that ran the query and, when the report was ready, emailed it to the user as an attachment. It was a perfect solution for customer X until another problem arose. Here is the mailer method the background job ran:

def email_sales_report(email)
  @report = Reports::Sales.new(report_params)
  attachments['sales_report.csv'] = @report.csv
  mail to: email, subject: "Email Sales Report #{Time.now}"
end

This looks fine and dandy, right?


We thought it did too, until one day these report emails started failing to send. After a few hours of digging through irrelevant error messages, we found that the email server could not handle attachments as large as the ones we were sending.

In order to make this report work again, we needed a way to host the CSV somewhere and link to it from the email. AWS S3 (Amazon's "Simple Storage Service," which lets you store documents, files and more) was the perfect solution.

Here are the basic steps we took to set this up in our Rails application:

1. Adding the AWS SDK to the Gemfile and running bundle install

If you don't already have the aws-sdk gem in your Rails app, go ahead and add it to your Gemfile, then run bundle install.

gem 'aws-sdk'
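A side note, in case it helps: version 3 of the AWS SDK for Ruby is modular, so if S3 is all you need, you can depend on just the S3 service gem instead of the full SDK. The Aws::S3 classes used below work the same either way:

```ruby
# Alternative to the full aws-sdk metagem: with SDK v3 you can
# depend on only the S3 service gem, which installs much faster.
gem 'aws-sdk-s3'
```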

2. Setting up the initializer

First, we set up an initializer to get S3 working, in a new file at config/initializers/amazon_s3.rb.

We assign the client to a constant, AMAZON_S3_CLIENT, that can be accessed at any point in our Rails application:

AMAZON_S3_CLIENT = Aws::S3::Resource.new(
  region: 'us-west-1',
  # Don't commit real credentials; read them from the environment instead
  access_key_id: ENV['AWS_ACCESS_KEY_ID'],
  secret_access_key: ENV['AWS_SECRET_ACCESS_KEY']
)

3. Creating a command class to perform the S3 upload

We utilize commands throughout our app that are run from background jobs, like the report job we have been discussing. We created another command for uploading our CSVs to S3. Here is how we set it up, in a file named commands/report/upload_to_s3.rb (the path matches the Report::UploadToS3 class name so Rails autoloading can find it):

class Report::UploadToS3 < Command::Base
  attribute :csv

  action do
    set_aws_bucket
    obj = AMAZON_S3_CLIENT.bucket(@bucket_path).object("#{Time.now}/report.csv")
    obj.put(body: csv, acl: 'public-read', content_disposition: 'attachment')
    obj.public_url
  end

  def set_aws_bucket
    @bucket_path = case Rails.env
                   when 'production' then 'csv'
                   when 'staging'    then 'staging.csv'
                   else                   'dev.csv'
                   end
  end
end

There are a few things I'd like to point out. First, notice that we use a different bucket path depending on the environment we are running in; we don't want to mix and match CSV uploads across environments.

Next, you'll see that Time.now is interpolated into the object key when saving the file to S3. This creates a folder named after the timestamp (something like "2019-01-01 18:00:00 +0000") with the actual file inside it, since S3 treats each / in a key as a folder separator. This was just to show that you can create subfolders.
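Because the default string form of Time.now contains spaces and colons, which are awkward in URLs, you may prefer to build the key with strftime. A small sketch of that variation:

```ruby
# Time.now.to_s produces something like "2019-01-01 18:00:00 +0000";
# spaces and colons are legal in S3 keys but awkward in URLs.
# A strftime-based key keeps the folder structure explicit and URL-friendly:
key = Time.now.strftime('%Y-%m-%d/%H-%M-%S') + '/report.csv'
# e.g. "2019-01-01/18-00-00/report.csv" -- each slash shows up as a
# nested folder in the S3 console.
```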

In our actual app we do something more involved: each user gets their own folder, so we can keep every report they have downloaded. You should be able to work out a scheme like this on your own, and it is highly recommended that you do so securely, without building folder names from common identifiers like user_id and things of that nature.
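One simple way to avoid guessable folder names is an opaque random token per user. This is a hypothetical sketch, not our actual scheme; report_folder_token is an assumed column you would generate once and store on the user record:

```ruby
require 'securerandom'

# Give each user an opaque folder token instead of exposing a guessable
# identifier like user_id in the S3 key. In a real app you would persist
# this token on the user the first time it is generated.
report_folder_token = SecureRandom.uuid
key = "#{report_folder_token}/report.csv"
```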

And lastly, content_disposition: 'attachment' is vitally important: it tells the browser to download the CSV rather than render it. acl: 'public-read' matters too, because it makes your CSVs readable by your customers (and by anyone else who has the URL, so don't use it for sensitive data). The URL for the file is returned by the public_url instance method, documented here: https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/S3/Object.html#public_url-instance_method

4. Invoking the command and retrieving the URL

Now this is what our email_sales_report method looks like :)

def email_sales_report(email)
  @report = Reports::Sales.new(report_params)
  @s3_public_url = Report::UploadToS3.new(csv: @report.csv).execute
  mail to: email, subject: "Email Sales Report #{Time.now}"
end

@s3_public_url can be used in the email template for this method like so:

<p>Your attendee ticket list is ready. Please click on the link below:</p>
<a href="<%= @s3_public_url %>" download>Download Report</a>
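Since Rails email views are ERB templates, the URL is inserted with <%= %> tags at render time rather than Ruby string interpolation. For a stand-alone look at how that substitution works outside of Rails, here is a minimal sketch with a placeholder URL:

```ruby
require 'erb'

# Placeholder standing in for the value returned by public_url.
s3_public_url = 'https://csv.s3.us-west-1.amazonaws.com/2019/report.csv'

# ERB substitutes the <%= %> expression when the template is rendered.
template = ERB.new('<a href="<%= s3_public_url %>" download>Download Report</a>')
html = template.result(binding)
```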

Now you can deliver a file of any size through any email server. This has helped us immensely: not only can we send large reports to our customers, we can also see which customers are downloading reports and how frequently.


Hopefully this has helped you out and was easy to follow along. If you have any questions or need any help uploading files to S3 and using those files throughout your Rails app, shoot me an email at luke.will.duncan@gmail.com