How to Stream Files From AWS S3 to IPFS

Jarrod Eyler · Pinata · Nov 13, 2020

IPFS is the Swiss army knife of distributed storage technologies. Its uses and benefits are numerous and have been covered by other blog posts. So, I won’t rehash any of that here (pun intended).

A common use case for IPFS is for a client application to upload and “pin” data in the form of files (images, documents, video, JSON, etc.) to the network and get back a hash of the data that can be referenced elsewhere on the IPFS network. Pinning is a way of telling an IPFS node that the pinned data is important and shouldn’t be removed when the node performs garbage collection. You can only pin data to nodes you control, so if you have content you want to stick around permanently, you either have to spin up and maintain your own IPFS nodes or use a pinning service.

We often hear from users who want to stream and pin large files from their various storage systems (S3, in this case) without having to first download the entire file to their own file system.

The following example shows how to use Node.js to stream a file from S3 to multiple endpoints without ever writing the file to disk.

Assumptions

  • You are familiar with Node.js and NPM
  • You have an AWS account with programmatic access to S3

Step 1 — Upload a Test File to S3

Upload a test file to your S3 bucket and keep track of the key.
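
Uploading through the AWS console works fine; if you’d rather script it, a minimal sketch using the aws-sdk package for Node.js might look like this (the bucket name, key, and file path are placeholders, and it assumes your AWS credentials are configured locally):

```javascript
// npm install aws-sdk
const AWS = require('aws-sdk');
const fs = require('fs');

// Assumes credentials are configured via environment or ~/.aws/credentials.
const s3 = new AWS.S3({ region: 'us-east-1' });

s3.upload(
  {
    Bucket: 'my-test-bucket',      // placeholder bucket name
    Key: 'videos/test-file.mp4',   // keep track of this key for later steps
    Body: fs.createReadStream('./test-file.mp4'),
  },
  (err, data) => {
    if (err) throw err;
    console.log('Uploaded to:', data.Location);
  }
);
```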

Step 2 — Set Up an IPFS Node

In order to pin data, you’ll need to set up and manage your own IPFS node(s). There are existing tutorials for this step, so I won’t cover that process here.

Step 3 — Set Up a Simple Node.js Project

On your local machine, set up a Node.js project with a simple index.js file and be prepared to run it from the command line.

Step 4 — Get a Readable Stream From S3

Here we use the AWS SDK to get an object from S3 as a readable stream. You’ll need to fill in the S3 variables with your own AWS account info, including s3AccessKeyId, s3AccessSecret, s3Region, and s3Bucket.
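
Here’s a minimal sketch of that step, assuming the aws-sdk (v2) package and the bucket and key from Step 1 (all of the credential, bucket, and key values are placeholders):

```javascript
// npm install aws-sdk
const AWS = require('aws-sdk');

// Fill these in with your own AWS account info.
const s3AccessKeyId = 'YOUR_ACCESS_KEY_ID';
const s3AccessSecret = 'YOUR_SECRET_ACCESS_KEY';
const s3Region = 'us-east-1';
const s3Bucket = 'my-test-bucket';
const s3Key = 'videos/test-file.mp4'; // the key from Step 1

const s3 = new AWS.S3({
  accessKeyId: s3AccessKeyId,
  secretAccessKey: s3AccessSecret,
  region: s3Region,
});

// getObject() returns a request object; createReadStream() turns it into a
// readable stream of the object's bytes without buffering the file on disk.
const readStream = s3
  .getObject({ Bucket: s3Bucket, Key: s3Key })
  .createReadStream();
```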

Step 5a — Add File to Your Own IPFS Nodes

Use the stream from S3 to add the file to IPFS and get back a hash. The code for this example is below in full.
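
Here’s a minimal sketch of that full example. It assumes a 2020-era ipfs-http-client package (newer releases expose a create() factory instead) and an IPFS daemon whose HTTP API is listening on localhost:5001; adjust the address to point at your own node.

```javascript
// npm install aws-sdk ipfs-http-client
const AWS = require('aws-sdk');
const ipfsClient = require('ipfs-http-client');

// Placeholder values; fill in your own.
const s3AccessKeyId = 'YOUR_ACCESS_KEY_ID';
const s3AccessSecret = 'YOUR_SECRET_ACCESS_KEY';
const s3Region = 'us-east-1';
const s3Bucket = 'my-test-bucket';
const s3Key = 'videos/test-file.mp4';

const main = async () => {
  const s3 = new AWS.S3({
    accessKeyId: s3AccessKeyId,
    secretAccessKey: s3AccessSecret,
    region: s3Region,
  });

  // Stream the object out of S3 without writing it to disk.
  const readStream = s3
    .getObject({ Bucket: s3Bucket, Key: s3Key })
    .createReadStream();

  // Connect to your own IPFS node's HTTP API.
  const ipfs = ipfsClient('http://localhost:5001');

  // add() consumes the stream and pins the content (pin: true is the default).
  const { cid } = await ipfs.add(readStream, { pin: true });
  console.log('IPFS hash:', cid.toString());
};

main().catch(console.error);
```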

Step 5b — Using Pinata Rather Than Your Own Nodes

You can definitely host, maintain, secure, patch, upgrade, and network your own IPFS nodes. You can also do your own dental surgery. Both of these DIY approaches can be quite painful. Instead, you can use Pinata for an easy way to stream files to IPFS. The following example shows how that would work.

Assumptions

  • You like making your life easy
  • You have a Pinata account
  • You don’t need your own IPFS nodes, so skip Step 2 above

In this example we use the form-data library to build a multipart form around the readable stream we got from S3 and upload it to the Pinata API. We use Axios to handle the HTTP POST, so you’ll also create a config object that specifies the details of the file upload.
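
Here’s a sketch of that upload, reusing the readable stream from Step 4 and posting it to Pinata’s pinFileToIPFS endpoint (the credential, bucket, and key values are placeholders):

```javascript
// npm install aws-sdk form-data axios
const AWS = require('aws-sdk');
const FormData = require('form-data');
const axios = require('axios');

// Placeholder values; fill in your own.
const s3AccessKeyId = 'YOUR_ACCESS_KEY_ID';
const s3AccessSecret = 'YOUR_SECRET_ACCESS_KEY';
const s3Region = 'us-east-1';
const s3Bucket = 'my-test-bucket';
const s3Key = 'videos/test-file.mp4';

const pinataApiKey = 'YOUR_PINATA_API_KEY';
const pinataSecretApiKey = 'YOUR_PINATA_SECRET_API_KEY';

const main = async () => {
  const s3 = new AWS.S3({
    accessKeyId: s3AccessKeyId,
    secretAccessKey: s3AccessSecret,
    region: s3Region,
  });
  const readStream = s3
    .getObject({ Bucket: s3Bucket, Key: s3Key })
    .createReadStream();

  const form = new FormData();
  // S3 streams don't carry a filename, so we have to provide one explicitly.
  form.append('file', readStream, { filename: s3Key });

  const config = {
    method: 'post',
    url: 'https://api.pinata.cloud/pinning/pinFileToIPFS',
    // Without this, Axios errors out on bodies larger than its default limit.
    maxBodyLength: Infinity,
    headers: {
      ...form.getHeaders(), // sets the multipart Content-Type and boundary
      pinata_api_key: pinataApiKey,
      pinata_secret_api_key: pinataSecretApiKey,
    },
    data: form,
  };

  const res = await axios(config);
  console.log(res.data); // { IpfsHash, PinSize, Timestamp }
};

main().catch(console.error);
```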

Take note of a couple of important details.

First, the filename option on the form.append call. Unlike when you get a stream directly from a file on disk, the filename isn’t automatically provided when streaming a file from S3, so you have to supply one yourself. This is a tricky detail that can easily get in the way.

Also notice that the maxBodyLength value on the Axios request is set to Infinity; otherwise, Axios will error out on larger files.

You’ll need to provide your pinata_api_key and pinata_secret_api_key in the header of the request. Don’t have these? Sign up for a Pinata account right here.

And last, take a look at the JSON response data from the Pinata API, which gives you the file hash (partially redacted), pin size, and timestamp.
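
It looks something like this (the hash is partially redacted and the values shown here are illustrative placeholders, not real output):

```json
{
  "IpfsHash": "QmWp…redacted…tR4x",
  "PinSize": 4651636,
  "Timestamp": "2020-11-13T18:02:05.765Z"
}
```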

Summary

With this example you should be able to stream a file from S3 to either your own IPFS node(s) or the Pinata API without having to persist all of that file data on your server, which improves the scalability of your application.

Happy Pinning!
