Remote Sync (rsync) Files to AWS EC2

Cameron Bronstein
Feb 19, 2019 · 5 min read
Me, only weeks ago, looking into the great unknown: the cloud.

Without proper introduction, it can be surprisingly tricky to send large amounts of data from your local machine to a cloud server. Thankfully, open source tools like rsync exist to make this process easier. My goal is to introduce these tools to fellow developers and help others (and remind my future self) reduce the set-up time for remote, cloud-based projects.

I found rsync most useful when transferring nearly 60 GB of image data for an image analysis project I recently completed. With the optional verbosity flags, you get a running printout that confirms everything is happening as you expect. My favorite feature of rsync is that it picks up wherever a previous sync stopped. This is especially helpful when syncing large directories with many files. If your internet connection drops, you can execute the same command again without needing to keep track of the last synced file.
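To make the "picks up where it stopped" behaviour concrete, here is a minimal local sketch. It assumes rsync is installed on the machine running it, and it fakes the dropped connection by copying only one of two files first. The function name demo_resume and the file names are made up for illustration.

```shell
# Simulate an interrupted sync, then re-run it and list what still gets sent.
demo_resume() {
  work="$1"
  mkdir -p "$work/src" "$work/dst"
  echo one > "$work/src/a.txt"
  echo two > "$work/src/b.txt"
  # Pretend the connection dropped after only a.txt had been copied.
  rsync -a "$work/src/a.txt" "$work/dst/"
  # Re-run the full sync. -i (--itemize-changes) prints one line per change;
  # lines starting with ">f" are files being transferred.
  rsync -ai "$work/src/" "$work/dst/" | grep '^>f'
}

if command -v rsync >/dev/null 2>&1; then
  work=$(mktemp -d)
  demo_resume "$work"    # lists b.txt only; a.txt is already up to date
  rm -r "$work"
else
  echo "rsync not installed; skipping the demo"
fi
```

By default rsync decides what to skip by comparing file size and modification time, so files that already arrived intact are not sent again.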

This post assumes you have some familiarity with AWS EC2 and know how to launch a new instance. If you are unfamiliar with AWS EC2, you can read more information in the Amazon docs.

Step 1: Visit AWS and spin up your EC2 Instance

Once your instance is running, copy its public DNS name, and go to the command line on your local machine.

You can connect to the instance over SSH with:

ssh -i /path_to/your_private_key.pem server_root@public-DNS

Here server_root is the login user for your instance (ubuntu in the examples below). If prompted, type yes, and you’ll be on the remote server.
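If you connect often, one option is to store the key and user in your SSH client configuration so the -i flag is no longer needed. The sketch below only prints a host entry; the helper name ssh_host_entry and the alias my-ec2 are made up for illustration, and the other values are the placeholders used above.

```shell
# Print an ssh_config host entry for the given alias, host, user and key.
ssh_host_entry() {
  alias_name="$1"; host="$2"; user="$3"; key="$4"
  cat <<EOF
Host $alias_name
    HostName $host
    User $user
    IdentityFile $key
EOF
}

# Placeholder values from this post; substitute your own.
ssh_host_entry my-ec2 public-DNS server_root /path_to/your_private_key.pem

# To use it (not run here):
#   chmod 400 /path_to/your_private_key.pem   # ssh rejects keys readable by other users
#   ssh_host_entry my-ec2 public-DNS server_root /path_to/your_private_key.pem >> ~/.ssh/config
#   ssh my-ec2
```

With an entry like this in place, tools that run over ssh can refer to the server by its alias.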

Step 2: Check available filesystem volume.

You can check for available space on the server with the df -h command. This will display an output like this:

ubuntu@ip-xxxxx:~/anaconda3$ df -h 
Filesystem      Size  Used Avail Use% Mounted on
udev             15G     0   15G   0% /dev
tmpfs           3.0G  8.9M  3.0G   1% /run
/dev/xvda1       73G   66G  7.6G  90% /
tmpfs            15G     0   15G   0% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            15G     0   15G   0% /sys/fs/cgroup
/dev/loop1       17M   17M     0 100% /snap/amazon-ssm-agent/784
/dev/loop0       18M   18M     0 100% /snap/amazon-ssm-agent/930
/dev/loop2       91M   91M     0 100% /snap/core/6350
/dev/loop3       90M   90M     0 100% /snap/core/6130
/dev/loop4       88M   88M     0 100% /snap/core/5742
tmpfs           3.0G     0  3.0G   0% /run/user/1000
/dev/xvdf1       74G   15G   13G  84% /data

Here I can see that the bulk of my instance's storage, /dev/xvda1, is at 90% capacity.

However, I have another Elastic Block Store (EBS) volume attached to my instance, /dev/xvdf1, as we can see at the bottom of the output.

Regardless of how your instance is set up, find the main block storage device and make sure it has space available. The tmpfs entries are temporary, memory-backed filesystems whose contents are erased every time the instance is stopped.
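If you would rather not read the table by eye, the check can be scripted. This is a rough sketch, not part of my original workflow: the function names and the 90% threshold are arbitrary choices for illustration, and it relies on the POSIX output format that df -P prints.

```shell
# Print the percentage of space in use on the filesystem that holds a path.
# -P asks df for the portable POSIX format: one line per filesystem,
# with the usage percentage in the fifth column.
used_percent() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Warn when the filesystem holding a path is at or above a usage limit.
check_space() {
  path="$1"
  limit="${2:-90}"    # 90 is an arbitrary example threshold
  used=$(used_percent "$path")
  if [ "$used" -ge "$limit" ]; then
    echo "WARNING: $path is ${used}% full"
    return 1
  fi
  echo "OK: $path is ${used}% full"
}

check_space / 90 || true
```

On the server, you would point check_space at the directory you plan to sync into, for example /data in the output above.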

Step 3: Check volume of specific files and directories, and remove unnecessary data.

For example, a significant share of my server's /dev/xvda1 filesystem was taken up by the Anaconda environments that came pre-installed on the server. The du -hs * command, which prints the total size of each file and directory in the current directory, was a simple, handy way for me to locate unneeded files and free up space for more important data. I now incorporate this into my cloud-based workflow.
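When a directory has many entries, sorting the du output makes the biggest offenders easier to spot. Here is a small sketch of that idea; the function name largest_entries is made up, and it uses kilobyte sizes (du -sk) rather than -h so that a plain numeric sort works everywhere.

```shell
# List the entries of a directory by disk usage, largest first.
# du -sk prints each entry's total size in kilobytes;
# sort -rn orders the lines numerically, biggest first.
largest_entries() {
  dir="${1:-.}"
  count="${2:-10}"
  ( cd "$dir" && du -sk ./* | sort -rn | head -n "$count" )
}

# Show the 10 largest entries in the current directory.
largest_entries . 10
```

Note that the * glob skips hidden files and directories (names starting with a dot), so those need to be checked separately.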

To remove files:

rm /path_to_file/filename

To remove directories:

rm -r /path_to_directory/directory_name
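Since rm -r cannot be undone, it can help to look at what you are about to delete first. The following is only a sketch of one cautious habit; preview_removal is a hypothetical helper, not a standard command.

```shell
# Show how much space a path uses before deleting it, and refuse obviously
# dangerous targets such as an empty path or the filesystem root.
preview_removal() {
  target="$1"
  if [ -z "$target" ] || [ "$target" = "/" ]; then
    echo "refusing to operate on an empty path or /" >&2
    return 1
  fi
  if [ ! -e "$target" ]; then
    echo "no such file or directory: $target" >&2
    return 1
  fi
  du -sh "$target"
}

# Demo on a throwaway directory: preview it, then delete it only if the preview succeeded.
demo=$(mktemp -d)
echo data > "$demo/file.txt"
preview_removal "$demo" && rm -r "$demo"
```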

Step 4: Remote Sync!

If rsync is not already installed on your remote server, a simple `sudo apt install rsync` will install it (on Ubuntu and other Debian-based systems).
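To find out whether rsync is already there before installing anything, you can ask the shell. A minimal sketch, where have_cmd is a made-up helper name:

```shell
# Return success if a command with the given name is available on the PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd rsync; then
  rsync --version | head -n 1    # print the installed version
else
  echo "rsync not found; on Ubuntu or Debian: sudo apt install rsync"
fi
```

The same check is worth running on your local machine, since rsync has to be available on both ends of the transfer.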

With rsync installed, we are ready to go! From your local machine, execute the following command.

rsync -av --progress -e "ssh -i /path_to/your_private_key.pem" /absolute_path_to/local_files/ remote_server_root@public_DNS:/absolute_path/remote_directory_destination

Breakdown of the command:

  • rsync: Hello, I want to use rsync!
  • -av: -a is archive mode, which syncs recursively and preserves permissions, timestamps, and symbolic links; -v gives you a verbose print-out.
  • --progress: This shows the transfer progress of each file.
  • -e "ssh -i /path_to/your_private_key.pem": This tells rsync to connect over ssh with your private key, which is what gives you permission to write to your remote instance. Depending on how you have configured your private key, you may not need this. Make sure the quotes are true ASCII characters and that --progress starts with two plain hyphens. I ran into trouble when copy-pasting the command from a notes app that converted the quotes to different characters.
  • /absolute_path_to/local_files/: Including the trailing “/” makes a difference. Without the trailing slash, the directory itself is copied, so you end up with a local_files directory inside the destination. With the trailing slash, the directory itself is not copied, and all the files inside it are synced directly into the remote directory you specify.
  • remote_server_root@public_DNS: This is your remote server: the login user, then the public DNS name of the instance.
  • :/absolute_path/remote_directory_destination: And finally, the destination for your files on your remote server.
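The trailing-slash rule is easiest to believe once you have seen it. Here is a small local sketch that syncs the same source directory both ways; it assumes rsync is installed locally, and the function and directory names are made up for the demo.

```shell
# Sync the same source twice, with and without a trailing slash, and list the results.
demo_trailing_slash() {
  work="$1"
  mkdir -p "$work/src" "$work/with_slash" "$work/without_slash"
  echo hello > "$work/src/a.txt"
  rsync -a "$work/src/" "$work/with_slash/"     # copies the contents of src
  rsync -a "$work/src"  "$work/without_slash/"  # copies the src directory itself
  ( cd "$work" && find with_slash without_slash -type f )
}

if command -v rsync >/dev/null 2>&1; then
  work=$(mktemp -d)
  demo_trailing_slash "$work"
  # prints:
  #   with_slash/a.txt
  #   without_slash/src/a.txt
  rm -r "$work"
else
  echo "rsync not installed; skipping the demo"
fi
```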

Any small mistake in the command or paths will cause rsync to fail. Please inspect your commands carefully before convincing yourself that rsync won’t work for you! It is easy to make a mistake in such a long, path-y command!
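One way to catch a wrong path before any data moves is rsync's --dry-run flag (short form -n), which lists what would be transferred without copying anything. The wrapper below is a sketch: preview_sync is a made-up name, and the demo runs against local throwaway directories rather than a real server.

```shell
# Run rsync in dry-run mode: it reports what would be sent but copies nothing.
preview_sync() {
  rsync -av --dry-run "$@"
}

if command -v rsync >/dev/null 2>&1; then
  work=$(mktemp -d)
  mkdir -p "$work/src" "$work/dst"
  echo hello > "$work/src/a.txt"
  preview_sync "$work/src/" "$work/dst/"
  ls -A "$work/dst"    # prints nothing: the dry run did not copy a.txt
  rm -r "$work"
else
  echo "rsync not installed; skipping the local demo"
fi

# Against the remote server (not run here), using the placeholders from this post:
#   preview_sync -e "ssh -i /path_to/your_private_key.pem" /absolute_path_to/local_files/ remote_server_root@public_DNS:/absolute_path/remote_directory_destination
```

Once the file list looks right, run the same command without --dry-run.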

Thanks for reading!

Written by Cameron Bronstein: a plant ecologist at heart, using machine learning and remote sensing to better understand the biosphere.


