AWS S3 bucket: bulk copy or rename files from Windows

Guillaume Prévost
4 min read · Sep 7, 2019


Recently at Friend Theory we needed to bulk move and copy multiple files at once on our AWS S3 buckets, renaming them according to a specific pattern. Here’s the approach I used and how I did it.

1. Install and Configure AWS CLI

Install the AWS CLI by following the instructions in the AWS CLI user guide.

You can check your installation succeeded by running:

$ aws --version 

Run the initial configuration to allow the CLI to connect to your AWS account:

$ aws configure
AWS Access Key ID [None]: SGFJHCGHDUHGKJ84EXAMPLE
AWS Secret Access Key [None]: QWERTYUIOPASDFGHJKLEXAMPLE
Default region name [None]: us-west-2
Default output format [None]: text

(The access key ID and secret access key come from an IAM user; these credentials are created in the AWS console. Usually you would already have one or more, or you can create a new one just for this.)

More info here: https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html
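If you prefer not to touch your default credentials, you can also set up a dedicated named profile and pass it to each command (the profile name below is just an example):

$ aws configure --profile s3-renamer
$ aws s3 ls --profile s3-renamer

If you go this route, remember to add --profile s3-renamer to the aws s3api and aws s3 commands used later in this article.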

2. Moving and renaming single files on AWS S3

The aws s3 cp and aws s3 mv commands work more or less like the native UNIX cp and mv commands:

$> aws s3 cp s3://bucket-name/path/to/source-file.ext destination-file.ext
$> aws s3 mv s3://bucket-name/path/to/source-file.ext destination-file.ext
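Note that the destination can also be an S3 URI, which is what the bulk script below relies on. For example, copying an object to a new key inside the same bucket (names here are placeholders):

$> aws s3 cp s3://bucket-name/path/to/source-file.ext s3://bucket-name/path/to/renamed-file.ext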

3. Script to bulk rename files on AWS S3 bucket

Without further ado, here’s the script I ended up using (details and explanations below so you can adapt it to your needs):

$> aws s3api list-objects --bucket friend-theory-dev --prefix "test/profile-pictures/" --delimiter "/" | ForEach-Object { $_.split("`t")[2] } | Select-String -Pattern 100x100.jpg | ForEach-Object -Process {$outputFile = $_ -replace '100x100', 'sm'; $outputFile = $outputFile -replace '/profile-pictures', '/users-pictures'; aws s3 cp s3://friend-theory-dev/$_  s3://friend-theory-dev/$outputFile }
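Here is the exact same pipeline, simply broken over several lines after each pipe (PowerShell keeps reading a statement when a line ends with a pipe), which may be easier to follow:

aws s3api list-objects --bucket friend-theory-dev --prefix "test/profile-pictures/" --delimiter "/" |
    ForEach-Object { $_.split("`t")[2] } |
    Select-String -Pattern 100x100.jpg |
    ForEach-Object -Process {
        # Build the destination key: rename the file, then switch folders
        $outputFile = $_ -replace '100x100', 'sm'
        $outputFile = $outputFile -replace '/profile-pictures', '/users-pictures'
        # Copy the object to its new key
        aws s3 cp s3://friend-theory-dev/$_ s3://friend-theory-dev/$outputFile
    }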

Let’s take this piece by piece:

Listing objects:

$> aws s3api list-objects --bucket sample-bucket --prefix "folderA/" --delimiter "/"
OWNER hello 123456789
CONTENTS "1234567890" folderA/11111-100x100.jpg 2019-06-17T16:29:39.000Z 44193 STANDARD
OWNER hello 123456789
CONTENTS "1234567890" folderA/11111-500x500.jpg 2019-06-17T16:29:39.000Z 44193 STANDARD
OWNER hello 123456789
CONTENTS "1234567890" folderA/22222-100x100.jpg 2019-06-17T16:29:39.000Z 50071 STANDARD
...

This first part uses the lower-level aws s3api list-objects command, which outputs the list of objects in the bucket.

  • The --bucket parameter specifies the name of the bucket.
  • The --prefix parameter specifies the path within the bucket (the folder). The --delimiter "/" prevents the listing from recursing into sub-folders of folderA.
  • Note: if the default output format of your AWS CLI configuration is JSON, you will have to add the extra parameter --output text to ask for text output (see the example right after this list).
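For example, forcing text output regardless of the configured default (same placeholder bucket and folder as above):

$> aws s3api list-objects --bucket sample-bucket --prefix "folderA/" --delimiter "/" --output text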

Splitting the output:

As usual in UNIX or PowerShell, we use the pipe "|" to pass the output of one command as input to the following one:

| ForEach-Object { $_.split("`t")[2] }

For each of the objects returned (i.e. each line of the text output), we call the split method on $_, where the $ sign references a variable and the underscore is the automatic variable holding the current input passed down the pipeline.
So we split each line of the previous output on the TAB character `t (a backtick-escaped "t", PowerShell's equivalent of \t).
Finally, we take index 2 of the resulting array, which is the object's key, i.e. the name of the file.

Output:

folderA/11111-100x100.jpg
folderA/11111-500x500.jpg
folderA/22222-100x100.jpg
...
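As a side note, if you'd rather not depend on the column positions of the text output, an alternative (a sketch of mine, not part of the original command) is to ask the CLI for just the keys with a JMESPath --query and parse the JSON in PowerShell:

aws s3api list-objects --bucket sample-bucket --prefix "folderA/" --delimiter "/" --query "Contents[].Key" --output json | Out-String | ConvertFrom-Json

This returns the keys as a plain array of strings, which can be piped into the same Select-String and ForEach-Object steps below.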

Matching with a Pattern

This part is optional. You don’t need it if you want to move or copy ALL of the files in your folder.

We use the PowerShell Select-String cmdlet (roughly the equivalent of grep in UNIX) to keep only the lines that contain the pattern we need. In this case it's pretty simple: every file name that contains "100x100.jpg" will be matched.

| Select-String -Pattern 100x100.jpg

But Select-String can be quite powerful. See the full documentation.
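One detail to be aware of: -Pattern is interpreted as a regular expression, so the dot in 100x100.jpg actually matches any character. It makes no difference here, but to match the literal string you can escape the dot, or use the -SimpleMatch switch:

| Select-String -Pattern '100x100\.jpg'
| Select-String -SimpleMatch '100x100.jpg'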

Rename the files

The last part of the script again runs ForEach-Object on the previous output, this time with a -Process script block that computes the output path and file name:

| ForEach-Object -Process {$outputFile = $_ -replace '100x100', 'sm'; $outputFile = $outputFile -replace '/folderA', '/folderB'; aws s3 cp s3://sample-bucket/$_ s3://sample-bucket/$outputFile }

The first statement is an assignment: in the initial input $_ we replace the characters "100x100" with "sm" (that was my requirement; this is where you plug in your own renaming rule) and assign the result to the variable $outputFile.
This part can be removed if you don't need the files renamed but only moved or copied to another folder.

$outputFile = $_ -replace '100x100', 'sm';
folderA/11111-100x100.jpg is replaced by folderA/11111-sm.jpg

The second statement is another assignment: in the previous variable we replace "/folderA" with "/folderB". This part can be removed if you don't need to change folders but only rename files.

$outputFile = $outputFile -replace '/folderA', '/folderB';
folderA/11111-sm.jpg becomes folderB/11111-sm.jpg
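If you prefer, the two replacements can also be chained into a single expression (a purely stylistic variant of the same logic):

$outputFile = ($_ -replace '100x100', 'sm') -replace '/folderA', '/folderB';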

Finally, the last part calls aws s3 cp to copy the file from its initial path and name (held in the $_ variable) to the destination path held in $outputFile:

aws s3 cp s3://sample-bucket/$_ s3://sample-bucket/$outputFile

If you need to move the files instead of copying them, just change cp to mv.
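Before running the pipeline for real, you can also append the --dryrun flag to the aws s3 cp (or mv) call so the CLI only prints the operations it would perform without actually touching any object:

aws s3 cp s3://sample-bucket/$_ s3://sample-bucket/$outputFile --dryrun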

THAT’S IT!

All in all, this combines quite a few different concepts, but it makes for a powerful and versatile command that can handle a lot of maintenance tasks on AWS S3 buckets directly from your Windows machine.

Credit: Gerard Vivancos for his article on how to do this under UNIX, which I used as a base to do this under Windows (http://gerardvivancos.com/2016/04/12/Single-and-Bulk-Renaming-of-Objects-in-Amazon-S3/).
