A Comprehensive Guide to Transferring Amazon S3 Bucket Data Across AWS Accounts

Suraj Subedi
readytowork, Inc.
Jul 1, 2024

Overview

  • Brief intro of AWS S3
  • Importance and use cases of data migration between accounts and regions
  • A step-by-step guide covering permissions, costs, and other things to consider.

Prerequisites

  • AWS accounts set up (one source and one destination account)
  • IAM roles and permissions (sufficient permissions if you are not using root users)

Introduction

Amazon S3 (Simple Storage Service) is a scalable object storage service that allows you to store and retrieve data from anywhere. It offers high durability, security, and flexible data management features, making it ideal for backup, archiving, big data analytics, and content distribution.

Importance:

  • Cost Optimization: Move data to regions with lower storage costs or transfer it to different accounts for better cost management.
  • Disaster Recovery: Ensure data redundancy and recovery options by storing copies in multiple regions.
  • Compliance and Data Sovereignty: Meet regulatory requirements by keeping data within specific geographic boundaries.
  • Performance Optimization: Reduce latency by placing data closer to users or applications.

Use Cases:

  • Mergers and Acquisitions: Consolidate data from different accounts.
  • Geographic Expansion: Replicate data across regions to support global operations.
  • Cross-Account Collaboration: Share data securely between different business units or partners.
  • Data Lifecycle Management: Transfer aging data to different accounts for archival purposes.

For more information, visit the AWS S3 documentation.

Step-by-Step Migration Guide

Step 1: In Your Source Account, Create a DataSync IAM Role for Destination Bucket Access

  • Create an IAM role with policies allowing DataSync tasks and S3 read access.

Go to: https://console.aws.amazon.com/iam/

  • Create a new role.
  • Select DataSync as the service; the required permissions will be attached automatically.
  • Continue, give your role a name, and create the role.
After creating the role, you can start using it (a CLI alternative is sketched below).
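If you prefer the CLI, here is a minimal sketch of the same setup. The role name source-datasync-role matches the one referenced in later steps; AmazonS3ReadOnlyAccess is just one broad AWS managed policy you could attach, and depending on your setup the role may also need the S3 actions from Step 2 allowed on the destination bucket in its own policy.

aws iam create-role \
--role-name source-datasync-role \
--assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": { "Service": "datasync.amazonaws.com" },
            "Action": "sts:AssumeRole"
        }
    ]
}'

aws iam attach-role-policy \
--role-name source-datasync-role \
--policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess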

Step 2: In Your Destination Account, Update Your S3 Bucket Policy

  • Update the destination bucket policy to allow the DataSync role from the source account; otherwise, DataSync will not be able to access the destination bucket. To achieve this, update the bucket policy to include the following:
{
    "Version": "2008-10-17",
    "Statement": [
        {
            "Sid": "DataSyncCreateS3LocationAndTaskAccess",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::source-account:role/source-datasync-role"
            },
            "Action": [
                "s3:GetBucketLocation",
                "s3:ListBucket",
                "s3:ListBucketMultipartUploads",
                "s3:AbortMultipartUpload",
                "s3:DeleteObject",
                "s3:GetObject",
                "s3:ListMultipartUploadParts",
                "s3:PutObject",
                "s3:GetObjectTagging",
                "s3:PutObjectTagging"
            ],
            "Resource": [
                "arn:aws:s3:::destination-bucket",
                "arn:aws:s3:::destination-bucket/*"
            ]
        }
    ]
}

Here, replace the following:

source-account -> source account ID, which you can find by hovering over the profile in the top-right corner

source-datasync-role -> the role name we created in Step 1

destination-bucket -> destination bucket name
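If you'd rather apply the policy from CloudShell in the destination account, you can save the JSON above to a file (policy.json here is just an illustrative name) and run:

aws s3api put-bucket-policy --bucket destination-bucket --policy file://policy.json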

Step 3: In Your Destination Account, Disable ACLs for Your S3 Bucket

Disable ACLs for your S3 bucket in the destination bucket's settings.

  1. On the bucket’s detail page, choose the Permissions tab.
  2. Under Object Ownership, choose Edit.
  3. If it isn’t already selected, choose the ACLs disabled (recommended) option.

Alternatively, you can run the following command in CloudShell to achieve this. Replace destination-bucket with your bucket name.

aws s3api put-bucket-ownership-controls --bucket destination-bucket --ownership-controls="Rules=[{ObjectOwnership=BucketOwnerEnforced}]"

Note: ACLs are disabled by default, and it is the recommended setting in most cases, so you may only need to check and confirm it.
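To double-check the current setting from CloudShell, you can run the following; ACLs are disabled when it reports ObjectOwnership as BucketOwnerEnforced:

aws s3api get-bucket-ownership-controls --bucket destination-bucket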

Step 4: In Your Source Account, Create Your DataSync Locations

  • Create DataSync locations for source and destination buckets.

Go to: https://console.aws.amazon.com/datasync/home?#/getStarted

Select Between AWS storage services in the dropdown.

Then, create a new source location and select it. We will then need to create a destination location from CloudShell and select it afterward.

Create a New Location with the location type as Amazon S3, then select the region and bucket to migrate.

To avoid network-related errors, it is recommended to create each location in the same region as its bucket.

Then, for the IAM role, select the one we created in Step 1.

Now, we have to create and select a destination location, which is not possible through the DataSync console interface because it is a cross-account location. Therefore, we create it from CloudShell and then select it in the console.

aws datasync create-location-s3 \
--s3-bucket-arn arn:aws:s3:::destination-bucket \
--region destination-bucket-region \
--s3-config '{
    "BucketAccessRoleArn": "arn:aws:iam::source-account-id:role/source-datasync-role"
}'

Again, replace the following with your own values:

destination-bucket -> destination bucket name

destination-bucket-region -> the AWS region where the destination bucket is located

source-account-id -> source account ID, which you can find by hovering over the profile in the top-right corner

source-datasync-role -> the role name we created in Step 1

On success, we will get a result similar to the one below, and our location will be created:

{
    "LocationArn": "arn:aws:datasync:us-east-2:123456789012:location/loc-abcdef01234567890"
}
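If you want to verify the new location from CloudShell before returning to the console, you can describe it using the ARN returned above (shown here with the example ARN):

aws datasync describe-location-s3 \
--location-arn arn:aws:datasync:us-east-2:123456789012:location/loc-abcdef01234567890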

Then continue in the console interface by selecting the newly created location.

Step 5: In Your Source Account, Create and Start Your DataSync Task

Give your task a name. In most cases the default configuration is fine, but you can tweak some settings as needed.

You can optionally schedule the task to run at a set interval if you store the same data in multiple places and want to keep it synced; for a one-time migration, leave it as it is.

I also recommend autogenerating the CloudWatch log group so that we can monitor the status of our task.

Review the config and then create a task.
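If you prefer the CLI, the same task can be created from CloudShell; the location ARNs below are placeholders for the source and destination locations from Step 4, and the task name is only an example:

aws datasync create-task \
--name s3-cross-account-migration \
--source-location-arn arn:aws:datasync:us-east-2:123456789012:location/loc-source \
--destination-location-arn arn:aws:datasync:us-east-2:123456789012:location/loc-destination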

In some cases, you may get a permission-denied or similar error. That can happen if your bucket data is encrypted. In such cases, we need to grant additional permissions to the IAM role used here.

To do that, go to the role we created, add an inline policy under its permissions, and select the S3 and KMS permissions that are needed to access and read the encrypted bucket objects.

If you don’t want to grant broad permissions, the following are enough:


{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::source-bucket-name",
                "arn:aws:s3:::source-bucket-name/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "kms:Decrypt",
                "kms:GenerateDataKey"
            ],
            "Resource": "arn:aws:kms:source-bucket-region:source-account-id:key/key-id"
        }
    ]
}

Here, replace source-bucket-name with your source bucket name and the KMS key ARN with the key that encrypts the bucket (the KMS actions must target the key, not the bucket).
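To attach this as an inline policy from the CLI instead of the console, save it to a file (kms-access.json is an illustrative name) and run:

aws iam put-role-policy \
--role-name source-datasync-role \
--policy-name datasync-s3-kms-access \
--policy-document file://kms-access.json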

Now, you should be able to create the task and start it to transfer your data.

The cost for this service as of now (July 2024) is $0.0125 per gigabyte (GB) in most regions. It varies by region and is subject to change, so for the latest prices, check the AWS DataSync pricing page.
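For example, at that rate a 500 GB transfer would cost roughly 500 × $0.0125 ≈ $6.25 in DataSync charges, not counting any cross-region data transfer fees that may also apply.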

Now, you will be able to start the task for the data transfer and monitor its logs through CloudWatch.

That’s all. Start your task and watch its live logs; after a successful transfer, it finishes with the status Success.

It can take some time depending on your data size and any network bandwidth limits you chose earlier.
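If you created the task in the console, starting and monitoring it there is enough; the rough CLI equivalents (with placeholder ARNs) look like this:

aws datasync start-task-execution \
--task-arn arn:aws:datasync:us-east-2:123456789012:task/task-abcdef01234567890

aws datasync describe-task-execution \
--task-execution-arn arn:aws:datasync:us-east-2:123456789012:task/task-abcdef01234567890/execution/exec-abcdef01234567890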

Congrats 🎉, now you can access all your S3 data from your destination account.
