GCP Data Transfer Options

Nilesh Khandalkar
Published in CodeX
4 min read · Aug 25, 2021

This article covers the data transfer options that Google Cloud offers for moving data efficiently. A transfer can run from on-prem systems to the cloud, or from cloud to cloud, i.e. GCP to GCP or any other cloud to GCP.

Below are the options available for Data Transfer:

Data Transfer options in GCP

Transfer Service (Cloud): This enables data transfer for a hybrid or multicloud strategy. It can move data from private data centers, AWS, Azure, or GCP itself, and it can transfer petabytes of data from other clouds over online networks (billions of files, at tens of Gbps). It can optimize your network bandwidth and accelerate transfers with scale-out performance.

Perform transfers as needed to enable a multicloud strategy, or move data periodically (on a schedule) as part of a data-processing pipeline or analytical workflow.

Transfer Service (On-Prem): It can transfer petabytes of data from on-premises sources to Cloud Storage over online networks. With simplicity and security built in, this service scales to the available bandwidth and can deliver seamless transfers in just minutes. Perform one-time transfers as needed, or set up scheduled recurring transfers for analytics, backup, or archival purposes.

Transfer Service lets you monitor and log the progress of transfer jobs, and Storage Transfer Service can be configured to deliver a Pub/Sub notification when a transfer completes. It maintains data integrity, making sure that no data is lost during the transfer, and it uses robust security and encryption mechanisms to move the data securely. You can schedule one-time or recurring transfer operations as required. It also supports incremental transfers, which minimize the amount of data that needs to be sent, use bandwidth effectively, and keep transfers fast.
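To make the scheduling and Pub/Sub notification options above more concrete, here is a minimal sketch of the request body that the Storage Transfer Service REST API (transferJobs.create) accepts for a bucket-to-bucket job. The field names follow my reading of the public API reference; the function name, project ID, bucket names, topic, and the 02:00 start time are hypothetical placeholders, and the sketch only builds the JSON body without sending it.

```python
import json
from datetime import date


def build_transfer_job(project_id, source_bucket, sink_bucket,
                       start, pubsub_topic, recurring=True):
    """Build a transferJobs.create request body as a plain dict.

    Hypothetical helper for illustration; nothing is sent to Google Cloud.
    """
    start_date = {"year": start.year, "month": start.month, "day": start.day}
    schedule = {
        "scheduleStartDate": start_date,
        # Assumed start time of 02:00 UTC, chosen only for this example.
        "startTimeOfDay": {"hours": 2, "minutes": 0},
    }
    if not recurring:
        # An end date equal to the start date limits the job to a single run.
        schedule["scheduleEndDate"] = start_date

    return {
        "description": f"Copy {source_bucket} to {sink_bucket}",
        "status": "ENABLED",
        "projectId": project_id,
        "schedule": schedule,
        "transferSpec": {
            "gcsDataSource": {"bucketName": source_bucket},
            "gcsDataSink": {"bucketName": sink_bucket},
            # Do not overwrite objects that already exist in the destination.
            "transferOptions": {"overwriteObjectsAlreadyExistingInSink": False},
        },
        # Ask the service to publish a message when an operation finishes.
        "notificationConfig": {
            "pubsubTopic": pubsub_topic,
            "eventTypes": [
                "TRANSFER_OPERATION_SUCCESS",
                "TRANSFER_OPERATION_FAILED",
            ],
            "payloadFormat": "JSON",
        },
    }


# All identifiers below are placeholders.
job = build_transfer_job(
    project_id="my-project",
    source_bucket="my-source-bucket",
    sink_bucket="my-sink-bucket",
    start=date(2021, 9, 1),
    pubsub_topic="projects/my-project/topics/transfer-events",
    recurring=False,
)
print(json.dumps(job, indent=2))
```

The same body could be submitted through the REST API or adapted to a client library; a source in AWS or Azure would use a different data source field and would also need credentials, which this sketch leaves out.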

BigQuery Data Transfer

BigQuery Data Transfer: You can load data directly into BigQuery with the BigQuery Data Transfer Service, which automates data movement into BigQuery on a scheduled, managed basis. After you configure a transfer, the service loads data into BigQuery at regular intervals. It supports data sources such as Cloud Storage, Google Ads, Campaign Manager, YouTube channel reports, and a few more. It can also copy data between two BigQuery datasets, either within the same region or across regions. We implemented this use case in our project to move data from the US region to the EU region, so that the data pipelines could be built in the EU region.

You can schedule the transfers and also get an email notification in case of any failures.
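As an illustration of the cross-region dataset copy described above, the sketch below builds a transfer configuration in the shape that the BigQuery Data Transfer Service REST API expects, with a schedule and failure emails enabled. The data source ID and parameter names reflect the documented dataset copy source as I understand it; the function name and the project and dataset names are hypothetical, and the code only constructs the configuration without calling the service.

```python
import json


def build_dataset_copy_config(source_project, source_dataset,
                              destination_dataset, schedule="every 24 hours"):
    """Build a transfer config dict for a BigQuery dataset copy.

    Hypothetical helper for illustration; it does not contact BigQuery.
    """
    return {
        "displayName": f"Copy {source_dataset} to {destination_dataset}",
        # Data source used for dataset copies, including cross-region ones.
        "dataSourceId": "cross_region_copy",
        # The destination dataset lives in the target region (EU here).
        "destinationDatasetId": destination_dataset,
        "params": {
            "source_project_id": source_project,
            "source_dataset_id": source_dataset,
        },
        "schedule": schedule,
        # Send an email when a transfer run fails.
        "emailPreferences": {"enableFailureEmail": True},
    }


# Placeholder names: a US dataset copied into an EU dataset once a day.
config = build_dataset_copy_config(
    source_project="my-project",
    source_dataset="sales_us",
    destination_dataset="sales_eu",
)
print(json.dumps(config, indent=2))
```

In this model the destination dataset is created beforehand in the EU region, and the copy then runs on the configured schedule.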

gsutil: The gsutil cp command lets you copy data between your local (or on-prem) file system and the cloud, within the cloud, and between cloud storage providers. For example, gsutil cp *.txt gs://my-bucket uploads all text files from the local directory to a bucket (my-bucket is a placeholder name). With the multithreaded option (-m) it copies many files in parallel, and with parallel composite uploads it breaks a large object into chunks, uploads the chunks in parallel, and then assembles them back into the original file in GCS. gsutil can be scheduled with cron (crontab) or any other available scheduler.
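The idea behind a parallel composite upload can be sketched in a few lines. This is a conceptual model only, not gsutil's implementation: the bucket is simulated with an in-memory dict, the chunk size is tiny for readability, and every function and object name is hypothetical.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, chunk_size):
    """Split a bytes object into consecutive chunks of at most chunk_size."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def parallel_composite_upload(data, bucket, name, chunk_size=4, workers=4):
    """Simulate a parallel composite upload into a dict acting as a bucket."""
    chunks = split_into_chunks(data, chunk_size)

    def upload_part(indexed_chunk):
        index, chunk = indexed_chunk
        part_name = f"{name}.part{index:04d}"
        bucket[part_name] = chunk  # stands in for one network upload
        return part_name

    # Upload the parts concurrently; map() returns results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        part_names = list(pool.map(upload_part, enumerate(chunks)))

    # "Compose" the parts, in order, into the final object.
    bucket[name] = b"".join(bucket[part] for part in part_names)

    # Remove the temporary part objects once the final object exists.
    for part in part_names:
        del bucket[part]

    # Return a checksum so the caller can verify the reassembled object.
    return hashlib.sha256(bucket[name]).hexdigest()


payload = b"hello transfer service"
fake_bucket = {}
digest = parallel_composite_upload(payload, fake_bucket, "file.txt", chunk_size=5)
print(digest == hashlib.sha256(payload).hexdigest())
```

Comparing the checksum of the composed object with that of the original data mirrors the integrity check a real upload tool performs after assembling the parts.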

Transfer Appliance: This is an offline, one-way transfer, for example from a data center to GCP. Because it involves significant cost, it is best suited to a one-time transfer of a huge amount of data (over 10 TB); ongoing data can then be transferred using gsutil. The service only moves data from on-prem to GCP, not to AWS or Azure.

It uses a high-capacity storage device onto which you copy your data. You then ship the device securely to a Google upload facility, where the data is uploaded to Cloud Storage.
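A rough way to judge whether an offline appliance is worth it is to estimate how long the same data would take over the network. The sketch below is plain arithmetic (size in bits divided by effective bandwidth). The function name and the utilization parameter are my own assumptions for illustration, and decimal units are used (1 TB = 10^12 bytes).

```python
def online_transfer_days(size_tb, bandwidth_mbps, utilization=1.0):
    """Estimate the days needed to send size_tb terabytes over the network.

    utilization is the assumed fraction of the link that the transfer can
    actually use (for example 0.8 for 80 percent).
    """
    if size_tb < 0 or bandwidth_mbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("size must be >= 0, bandwidth > 0, 0 < utilization <= 1")
    size_bits = size_tb * 1e12 * 8                   # terabytes to bits
    effective_bps = bandwidth_mbps * 1e6 * utilization
    seconds = size_bits / effective_bps
    return seconds / 86400                           # seconds in a day


# 10 TB over a fully used 100 Mbps link takes a little over 9 days.
print(round(online_transfer_days(10, 100), 2))
# The same 10 TB over a 1 Gbps link at 80 percent utilization.
print(round(online_transfer_days(10, 1000, utilization=0.8), 2))
```

If the estimate runs to weeks or months on the bandwidth you have, an offline transfer becomes more attractive despite its cost; if it is a matter of hours or days, an online option is usually simpler.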

Hope this helps!

Nilesh Khandalkar
Writer for CodeX

Passionate about Data and Cloud, working as Data Engineering Manager at Capgemini UK. GCP Professional Data Engineering Certified, Airflow Fundamentals Certified.