Data Migration to GCP : Redshift Migration to GCP BigQuery

Published in

Petabytz

4 min readJun 19, 2019

Proof of Concept
We started the process with a POC in which we considered existing running infra compatibilities with services offered by the Google Cloud Platform and also planned for elements in our future roadmap.

Key areas covered in POC:

· Redshift

· BigQuery

· Redshift to BigQuery

Redshift :

Welcome to the Amazon Redshift Cluster Management Guide. Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. You can start with just a few hundred gigabytes of data and scale to a petabyte or more. This enables you to use your data to acquire new insights for your business and customers.

BigQuery:

Storing and querying massive datasets can be time consuming and expensive without the right hardware and infrastructure. Google BigQuery is an enterprise data warehouse that solves this problem by enabling super-fast SQL queries using the processing power of Google’s infrastructure. Simply move your data into BigQuery and let us handle the hard work. You can control access to both the project and your data based on your business needs, such as giving others the ability to view or query your data.

You can access BigQuery by using the GCP Console or the classic web UI, by using a command-line tool, or by making calls to the BigQuery REST API using a variety of client libraries such as Java, .NET, or Python. There are also a variety of third-party tools that you can use to interact with BigQuery, such as visualizing the data or loading the data.

BigQuery is fully-managed. To get started, you don’t need to deploy any resources, such as disks and virtual machines. Get started now by running a web query or using the command-line tool.

What are the difficulties in Redshift?

· Requires staffing to manufacture, keep up and direct a bunch

· Extensive downtime required to increase capacity

· Capacity and figure assets are fixed to explicit server designs

· Security and review highlights must be empowered and oversaw

· Requires specific aptitudes to streamline execution

· Cost caused notwithstanding when bunch isn’t used

· Upkeep exercises effect group assets and execution

· High accessibility and DR are not inherent, requiring extra expense and assets

· No unmistakable way to empower client self administration

· No built-in support for advanced analytics and machine learning

· Moderate pace of innovation advancement

· Supports an old SQL standard (SQL-92)

How moving to BigQuery unravels these difficulties?

· Petabyte scale Data Warehouse with no establishment or set up

· Elimination of traditional ETL processes and tools

· Concentrate on logical use cases to drive an incentive without increment in framework cost

· High Availability and Disaster Recovery worked in

· Instant & elastic scaling of capacity as per demand+

· Pay-as-you-go — Pay only for the storage used and queries run

· Easy adoption of new tools, features and functions as they become available

· Support for real-time feeds and analytics

· No Limit on users and data size

· Support for a higher level of ANSI compliant SQL (SQL-2011)

· Query performance across the spectrum of workloads

· Direct Integration with most popular BI tools

How we can migrate from Redshift to BigQuery?

· Extract — Quick paced evaluation to profound plunge into the present state information stockroom, recognize the entrance examples and relocation needs and convey precise movement degree, course of events and cost

· Transfer — Automated SQL translation to GCP native technologies along with the required optimizations

· Load — A setup based information ingestion structure that supports ongoing and group mode with zero coding exertion

· Visualize — Computerized information approval for enormous measure of information at cell level utilizing machine learning and hashing calculations

How to get started?

Appraisal through Eagle utilizing SQL logs and Metadata separated from Redshift. AI to recognize the entrance designs, all framework exercises and relics expected to design the relocation. This commitment/action will likewise result in :

· Migration Strategy

· Migration Plan

· Efforts Estimation

· Solution Architecture

· Technical Architecture

Data Migration to GCP : Redshift Migration to GCP BigQuery

Written by Samadhan Kadam