An Agile Architecture for Analytics and AI on Google Cloud

Bet on low-code/no-code and serverless

An agile architecture is one that gives you:

  • Speed of development. You should be able to go from idea to deployment as quickly as possible.
  • Flexibility to quickly implement new features. Sometimes speed comes at the expense of flexibility: the architecture might shoehorn you into a very limited set of use cases. You don’t want that.
  • Low maintenance, so that you are not spending your time managing infrastructure.
  • Autoscaling and resiliency, so that you are not spending your time monitoring infrastructure.

What does such an architecture look like on Google Cloud when it comes to Data Analytics and AI? It will use low-code and no-code services (pre-built connectors, automatic replication, ELT, AutoML) so that you get speed of development. For flexibility, the architecture will allow you to drop down to developer-friendly, powerful code (Apache Beam, SQL, TensorFlow) whenever needed. These will run on serverless infrastructure (Pub/Sub, Dataflow, BigQuery, Vertex AI) so that you get low-maintenance, autoscaling, and resiliency.

No-code, low-code Analytics and AI Stack

When it comes to architecture, choose no-code over low-code, and low-code over writing custom code. Rather than writing ETL pipelines that transform the data before you land it in BigQuery, use pre-built connectors (in Data Fusion, Datastream, BigQuery Data Transfer Service, Dataflow templates, Fivetran, etc.) to land the raw data directly in BigQuery. Then transform the data into the form you need using SQL views directly in the data warehouse. You will be a lot more agile if you choose an ELT approach over an ETL approach.
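To make the ELT idea concrete, here is a minimal sketch of the "T" step expressed as a SQL view over a raw table that a connector has already landed. The project, dataset, table, and column names are all hypothetical, and the Python helper only assembles the SQL text; it does not talk to BigQuery.

```python
def build_view_sql(raw_table: str, view_name: str) -> str:
    """Return a CREATE VIEW statement that cleans a raw table inside the warehouse.

    The column names below are hypothetical; adapt them to the raw schema
    that your connector actually lands.
    """
    return (
        f"CREATE OR REPLACE VIEW `{view_name}` AS\n"
        "SELECT\n"
        "  order_id,\n"
        "  LOWER(TRIM(customer_email)) AS customer_email,\n"
        "  SAFE_CAST(amount AS NUMERIC) AS amount,\n"
        "  DATE(created_at) AS order_date\n"
        f"FROM `{raw_table}`\n"
        "WHERE order_id IS NOT NULL"
    )


# Hypothetical table and view names, for illustration only.
sql = build_view_sql("my_project.raw.orders", "my_project.analytics.orders_clean")
print(sql)
```

The generated statement can be pasted into the BigQuery console or submitted through a client library. Because the transformation lives in a view, changing it later means editing SQL rather than redeploying a pipeline.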

Another place to apply this principle is in choosing your ML modeling framework. Don’t start with custom TensorFlow models. Start with AutoML. That’s no-code. You can invoke AutoML directly from BigQuery, avoiding the need to build complex data and ML pipelines. If necessary, move on to pre-built models from TensorFlow Hub, Hugging Face, etc. and pre-built containers on Vertex AI. That’s low-code. Build your own custom ML models only as a last resort.
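As a rough illustration of invoking AutoML from BigQuery, the sketch below assembles a BigQuery ML `CREATE MODEL` statement with `model_type = 'AUTOML_CLASSIFIER'` and a matching `ML.PREDICT` query. The model, table, and label names are hypothetical, and the helper functions are assumptions made for this example; they only build SQL strings.

```python
def build_automl_sql(model_name: str, training_table: str, label_col: str,
                     budget_hours: float = 1.0) -> str:
    """Return a BigQuery ML statement that trains an AutoML classifier."""
    return (
        f"CREATE OR REPLACE MODEL `{model_name}`\n"
        "OPTIONS (\n"
        "  model_type = 'AUTOML_CLASSIFIER',\n"
        f"  input_label_cols = ['{label_col}'],\n"
        f"  budget_hours = {budget_hours}\n"
        ")\n"
        f"AS SELECT * FROM `{training_table}`"
    )


def build_predict_sql(model_name: str, scoring_table: str) -> str:
    """Return a query that scores a table with the trained model."""
    return (
        "SELECT *\n"
        f"FROM ML.PREDICT(MODEL `{model_name}`, TABLE `{scoring_table}`)"
    )


# Hypothetical names, for illustration only.
train_sql = build_automl_sql(
    "my_project.analytics.churn_model",
    "my_project.analytics.churn_training",
    label_col="churned",
)
predict_sql = build_predict_sql(
    "my_project.analytics.churn_model",
    "my_project.analytics.churn_to_score",
)
print(train_sql)
print(predict_sql)
```

Both training and prediction stay in SQL, so no separate feature-extraction or serving pipeline is needed for this first iteration.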

Use Managed Services

You will want to be able to drop down to code if the low-code approach is too restrictive. Fortunately, the no-code, low-code stack above is a subset of a fuller architecture that gives you all the flexibility you need.

When the use case warrants it, you will have the full flexibility of Apache Beam, SQL, and TensorFlow. This is critical: for use cases where the ELT+AutoML approach is too restrictive, you have the ability to drop to an ETL/Dataflow + Keras/Vertex approach.
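One case where dropping to code helps is input that needs cleaning before it is fit to load. The sketch below assumes hypothetical JSON events with `user_id` and `amount_cents` fields. It keeps the transform as a plain Python generator so it can be tested on its own, and wires it into an Apache Beam pipeline in a separate `run` function that requires the `apache-beam` package.

```python
import json
from typing import Dict, Iterable, Iterator


def parse_event(line: str) -> Iterator[Dict[str, object]]:
    """Yield one cleaned event per valid JSON line; skip malformed input."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return  # drop lines that are not valid JSON
    if not isinstance(record, dict) or "user_id" not in record:
        return  # drop records without the (hypothetical) required field
    yield {
        "user_id": str(record["user_id"]),
        "amount_usd": round(float(record.get("amount_cents", 0)) / 100, 2),
    }


def run(lines: Iterable[str]) -> None:
    """Wire parse_event into a Beam pipeline (needs apache-beam installed)."""
    import apache_beam as beam  # imported lazily so parse_event is usable alone

    with beam.Pipeline() as pipeline:
        (
            pipeline
            | "Read" >> beam.Create(list(lines))
            | "Parse" >> beam.FlatMap(parse_event)
            | "Print" >> beam.Map(print)
        )


# Exercise the transform directly, without Beam.
sample = ['{"user_id": 7, "amount_cents": 1250}', "not json", '{"amount_cents": 5}']
events = [event for line in sample for event in parse_event(line)]
print(events)
```

On Dataflow, the in-memory source and the print step would typically be swapped for connectors such as `beam.io.ReadFromPubSub` and `beam.io.WriteToBigQuery`; the transform itself would stay the same.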

Best of all, the architecture is unified, so you are not maintaining two stacks. Because the first architecture is a subset of the second, you can accomplish both easy and hard use cases in a unified way.

Enjoy!


A collection of technical articles and blogs published or curated by Google Cloud Developer Advocates. The views expressed are those of the authors and don't necessarily reflect those of Google.

Lak Lakshmanan

Operating Executive at a technology investment firm; articles are personal observations and not investment advice.
