An Agile Architecture for Analytics and AI on Google Cloud
Bet on low-code/no-code and serverless
An agile architecture is one that gives you:
- Speed of development. You should be able to go from idea to deployment as quickly as possible.
- Flexibility to quickly implement new features. Sometimes speed comes at the expense of flexibility — the architecture might shoehorn you into a very limited set of use cases. You don't want that.
- Low maintenance, so that you are not spending your time managing infrastructure.
- Autoscaling and resiliency, so that you are not spending your time monitoring infrastructure.
What does such an architecture look like on Google Cloud when it comes to Data Analytics and AI? It will use low-code and no-code services (pre-built connectors, automatic replication, ELT, AutoML) so that you get speed of development. For flexibility, the architecture will allow you to drop down to developer-friendly, powerful code (Apache Beam, SQL, TensorFlow) whenever needed. These will run on serverless infrastructure (Pub/Sub, Dataflow, BigQuery, Vertex AI) so that you get low-maintenance, autoscaling, and resiliency.
No-code, low-code Analytics and AI Stack
When it comes to architecture, choose no-code over low-code and low-code over writing custom code. Rather than writing ETL pipelines to transform the data before you land it in BigQuery, use pre-built connectors (in Data Fusion, Datastream, Data Transfer Service, Dataflow Templates, Fivetran, etc.) to land the raw data directly into BigQuery. Then, transform the data into the form you need using SQL views directly in the data warehouse. You will be a lot more agile if you choose an ELT approach over an ETL approach.
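As a minimal sketch of the ELT pattern, the transformation can live in a SQL view on top of raw data that a connector landed unchanged. The dataset, table, and column names below are hypothetical, purely for illustration:

```sql
-- Sketch: raw orders are landed as-is by a pre-built connector;
-- the transformation is just a view, not a pre-load ETL pipeline.
-- mydataset, raw_orders, and all column names are illustrative.
CREATE OR REPLACE VIEW mydataset.daily_orders AS
SELECT
  DATE(order_timestamp) AS order_date,   -- derive the reporting grain
  customer_id,
  SUM(order_amount)     AS total_amount  -- aggregate in the warehouse
FROM mydataset.raw_orders                -- landed unchanged from the source
GROUP BY order_date, customer_id;
```

Because the raw data is still in BigQuery, changing the transformation later is just redefining the view — no pipeline redeployment needed.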
Another place to apply this principle is when you choose your ML modeling framework. Don't start with custom TensorFlow models. Start with AutoML. That's no-code. You can invoke AutoML directly from BigQuery, avoiding the need to build complex data and ML pipelines. If necessary, move on to pre-built models from TensorFlow Hub, Hugging Face, etc. and pre-built containers on Vertex AI. That's low-code. Build your own custom ML models only as a last resort.
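For example, invoking AutoML from BigQuery is a single SQL statement via BigQuery ML. The dataset, table, and column names below are hypothetical; AUTOML_REGRESSOR is one of the BigQuery ML model types:

```sql
-- Sketch: train an AutoML model without leaving SQL.
-- mydataset, training_orders, and the columns are illustrative.
CREATE OR REPLACE MODEL mydataset.order_value_model
OPTIONS(model_type = 'AUTOML_REGRESSOR',
        input_label_cols = ['order_amount']) AS
SELECT
  customer_id,
  order_hour,
  num_items,
  order_amount   -- the label column
FROM mydataset.training_orders;

-- Batch prediction is also just SQL:
SELECT *
FROM ML.PREDICT(MODEL mydataset.order_value_model,
                TABLE mydataset.new_orders);
```

There is no training pipeline to build or serve here; the warehouse handles data movement and the AutoML service handles model selection.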
Use Managed Services
You will want to be able to drop down to code if the low-code approach is too restrictive. Fortunately, the no-code architecture above is a subset of this full architecture that gives you all the flexibility you need:
When the use case warrants it, you will have the full flexibility of Apache Beam, SQL, and TensorFlow. This is critical — for use cases where the ELT+AutoML approach is too restrictive, you have the ability to drop down to an ETL/Dataflow + Keras/Vertex approach.
Best of all, the architecture is unified, so you are not maintaining two stacks. Because the first architecture is a subset of the second, you can accomplish both easy and hard use cases in a unified way.