Seven Key Configs to Look at for a Secure AI Foundation

Don Santos
Google Cloud - Community
8 min read · Jul 26, 2023
photo credit: flutie8211, Pixabay

A Secure Foundation for your AI Platform

Generative AI is a hot, emerging technology that everyone is talking about right now. If you haven't seen the latest news, Google released its Secure AI Framework (SAIF), which covers security best practices for AI. You can read more about Google's SAIF here. In this blog, I'll be covering the first callout of the SAIF and digging deeper into building a strong security foundation for your AI ecosystem.

But first, the Google Secure AI Framework

Now, I won’t be going over the entirety of the SAIF, but in short, Google calls out these six points for collaboratively securing AI technology.

photo credit: Google
  1. Expand strong security foundations to the AI ecosystem
  2. Extend detection and response to bring AI into an organization’s threat universe
  3. Automate defenses to keep pace with existing and new threats
  4. Harmonize platform level controls to ensure consistent security across the organization
  5. Adapt controls to adjust mitigations and create faster feedback loops for AI deployment
  6. Contextualize AI system risks in surrounding business processes

So what’s covered in this writeup?

In this particular blog, we’ll talk more about the first point — “Expand strong security foundations to the AI ecosystem”. What does this mean exactly? Well, a part of it is securing the infrastructure and platform that your AI systems are running on. This touches on domains such as identity and access management (IAM) controls, data security, network security, and logging and monitoring. Google has services to support AI development, but I want to dive deeper into securing one service in particular — Vertex AI.

First, what is Vertex AI?

Google describes Vertex AI as “a machine learning (ML) platform that lets you train and deploy ML models and AI applications, and customize large language models (LLMs) for use in your AI-powered applications. Vertex AI combines data engineering, data science, and ML engineering workflows, enabling your teams to collaborate using a common toolset and scale your applications using the benefits of Google Cloud.” Vertex AI has some security built in thanks to Google Cloud’s default security settings such as default encryption at rest using Google managed encryption keys. However, just like any cloud technology, we can work towards adding additional configurations to ensure security for your Vertex AI platform.

So… how do we add additional security to Vertex AI?

Identity and Access Management

Let’s start with IAM controls. In Google Cloud, IAM is handled through IAM roles and permissions. Roles are broken down into three different types: predefined, basic, and custom. Google Cloud gives you predefined roles with permissions scoped to a persona. As of this writing, there are twelve predefined roles for Vertex AI. The three I think you’ll use most often are Vertex AI Administrator (roles/aiplatform.admin), Vertex AI User (roles/aiplatform.user), and Vertex AI Viewer (roles/aiplatform.viewer). You can find the up-to-date list of predefined roles here.

IAM is crucial to securing your environment. Why? Because it defines who has access and what they can do once they have it. Security best practice here centers on the principle of least privilege, or PLP: a user gets access only to the things they need for their job function or role. For example, a database administrator most likely needs full access to the organization’s databases, including the data. Someone from the security organization, on the other hand, may only need to see database configurations such as firewall settings; they shouldn’t need access to the data itself, since that’s not part of their job function. PLP separates job duties and functions to help ensure data doesn’t get into the wrong hands.
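To make PLP concrete, here’s a minimal sketch of binding the Vertex AI User role to a group at the project level using the Cloud Resource Manager API via google-api-python-client. The project ID and group email are placeholders, and it assumes application default credentials with permission to change the project’s IAM policy. Granting roles to a group rather than to individual users also keeps role management cleaner.

```python
# A minimal sketch: grant the predefined Vertex AI User role to a group at
# the project level with the Cloud Resource Manager API.
from googleapiclient import discovery

PROJECT_ID = "my-project"                      # assumed placeholder
GROUP = "group:ml-engineers@example.com"       # assumed placeholder

crm = discovery.build("cloudresourcemanager", "v1")

# Read the current IAM policy for the project.
policy = crm.projects().getIamPolicy(resource=PROJECT_ID, body={}).execute()

# Bind the Vertex AI User role (roles/aiplatform.user) to the group.
policy.setdefault("bindings", []).append(
    {"role": "roles/aiplatform.user", "members": [GROUP]}
)

# Write the updated policy back.
crm.projects().setIamPolicy(
    resource=PROJECT_ID, body={"policy": policy}
).execute()
```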

Data Security

The next area is data security. More specifically — encryption. By default, Google Cloud automatically encrypts data when it is at rest using encryption keys managed by Google. However, if there are specific compliance or regulatory requirements related to encryption keys that protect your data, you are able to use customer-managed encryption keys (CMEK).

Encryption is important because it helps ensure that even if data were to get into the hands of a malicious user, it couldn’t be read without the encryption keys. CMEK is most useful when you need full control over those keys rather than depending on Google-managed keys, typically for compliance or regulatory reasons. You can read more about which Vertex AI services are compatible with CMEK here.
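If you do need CMEK, here’s a minimal sketch using the Vertex AI Python SDK (google-cloud-aiplatform). It assumes the Cloud KMS key already exists and that the Vertex AI service agent has been granted the Encrypter/Decrypter role on it; the project, region, and key resource names are placeholders.

```python
# A minimal sketch: make a customer-managed encryption key the default for
# Vertex AI resources created through the SDK.
from google.cloud import aiplatform

CMEK_KEY = (
    "projects/my-project/locations/us-central1/"
    "keyRings/my-keyring/cryptoKeys/my-key"       # assumed placeholder
)

# Resources created through the SDK after this call (datasets, training jobs,
# models, endpoints, etc.) default to this key instead of Google-managed keys.
aiplatform.init(
    project="my-project",
    location="us-central1",
    encryption_spec_key_name=CMEK_KEY,
)
```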

Network Security

Now let’s take a trip into network security. One control you should look at is VPC Service Controls (VPC-SC). What is VPC-SC? It’s a Google Cloud feature that adds a layer of perimeter security around your services. VPC-SC consists of an access policy, access levels, and service perimeters. Access levels explicitly define what should be allowed in, be it a source IP range, a caller identity, or even caller device attributes (e.g., Chrome OS); the access policy is the container that holds them. You can read more about access level designs here. Service perimeters are essentially groups of projects and services that leverage those access levels. Think of it this way: you might have one service perimeter that only allows your home IP range, and another that only allows a VPN IP range. You can create different service perimeters and attach different services to them to spell out which services can be accessed from which source IP ranges. For more info on service perimeters, check this page out.

photo credit: Google

So, how can we leverage VPC-SCs for Vertex AI? Well, when we add the Vertex AI service to be included in a service perimeter, we’re able to control access and protect the following artifacts:

  1. Training data for an AutoML model or custom model
  2. Models that you created
  3. Models that you searched by using Neural Architecture Search
  4. Requests for online predictions
  5. Results from a batch prediction request

If you want to know more about how you can enable VPC-SCs for Vertex AI, head over to this page.
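As a rough illustration, here’s a sketch of adding Vertex AI to a service perimeter with the Access Context Manager API via google-api-python-client. The access policy ID, project number, and access level name are placeholders, and it assumes the access policy and access level already exist, so treat it as a starting point rather than a drop-in script.

```python
# A sketch: create a service perimeter that restricts the Vertex AI API
# (aiplatform.googleapis.com) using the Access Context Manager API.
from googleapiclient import discovery

POLICY = "accessPolicies/123456789"            # assumed placeholder
PERIMETER = f"{POLICY}/servicePerimeters/vertex_ai_perimeter"

acm = discovery.build("accesscontextmanager", "v1")

perimeter_body = {
    "name": PERIMETER,
    "title": "vertex_ai_perimeter",
    "status": {
        # Projects protected by the perimeter (project numbers, not IDs).
        "resources": ["projects/1111111111"],  # assumed placeholder
        # Only callers matching this access level (e.g., a corporate IP range)
        # can reach restricted services from outside the perimeter.
        "accessLevels": [f"{POLICY}/accessLevels/corp_network"],
        # Vertex AI is the service being protected here.
        "restrictedServices": ["aiplatform.googleapis.com"],
    },
}

acm.accessPolicies().servicePerimeters().create(
    parent=POLICY, body=perimeter_body
).execute()
```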

Another network feature you can leverage for a more secure environment is VPC Network Peering. VPC Network Peering connects two VPC networks so that resources can communicate with each other directly, which keeps packets inside Google’s production network. Keep in mind that regular network pricing applies to peered traffic.

So what are the benefits of using VPC Network Peering for Vertex AI? Traffic from custom training jobs, private prediction endpoints, vector matching online queries, and more travels over the peering connection and never goes out to the public internet. For more info on how to turn on VPC Network Peering for Vertex AI, check this page out.
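Once peering (private services access) is in place, here’s a minimal sketch of pointing a custom training job at the peered network with the Vertex AI Python SDK. The project number, network name, training script, and container image are placeholders.

```python
# A minimal sketch: run a Vertex AI custom training job over a peered VPC.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

job = aiplatform.CustomTrainingJob(
    display_name="private-training-job",
    script_path="train.py",                    # assumed placeholder
    # Assumed placeholder; use a prebuilt or custom training container image.
    container_uri="us-docker.pkg.dev/vertex-ai/training/tf-cpu.2-12:latest",
)

# The network is referenced by project *number*, not project ID. Traffic for
# the job stays on the peered network rather than the public internet.
job.run(
    replica_count=1,
    machine_type="n1-standard-4",
    network="projects/1111111111/global/networks/my-vpc",  # assumed placeholder
)
```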

photo credit: Google

Logging and Monitoring

Lastly, let’s touch on logging and monitoring. The first area I want to cover is audit logs. Cloud Audit Logs is a feature in Google Cloud where services write logs that record administrative activity and access to Google Cloud resources. This is important because it can tell you who did what, where, and when. For Vertex AI, Admin Activity audit logs and Data Access (ADMIN_READ, DATA_READ, and DATA_WRITE) audit logs are created. Note that Admin Activity logs are always on, while Data Access logs are disabled by default and need to be enabled explicitly, as shown in the sketch below. For a current list of what is covered by audit logs, you can find more information here.
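Here’s a minimal sketch of enabling those Data Access audit logs for the Vertex AI API at the project level by editing auditConfigs in the project’s IAM policy (google-api-python-client). The project ID is a placeholder, and the snippet assumes no audit config already exists for aiplatform.googleapis.com.

```python
# A minimal sketch: turn on Data Access audit logs for the Vertex AI API by
# adding an auditConfig entry to the project's IAM policy.
from googleapiclient import discovery

PROJECT_ID = "my-project"                      # assumed placeholder

crm = discovery.build("cloudresourcemanager", "v1")
policy = crm.projects().getIamPolicy(resource=PROJECT_ID, body={}).execute()

# Record ADMIN_READ, DATA_READ, and DATA_WRITE operations against Vertex AI.
policy.setdefault("auditConfigs", []).append(
    {
        "service": "aiplatform.googleapis.com",
        "auditLogConfigs": [
            {"logType": "ADMIN_READ"},
            {"logType": "DATA_READ"},
            {"logType": "DATA_WRITE"},
        ],
    }
)

crm.projects().setIamPolicy(
    resource=PROJECT_ID,
    # auditConfigs are only written when named in the update mask.
    body={"policy": policy, "updateMask": "bindings,etag,auditConfigs"},
).execute()
```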

Another logging feature in Google Cloud is called Access Transparency. Access Transparency provides you with logs that capture actions taken by Google personnel when they access your data. Where Cloud Audit Logs record actions taken by members of your own organization, Access Transparency records actions taken by Google. As of the writing of this post, Access Transparency supports the following Vertex AI services:

  1. Vertex AI AutoML training
  2. Vertex AI custom training
  3. Vertex AI data labeling
  4. Vertex AI Feature Store
  5. Vertex AI Model Monitoring
  6. Vertex AI Pipelines
  7. Vertex AI prediction
  8. Vertex AI TensorBoard
  9. Vertex Explainable AI
  10. Vertex ML Metadata

For the latest list of supported services and limitations for Access Transparency for Vertex AI, go here.
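Once Access Transparency is enabled for your organization (it’s turned on at the org level, not per project), here’s a minimal sketch of pulling recent Access Transparency entries with the Cloud Logging client (google-cloud-logging). The project ID is a placeholder, and the filter assumes the standard Access Transparency audit log ID.

```python
# A minimal sketch: read Access Transparency log entries for a project.
from google.cloud import logging as cloud_logging

PROJECT_ID = "my-project"                      # assumed placeholder
client = cloud_logging.Client(project=PROJECT_ID)

# Access Transparency entries land in the access_transparency audit log.
log_filter = (
    f'logName="projects/{PROJECT_ID}/logs/'
    'cloudaudit.googleapis.com%2Faccess_transparency"'
)

for entry in client.list_entries(filter_=log_filter, page_size=10):
    # Each entry describes a Google personnel access: what was accessed and why.
    print(entry.timestamp, entry.payload)
```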

Finally, monitoring. Vertex AI exports metrics to Cloud Monitoring. Cloud Monitoring is pretty useful because it lets you build dashboards and configure alerts based on the metrics that are sent. One specific example is getting alerted on high prediction latency or high CPU usage in Vertex AI. Why would someone want an alert like that, you might ask? Great question! If high CPU usage is expected for your workload, it may not matter. But if it’s out of the ordinary, you’ll want to know what’s causing it. Maybe a user with malicious intent got hold of the environment and is now running crypto-mining code… that bill could get ugly, fast. To get a list of the latest supported Vertex AI metrics for Cloud Monitoring, go to this page. For more information on how you can enable and configure features for Cloud Monitoring, you can go here.
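As a rough example, here’s a sketch of creating an alert policy on Vertex AI online prediction latency with the Cloud Monitoring client (google-cloud-monitoring). The metric type, resource type, unit, and thresholds are assumptions to verify against the current Vertex AI metrics list, and in real use you’d also attach notification channels so someone actually gets paged.

```python
# A sketch: alert when p99 Vertex AI online prediction latency stays high.
from google.cloud import monitoring_v3
from google.protobuf import duration_pb2

PROJECT_ID = "my-project"                      # assumed placeholder
client = monitoring_v3.AlertPolicyServiceClient()

policy = monitoring_v3.AlertPolicy(
    {
        "display_name": "Vertex AI - high online prediction latency",
        "combiner": "OR",
        "conditions": [
            {
                "display_name": "p99 prediction latency above 1s for 5 minutes",
                "condition_threshold": {
                    # Assumed Vertex AI metric and monitored resource type.
                    "filter": (
                        'metric.type="aiplatform.googleapis.com/prediction/'
                        'online/prediction_latencies" '
                        'AND resource.type="aiplatform.googleapis.com/Endpoint"'
                    ),
                    "comparison": "COMPARISON_GT",
                    "threshold_value": 1000,   # assumed unit: milliseconds
                    "duration": duration_pb2.Duration(seconds=300),
                    "aggregations": [
                        {
                            "alignment_period": duration_pb2.Duration(seconds=300),
                            # Latency is a distribution metric, so align on a percentile.
                            "per_series_aligner": "ALIGN_PERCENTILE_99",
                        }
                    ],
                },
            }
        ],
    }
)

# Attach notification_channels to the policy before creating it in real use.
client.create_alert_policy(name=f"projects/{PROJECT_ID}", alert_policy=policy)
```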

Conclusion

Vertex AI is one of the main services you can use for Generative AI solutions on Google Cloud. And just like with any cloud building block, the first step to building a secure AI environment is to secure your platform.

Covering the security domains I’ve called out is essential to securing your Vertex AI landscape in Google Cloud. To summarize, these seven configurations can help you establish strong security foundations:

Identity & Access Management

  • Leveraging the correct IAM Roles & Groups

Data Security

  • Encrypting Data (Google Managed or if needed, Customer Managed)

Network Security

  • Turning on VPC Service Controls
  • Configuring VPC Network Peering

Logging and Monitoring

  • Enabling Cloud Audit Logs
  • Enabling Access Transparency
  • Enabling Cloud Monitoring

Interested in learning more about Generative AI? Check out Google’s free Generative AI learning path.

Want to learn more about Google Cloud AI or Security? You can speak 1:1 with a Google Cloud expert, including the author of this post, with the Innovators Plus subscription. Save 80% on a bundle of Google Cloud developer benefits, such as cloud credits, an hour of consulting, access to over 700 hands-on labs, skill badges, courses, and more!

About the Author

Don is a Security Innovation Principal at Accenture focused on Application Security, Cloud Security, and DevSecOps. He leads Google Cloud Security Engineering and Offering Development for North America. He’s taken on roles with Fortune 100 firms to run cloud and application security assessments, while implementing security testing and controls and building secure cloud foundations. Interested in deploying this to your environment? Let’s connect!

Disclaimer

My postings reflect my own views and do not necessarily represent the views of my employer, Accenture, or Google Cloud.

The information in this blog post is general in nature and does not take into account the specific needs of your IT ecosystem and network, which may vary and require unique action. You should independently assess your specific needs in deciding to use any of the tools mentioned. The tools called out in this blog are not Accenture tools. Accenture makes no representation that it has vetted or otherwise endorses these tools, and Accenture disclaims any liability for their use, effectiveness, or any disruption or loss arising from their use.
