By The Cloud Foundations Team at King
King’s journey to Google Cloud (GCP) began last year with the aim of taking advantage of the advanced tools and management services it offers. The migration of our infrastructure to GCP has enabled a huge increase in scaling ability as well as improvements in the development cycle for our products.
Securing our platform has been a top priority since we decided to move to Google Cloud. In this post, we will explain our thoughts on some of the proposed solutions for credentials rotation, describe our requirements and introduce the solution that was implemented.
“Our infrastructure needs to support hundreds of thousands of concurrent connections per second, as well as our data warehouse, and we saw that Google has the capacity to handle our needs. At the same time, we were very excited by its focus on machine learning and artificial intelligence.”
- Jacques Erasmus, CIO, King.
What Are Service Accounts?
Service accounts are unique identities used for programmatic access to GCP APIs. Each service account can have one or more keys, which are used to authenticate with Google Cloud. Security best practice is to rotate keys periodically: generate a new key, distribute it to its users and remove the old one. Because GCP APIs are publicly accessible, a leaked key can be used by a malicious third party, and rotation limits how long such a key remains useful.
For our particular needs at King, the focal point was securing access to the Google Cloud API endpoints using service account keys. Even though key rotation is a simple concept, implementing it in a way that stays flexible and convenient for users while guaranteeing the security of the system was not a straightforward task. How to overcome the initial implementation issues is company specific. Our solution was a custom key rotation and distribution mechanism that provided our organisation with the required functionality.
Vault & its Secrets Engines
Vault is an open-source store for secrets such as tokens, passwords, certificates and API keys. It offers a wide range of secrets engines which store, generate and/or encrypt data. These range from engines that simply save and retrieve data to dynamic engines that interface with different cloud providers’ API endpoints. The Google Cloud secrets engine for Vault can provision Google Cloud service account keys and OAuth tokens based on Vault policies. This gives users access to Google Cloud resources without needing to create or manage a dedicated service account.
“Secure, store and tightly control access to tokens, passwords, certificates, encryption keys for protecting secrets and other sensitive data using a UI, CLI, or HTTP API.”
Vault was chosen as the store for service account keys for its security, its client authentication and its architecture, which allows highly available deployment across multiple geographically isolated locations and so eliminates single points of failure.
On Google Cloud specifically, a solution based on Vault and a custom key rotation service lets our users manage their service account properties individually, while enforcing a maximum lifetime for their keys and providing a consistent way of retrieving up-to-date keys.
Key Rotation: The Google Cloud Way
Key rotation is a basic security practice, and Google’s documentation suggests the following method:
“A security best practice is to rotate your service account keys regularly. You can rotate a key by creating a new key, switching applications to use the new key and then deleting old key. Use the serviceAccount.keys.create() method and serviceAccount.keys.delete() method together to automate the rotation.”
While feasible, this method didn’t seem ideal to us for two reasons. First, key rotation cannot be centrally enforced, because each client application is responsible for rotating its own keys. Second, it carries a privilege escalation risk: a service account that performs key rotation currently cannot be prevented from rotating another service account’s keys.
In other words, a service account performing rotation could create a new key for a different service account in the project and so assume an identity with different permissions.
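The escalation path can be illustrated with a toy model. This is only a sketch of the behaviour described in this post, assuming the key permissions are granted at project level. The account names, the grants table and the `can_create_key_for` helper are all hypothetical, and the check does not reproduce Google Cloud's actual IAM evaluation.

```python
# Toy model (hypothetical): a permission granted at project level applies to
# every service account in the project, not only to the caller's own account.
PROJECT_GRANTS = {
    "rotator@example-project": {
        "iam.serviceAccountKeys.create",
        "iam.serviceAccountKeys.delete",
    },
    "deployer@example-project": {
        "compute.instances.create",
        "storage.buckets.delete",
    },
}


def can_create_key_for(caller: str, target: str) -> bool:
    """Return True if `caller` may create a key for `target`.

    `target` is never consulted, which is exactly the problem being shown.
    """
    return "iam.serviceAccountKeys.create" in PROJECT_GRANTS.get(caller, set())


# The rotator can rotate its own key, as intended...
own = can_create_key_for("rotator@example-project", "rotator@example-project")
# ...but it can equally mint a key for an account with different permissions.
escalated = can_create_key_for("rotator@example-project", "deployer@example-project")
```

Both calls succeed in this model, so whoever controls the rotating account can obtain credentials for any other service account in the project.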
Key Rotation: The Vault Way
Vault, in conjunction with its secrets engines, supports dynamic secret provisioning whereby credentials are created and passed over to the requesting party. The Vault server in this scenario authenticates against GCP and makes API requests to the serviceAccount and serviceAccountKey API endpoints.
When using the Google Cloud secrets engine, there are some important considerations to note. Vault will, by default, only create up to 10 keys for each newly created service account (a Google Cloud restriction). Furthermore, a Vault policy mapping that defines service account roles needs to be specified. This mapping is ideally generated dynamically by referencing the organisation’s own IAM database. That can be an issue when no mapping source is present, when complex cross-project service account permissions are required, or when the permissions in Google Cloud need to differ from those in the corporate IAM database.
The key rotation service consists of enrolment and operational phases. Projects that wish to use the key rotation service can enrol by setting a GCP project label. A project is enrolled into the rotation service when the Vault access token with the corresponding Vault policy is generated and sent to the project owner. In the operational phase, the owner of the project can then use the Vault token to retrieve the GCP service account key generated daily.
Key Rotation Enrolment Process
With Vault serving as the store of Google Cloud service account keys, we faced the challenge of authorising user access to those credentials. Before a client can retrieve a secret value from Vault, it needs to authenticate with it. In our setup, a dedicated project owner is identified from the project’s IAM roles. This responsible user is provided with a Vault authentication token, which is then used to access the most recent key for the project.
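As a rough illustration of what reading a key with such a token can look like, the sketch below builds a request against Vault's HTTP API, which expects the client token in the `X-Vault-Token` header. The Vault address, the token value and the `secret/gcp/<project>/<service-account>` path layout are assumptions made for the example, as is the KV version 1 response shape. None of them is taken from our actual setup.

```python
import json
import urllib.request


def build_read_request(vault_addr: str, token: str, secret_path: str) -> urllib.request.Request:
    """Build a GET request that reads one secret through Vault's HTTP API."""
    url = f"{vault_addr.rstrip('/')}/v1/{secret_path.lstrip('/')}"
    return urllib.request.Request(url, headers={"X-Vault-Token": token}, method="GET")


def extract_secret(response_body: bytes) -> dict:
    """Return the stored fields, assuming a KV version 1 style response in
    which they sit directly under the top-level "data" key."""
    return json.loads(response_body)["data"]


# Hypothetical address, token and path, for illustration only.
request = build_read_request(
    "https://vault.example.internal:8200",
    "s.example-token",
    "secret/gcp/example-project/app-sa",
)
# Sending it would look like: urllib.request.urlopen(request).read()
```

Keeping request construction separate from sending makes the authentication step easy to inspect without a running Vault server.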
We eventually implemented this by compiling a list of all projects in the organisation that need to be enrolled (enrolment into the key rotation service is indicated by setting a custom project label), together with the corresponding project owners. Once information about a project and its owner has been collected, a new Vault access token and a corresponding Vault policy are created and provided to the project owner. From that moment on, the project owner holds an initial set of credentials granting access to the path in Vault that contains the project’s service account keys.
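The enrolment step can be sketched as a filter over projects followed by rendering one policy per project. The label name `key-rotation`, its `enabled` value, the `Project` type and the `secret/gcp/<project>/*` path are hypothetical choices for this example. Only the policy syntax itself (a `path` block with `capabilities`) follows Vault's policy format.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    project_id: str
    owner: str
    labels: dict = field(default_factory=dict)


# Hypothetical label; the post only states that a custom project label is used.
ENROL_LABEL = "key-rotation"


def enrolled_projects(projects):
    """Select the projects that opted in by setting the enrolment label."""
    return [p for p in projects if p.labels.get(ENROL_LABEL) == "enabled"]


def policy_for(project: Project) -> str:
    """Render a read-only Vault policy scoped to the project's own path."""
    return (
        f'path "secret/gcp/{project.project_id}/*" {{\n'
        '  capabilities = ["read"]\n'
        '}\n'
    )


projects = [
    Project("proj-a", "alice@example.com", {ENROL_LABEL: "enabled"}),
    Project("proj-b", "bob@example.com"),
]
enrolled = enrolled_projects(projects)
policies = {p.project_id: policy_for(p) for p in enrolled}
```

Scoping each policy to a single project path means a token issued to one project owner cannot read another project's keys.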
Key Rotation Operational Process
After a project is enrolled into the key rotation service, a user can retrieve service account keys by authenticating with Vault and reading the project’s path. In this mode, the key rotation service is responsible for creating new service account keys and storing them in Vault for clients to consume. In line with our corporate policy, a list of service accounts with keys older than 24 hours is compiled, and new keys are created for them and stored in Vault. During this phase, all enrolled projects are also scanned and any keys older than 24 hours are removed.
The Cleaner service is responsible for maintaining the consistency of secrets data stored in Vault and the state of Google Cloud IAM. Cleanup might be required when a project is deleted (the Vault user token path is removed), when a project is unenrolled from the key rotation service (the project label is removed), or when a service account, or all keys for a service account, are removed (the service account key and Vault path are removed).
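One rotation pass can be sketched with plain in-memory structures standing in for Google Cloud IAM and Vault. The `Key` type, the `rotate` function and the `new_key_id` callback are hypothetical names for this illustration. A real service would call the IAM and Vault APIs where this sketch updates a list and a dict.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

MAX_KEY_AGE = timedelta(hours=24)


@dataclass
class Key:
    service_account: str
    key_id: str
    created_at: datetime


def rotate(keys, vault, now, new_key_id):
    """Run one rotation pass and return the keys that remain afterwards.

    `keys` stands in for the keys listed from IAM, `vault` for the secrets
    store (service account -> current key id).
    """
    expired = [k for k in keys if now - k.created_at > MAX_KEY_AGE]
    remaining = [k for k in keys if now - k.created_at <= MAX_KEY_AGE]

    # Create a replacement for every service account holding an expired key
    # and publish it to the store.
    for account in sorted({k.service_account for k in expired}):
        fresh = Key(account, new_key_id(account), now)
        remaining.append(fresh)
        vault[account] = fresh.key_id

    # Expired keys are removed simply by not being carried over.
    return remaining


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
keys = [
    Key("sa-a", "old", now - timedelta(hours=25)),
    Key("sa-b", "fresh", now - timedelta(hours=1)),
]
vault = {"sa-a": "old", "sa-b": "fresh"}
result = rotate(keys, vault, now, lambda account: f"{account}-new")
```

In this run only `sa-a` receives a new key, because its existing key is past the 24 hour limit, while `sa-b` is left untouched.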
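The consistency check at the heart of such a cleaner can be sketched as a comparison between the paths present in Vault and the service accounts that still exist in enrolled projects. The `<project>/<service-account>` path format and the `stale_vault_paths` helper are assumptions for this example, not a description of the actual service.

```python
def stale_vault_paths(vault_paths, live_accounts):
    """Return the Vault paths that no longer match a live service account.

    `vault_paths` holds strings of the form "<project>/<service-account>";
    `live_accounts` maps each enrolled project to its existing accounts.
    """
    stale = []
    for path in sorted(vault_paths):
        project, _, account = path.partition("/")
        if account not in live_accounts.get(project, set()):
            stale.append(path)
    return stale


vault_paths = {"proj-a/app", "proj-a/batch", "proj-b/app"}
# proj-b was deleted or unenrolled, and the "batch" account was removed.
live_accounts = {"proj-a": {"app"}}
to_remove = stale_vault_paths(vault_paths, live_accounts)
```

A project that is missing from `live_accounts` is treated the same way whether it was deleted or unenrolled, since in both cases its paths should be cleaned up.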
Combining all the elements of the key rotation process described above allows us to centrally manage and enforce the TTL of GCP service account keys and to provide users with a convenient way of retrieving up-to-date keys.
As already mentioned, while key rotation is a simple concept, its practical implementation can be complex and require custom logic. Various ways of rotating GCP service account keys exist and implementation should consider aspects like user experience and flexibility, the ability to centrally manage and enforce key rotation/expiration, and the cost of maintaining the chosen solution. Each individual solution to the key rotation process will come with its own set of pros and cons, and as always, the best solution for an organization will depend on its requirements.
Going beyond service account keys, when designing a secure platform on Google Cloud, it’s worth mentioning other ways of achieving high levels of security while maintaining user convenience. If an application using a service account is hosted on GCP, the service account can be assigned directly to resources (Compute Engine, GKE, etc.), which avoids having to manage service account keys at all. This is an appealing option, as no rotation or other maintenance of the service account keys is required.
Last but not least, Google Cloud regularly introduces new and exciting improvements to existing services, which is worth keeping an eye on for improving your organisation’s security posture. Here at King, we try to help Google Cloud improve its portfolio by giving regular feedback on its services.
Related to the contents of this article and security in general, we would like to see more granular permissions management implemented across more services. This applies particularly to service account key creation and deletion, where a service account that has been granted the iam.serviceAccountKeys.create and iam.serviceAccountKeys.delete permissions can create and delete keys for other, potentially higher-privileged service accounts in a project.
With GCP, we continue to develop and expand our tech infrastructure while gaining new flexibility in scaling and development cycles in a secure environment.