Google APIs from Windows DNS Server
Background
Google Cloud customers often need to route particular Google-bound traffic, such as requests to storage.googleapis.com, from their on-premises network through their private connectivity to Google Cloud Platform (GCP), whether VPN or Interconnect, rather than over the Internet, for efficiency, security, or compliance reasons. This is necessary, for example, when a VPC Service Controls (VPC SC) perimeter surrounds Cloud Storage buckets: requests from outside the perimeter are blocked, and only those from within it are allowed. If the on-premises network is part of the VPC SC perimeter, client devices there must send their requests through the private connectivity to GCP to access the bucket.
In certain situations, redirecting all *.googleapis.com traffic from the on-premises network to GCP may not be feasible, and a more granular approach is required. This is particularly relevant when only a specific subset of clients should be directed to GCP projects within the company, while other clients accessing general Google APIs such as Search, Maps, or even Cloud Storage should retain access to public internet endpoints, with all clients sharing a single DNS environment. This article outlines how to configure GCP and Windows DNS Server to achieve this selective redirection.
Solution
The solution presented handles a more complex scenario in which different on-premises clients are linked to different GCP environments, production and development, alongside regular clients that access Google APIs through public endpoints. The solution assumes that clients, in particular those connecting to GCP, can be identified by their IP subnets.
Because the DNS server must consider both the DNS records and client-specific fields when responding to queries, it has to tailor DNS results per client. This can be achieved using DNS Response Policy Zones (RPZ) or, in our case, the DNS Policies that Windows Server 2016 introduced to provide similar capabilities. DNS Policies also enable scenarios such as Geo-location Traffic Management (ensuring clients receive the IP address of the geographically closest resource) and Split Brain DNS (tailoring responses based on the client’s network location).
GCP-only clients
To ensure that all traffic to Google APIs from the on-premises network is forwarded to GCP, one solution is for the on-premises network to become authoritative for the googleapis.com zone. With Windows DNS Policies, you can create your own records for your authoritative zones and implement the capabilities mentioned earlier through new objects:
- Zone Scope: a zone comprises one default zone scope and can have several additional zone scopes. Each scope may contain the same set of DNS record names, but with different IP addresses assigned in each scope.
- Query Resolution Policy: a policy applied to zone scopes that establishes procedures for resolving DNS queries using predefined criteria. These criteria could include the client’s subnet or the type of record being queried.
- Client Subnet: an object representing subnets corresponding to clients. You can utilize client subnet objects in query resolution policies to match incoming queries to specific clients.
To access Google APIs supported by VPC Service Controls, you must set up either Private Google Access (PGA) for on-premises or Private Service Connect (PSC). With PGA, you redirect traffic to the special domain names private.googleapis.com or restricted.googleapis.com. The restricted range, 199.36.153.4/30, allows access to APIs and services that are supported by VPC Service Controls and blocks access to those services that are not supported. We will use this range in our configuration.
Within a zone scope of the googleapis.com zone, the on-premises DNS server should hold a CNAME record from *.googleapis.com to restricted.googleapis.com, plus an A record for restricted.googleapis.com pointing to the IP addresses in the restricted range. Further steps, such as configuring routing from the on-premises network to the restricted range in GCP, are also necessary. Refer to the official documentation for more details.
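As a minimal sketch of this GCP-only setup, the commands below host googleapis.com locally and place the restricted records in a zone scope. The zone file name, the scope name `GcpScope`, the client subnet name `GcpClients` with its range 10.100.0.0/24, and the policy name `GcpClientsPolicy` are illustrative assumptions, not values from a real deployment.

```powershell
# Assumption: this server hosts googleapis.com itself as a file-backed primary zone.
Add-DnsServerPrimaryZone -Name "googleapis.com" -ZoneFile "googleapis.com.dns"

# Hypothetical zone scope holding the records served to GCP clients.
Add-DnsServerZoneScope -ZoneName "googleapis.com" -Name "GcpScope"

# A records for restricted.googleapis.com, one per address in 199.36.153.4/30.
foreach ($ip in "199.36.153.4", "199.36.153.5", "199.36.153.6", "199.36.153.7") {
    Add-DnsServerResourceRecord -ZoneName "googleapis.com" -ZoneScope "GcpScope" `
        -A -Name "restricted" -IPv4Address $ip
}

# Wildcard CNAME sending every other *.googleapis.com name to the restricted name.
Add-DnsServerResourceRecord -ZoneName "googleapis.com" -ZoneScope "GcpScope" `
    -CName -Name "*" -HostNameAlias "restricted.googleapis.com."

# Hypothetical client subnet and zone-level policy that select the scope.
Add-DnsServerClientSubnet -Name "GcpClients" -IPv4Subnet "10.100.0.0/24"
Add-DnsServerQueryResolutionPolicy -Name "GcpClientsPolicy" -Action ALLOW `
    -ClientSubnet "EQ,GcpClients" -ZoneScope "GcpScope,1" -ZoneName "googleapis.com"
```

This sketch is separate from the final configuration later in the article and is not meant to be combined with it.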
In our final approach, we will use these DNS Policy objects. However, this configuration, in its current form, will not work for regular clients that do not need to be forwarded to GCP. Any client outside the subnets that require redirection won’t match any resolution policy and will follow the usual resolution procedure. As a result, it will receive records from the default scope of the on-premises authoritative googleapis.com zone. But a company cannot, of course, be authoritative for Google APIs and their public endpoints.
GCP and non-GCP clients
In situations where both GCP and non-GCP (regular) clients reside within the on-premises network, it is not feasible for the on-premises infrastructure to serve as the authoritative source for the googleapis.com zone. As a result, queries made by these clients need to be forwarded to external resolvers. However, a public resolver will only return the public endpoints of Google APIs.
Fortunately, Windows DNS Policies provide an object that we can use to resolve this issue:
- Recursion Scope: similar to zone scopes, DNS recursion has a default scope and can have additional scopes in which recursion behaves differently, for example by using a distinct set of forwarders.
By configuring query resolution policies to selectively choose recursion scopes based on criteria such as the client subnet, it becomes possible to resolve the same domains to different IP addresses. This approach provides a flexible solution for managing DNS resolution in complex network environments.
High-level diagram
The DNS process we’ll implement is outlined in the high-level diagram below. Three types of on-premises clients are used:
- Non-GCP or regular clients, which should continue to access Google APIs and the Internet as usual. They should not be affected by any modifications made to the DNS server.
- GCP clients in a production environment, where Google APIs resolution should be forwarded to Cloud DNS in a production VPC typically through an Interconnect connection.
- GCP clients in a development environment, where Google APIs resolution should be forwarded to Cloud DNS in a development VPC, usually through an Interconnect or VPN connection.
Each client sends a DNS query for a given domain, such as storage.googleapis.com. Depending on the type of client, the response to the query may vary because different clients use different resolvers.
Cloud DNS will act as the resolver for GCP clients. We will configure private zones with a CNAME from *.googleapis.com to restricted.googleapis.com and A records from the 199.36.153.4/30 range. To accommodate both the production and development environments, the restricted range will be split in two, with each environment receiving two IPs. This setup allows traffic from the on-premises network to be routed to the appropriate physical connectivity. Additionally, inbound server policies will be created in the production and development VPC networks, and the inbound forwarder IPs will be identified. Refer to the official Google Cloud documentation for instructions on configuring the GCP side.
As shown in the picture, queries are matched based on the client subnet and the requested domain. If we didn’t include the domain of the request as a criterion, all GCP client queries for non-authoritative zones would be forwarded to Cloud DNS. While this setup could work, customers typically prefer to send only GCP-related DNS traffic to GCP.
Configuration of Windows DNS Server
The following is the configuration of the on-premises DNS server. It uses the DNS Policy objects described previously. To create these objects, run PowerShell as an administrator on the server.
- Create ClientSubnets objects on the DNS server for production and development clients, to distinguish their DNS queries from each other and from those of all other clients. Let’s assume the range 10.100.0.0/24 is used for production and 10.200.0.0/24 for development. Multiple IP ranges can be specified if necessary. Notably, ClientSubnets objects do not replicate automatically across DNS servers, so if there are multiple servers, the objects must be added explicitly to each one.
> $dcs = (Get-ADDomainController -Filter *).Name
> foreach ($dc in $dcs) {
>     Add-DnsServerClientSubnet -Name "Prod" -IPv4Subnet "10.100.0.0/24" -ComputerName $dc
>     Add-DnsServerClientSubnet -Name "Dev" -IPv4Subnet "10.200.0.0/24" -ComputerName $dc
> }
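To confirm the result, and to illustrate the point about multiple ranges, something like the following could be used. The extra range 10.101.0.0/24 is a hypothetical example, not part of the scenario above.

```powershell
$dcs = (Get-ADDomainController -Filter *).Name

# List the client subnets now defined on each server.
foreach ($dc in $dcs) {
    Get-DnsServerClientSubnet -ComputerName $dc
}

# Hypothetical: append a second production range to the existing "Prod" object.
foreach ($dc in $dcs) {
    Set-DnsServerClientSubnet -Name "Prod" -Action ADD -IPv4Subnet "10.101.0.0/24" -ComputerName $dc
}
```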
- Create recursion scopes for both production and development to send traffic to Cloud DNS. Use the inbound forwarder IP addresses (provided by the inbound server policies) of each environment. Let’s assume those forwarder IPs are 172.20.0.2 (prod) and 172.30.0.2 (dev).
> Add-DnsServerRecursionScope -Name "CloudDNSProd" -Forwarder "172.20.0.2" -EnableRecursion $true
> Add-DnsServerRecursionScope -Name "CloudDNSDev" -Forwarder "172.30.0.2" -EnableRecursion $true
- Now comes a poorly documented aspect. The first query for a googleapis.com name populates the global cache of the Windows DNS server, so subsequent clients of a different type would receive the cached answer meant for the first one. To prevent this, shard or partition the DNS server’s global cache before implementing any query resolution policy for the recursion scopes. Partitioning is done by creating zone scopes, because the cache is treated as an additional zone.
Create zone scopes for the DNS server cache, which will serve as cache shards dedicated to clients meeting the query resolution policies that we will create later.
> Add-DnsServerZoneScope -ZoneName "..cache" -Name "CloudDNSProdCache"
> Add-DnsServerZoneScope -ZoneName "..cache" -Name "CloudDNSDevCache"
- Bind the cache zone scopes to query resolution policies for them to take effect. The policies use FQDN and client subnet criteria to map queries to the zone scopes. Like client subnets, query resolution policy objects do not replicate across DNS servers.
> $dcs = (Get-ADDomainController -Filter *).Name
> foreach ($dc in $dcs) {
>     Add-DnsServerQueryResolutionPolicy -Name "CloudDNSProdCache" -Fqdn "EQ,*.googleapis.com" -ClientSubnet "EQ,Prod" -Action ALLOW -ZoneScope "CloudDNSProdCache" -ZoneName "..cache" -ComputerName $dc
>     Add-DnsServerQueryResolutionPolicy -Name "CloudDNSDevCache" -Fqdn "EQ,*.googleapis.com" -ClientSubnet "EQ,Dev" -Action ALLOW -ZoneScope "CloudDNSDevCache" -ZoneName "..cache" -ComputerName $dc
> }
- Create query resolution policies with the same criteria to map to the recursion scopes created previously.
> $dcs = (Get-ADDomainController -Filter *).Name
> foreach ($dc in $dcs) {
>     Add-DnsServerQueryResolutionPolicy -Name "CloudDNSProd" -Fqdn "EQ,*.googleapis.com" -ClientSubnet "EQ,Prod" -Action ALLOW -ApplyOnRecursion -RecursionScope "CloudDNSProd" -ComputerName $dc
>     Add-DnsServerQueryResolutionPolicy -Name "CloudDNSDev" -Fqdn "EQ,*.googleapis.com" -ClientSubnet "EQ,Dev" -Action ALLOW -ApplyOnRecursion -RecursionScope "CloudDNSDev" -ComputerName $dc
> }
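The loops above create policies on every domain controller, and a policy presumably needs the scopes it references to exist on that same server. A quick per-server inventory, sketched below, can confirm that. The server name `dc02.example.com` in the last command is hypothetical.

```powershell
$dcs = (Get-ADDomainController -Filter *).Name
foreach ($dc in $dcs) {
    Write-Host "== $dc =="
    # Recursion scopes: expect the default scope plus CloudDNSProd and CloudDNSDev.
    Get-DnsServerRecursionScope -ComputerName $dc
    # Server-level policies (the ones created with -ApplyOnRecursion).
    Get-DnsServerQueryResolutionPolicy -ComputerName $dc
    # Zone-level policies bound to the cache zone.
    Get-DnsServerQueryResolutionPolicy -ZoneName "..cache" -ComputerName $dc
}

# If a scope turns out to be missing on one server, the earlier commands
# accept -ComputerName, for example (hypothetical server name):
Add-DnsServerRecursionScope -Name "CloudDNSProd" -Forwarder "172.20.0.2" `
    -EnableRecursion $true -ComputerName "dc02.example.com"
```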
- To troubleshoot DNS issues, consider taking these additional steps:
- Clear the DNS caches. This may temporarily affect DNS performance, but it can resolve some issues.
- Inspect the caches to identify any irregularities or errors.
> Clear-DnsServerCache
> Clear-DnsServerCache -CacheScope "CloudDNSProdCache"
> Clear-DnsServerCache -CacheScope "CloudDNSDevCache"
> Show-DnsServerCache
> Show-DnsServerCache -CacheScope "CloudDNSProdCache"
> Show-DnsServerCache -CacheScope "CloudDNSDevCache"
After applying these changes, different types of on-premises clients should now resolve *.googleapis.com to different IP addresses.
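One way to check this is to run the same query from a client in each subnet and compare the answers. The sketch below assumes the on-premises Windows DNS server is reachable at 10.0.0.10, a hypothetical address. The forwarder IPs are the ones assumed earlier.

```powershell
# Hypothetical address of the on-premises Windows DNS server.
$dnsServer = "10.0.0.10"

# Run on a client in the Prod, Dev, and regular subnets in turn.
Resolve-DnsName -Name "storage.googleapis.com" -Type A -Server $dnsServer -DnsOnly |
    Format-Table Name, Type, IPAddress, NameHost

# For comparison, query each Cloud DNS inbound forwarder directly,
# bypassing the Windows DNS policies.
foreach ($forwarder in "172.20.0.2", "172.30.0.2") {
    Resolve-DnsName -Name "storage.googleapis.com" -Type A -Server $forwarder -DnsOnly
}
```

Clients in the production and development subnets should see a CNAME to restricted.googleapis.com with addresses from 199.36.153.4/30. Regular clients should still receive public addresses.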
Final notes
- The PowerShell commands show how to configure redirection for *.googleapis.com. However, this is usually not the only domain required for GCP services to work from on-premises networks. Domains such as accounts.google.com or gcr.io are also typically needed. You would need to add corresponding Add-DnsServerQueryResolutionPolicy commands with those FQDNs. A list of possible domains can be found in the documentation.
- Although it is possible to redirect only some services, such as Google Cloud Storage or BigQuery, by specifying domains like storage.googleapis.com, using *.googleapis.com makes the DNS configuration simpler, and you can still control access to GCP services based on IAM roles (as you should).
- There is no GUI to configure DNS Policies, only PowerShell. However, if you already automate your IT processes with PowerShell, this approach can become an advantage, as it allows for easier integration and management of DNS settings.
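As an illustration of the first note, the sketch below adds cache and recursion policies for gcr.io, reusing the scopes and client subnets created earlier. The policy name suffix `Gcr` is hypothetical. The multi-value FQDN criterion is included on the assumption that a wildcard alone would not match the bare domain, and the Cloud DNS side is assumed to have a matching private zone.

```powershell
# Assumption: both the bare domain and its subdomains should be redirected.
$fqdn = "EQ,gcr.io,*.gcr.io"
$dcs = (Get-ADDomainController -Filter *).Name

foreach ($dc in $dcs) {
    foreach ($envName in "Prod", "Dev") {
        # Cache partition policy, bound to the existing cache zone scope.
        Add-DnsServerQueryResolutionPolicy -Name "CloudDNS${envName}GcrCache" -Fqdn $fqdn `
            -ClientSubnet "EQ,$envName" -Action ALLOW `
            -ZoneScope "CloudDNS${envName}Cache" -ZoneName "..cache" -ComputerName $dc

        # Recursion policy, bound to the existing recursion scope.
        Add-DnsServerQueryResolutionPolicy -Name "CloudDNS${envName}Gcr" -Fqdn $fqdn `
            -ClientSubnet "EQ,$envName" -Action ALLOW -ApplyOnRecursion `
            -RecursionScope "CloudDNS$envName" -ComputerName $dc
    }
}
```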