Troubleshooting Reachability with Network Intelligence Center Connectivity Test
Troubleshooting potential network issues is challenging, and every minute counts. The network troubleshooting process can be distilled into 2 core challenges:
1. Most of the time goes into identifying the root cause of the issue.
A solution that points out where the issue is can greatly reduce the MTTR. A source-to-destination path may traverse a variety of network elements such as routing tables, firewalls, and NAT gateways, so a tool that shows where along the path the problem lies is valuable when troubleshooting networking issues.
2. Network changes sometimes introduce new issues. Changing configurations can have unanticipated effects, so it’s always prudent to verify intent after deploying new changes.
A solution that ensures existing network connectivity is not broken by newly committed changes is useful in this situation. Such a tool gives more confidence to the team committing a network change, and assures service owners that their existing services were not affected by it.
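As a minimal sketch of the "verify intent after a change" idea, the snippet below compares two snapshots of overall test results taken before and after a change. The test names and snapshots are hypothetical; only the result strings REACHABLE and UNREACHABLE are meant to mirror what the tool reports, and how the snapshots are collected is left out on purpose.

```python
from typing import Dict, List


def find_regressions(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """Return names of tests that were REACHABLE before a change but are not afterwards."""
    regressions = []
    for name, old_result in before.items():
        # A test missing from the "after" snapshot is treated as a regression too.
        new_result = after.get(name, "MISSING")
        if old_result == "REACHABLE" and new_result != "REACHABLE":
            regressions.append(name)
    return sorted(regressions)


# Hypothetical snapshots: test name -> overall result.
before_change = {
    "web-to-db": "REACHABLE",
    "app-to-onprem": "REACHABLE",
    "vm-to-internet": "UNREACHABLE",
}
after_change = {
    "web-to-db": "REACHABLE",
    "app-to-onprem": "UNREACHABLE",
    "vm-to-internet": "UNREACHABLE",
}

print(find_regressions(before_change, after_change))
```

Tests that were already failing before the change are deliberately ignored, so the list only contains connectivity that the change itself may have broken.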
When it comes to troubleshooting network connectivity in GCP, ‘Connectivity Tests’ in Network Intelligence Center is the most comprehensive solution for the needs of a networking engineer. Connectivity Tests is a diagnostics tool that lets you check connectivity between network endpoints. It analyzes your configuration and, in some cases, performs live data plane analysis between the endpoints. An endpoint is a source or destination of network traffic, such as a VM, Google Kubernetes Engine (GKE) cluster, load balancer forwarding rule, or an IP address on the internet.
The purpose of this blog is to highlight 5 scenarios where a connectivity test can speed up troubleshooting that would otherwise take much more time and effort. Please note that this is not an exhaustive list of scenarios where connectivity tests can be used; it covers some common scenarios that most GCP customers encounter in their daily tasks.
Use Case 1: Source and destination as VMs in different GCP projects
TASK
The networking design team is designing a HUB to SPOKE setup with the ‘common-services’ project as the HUB and multiple workload projects as SPOKEs. During the design phase they agreed on the following connectivity:
During the implementation phase, they need to ensure that the traffic path from a VM in a service project to a VM in the common services project matches the agreed design.
SOLUTION
The connectivity test under ‘Network Intelligence Center’ helps to check connectivity between 2 VMs in the same project or in different projects.
DETAILS
Let’s take the example shown in Image 1 above, where the source and destination VMs are in different projects and the VPCs are connected via options like VPC peering and VPN.
To visualize the traffic path from source to destination, the user created and ran a connectivity test under ‘Network Intelligence Center’ with the following configuration:
Protocol = ICMP
Source = VM in project ‘common services’
Destination = VM in project ‘service project’
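For readers who prefer the command line, the same configuration could be expressed roughly as below. This is only a sketch: it assembles, but does not run, a command for the gcloud `network-management connectivity-tests create` group, and the test name, project IDs, zone and VM names are all hypothetical. Check the flags against your installed gcloud version before relying on them.

```python
import shlex
from typing import List


def build_vm_to_vm_test(name: str, source_instance: str,
                        destination_instance: str,
                        protocol: str = "ICMP") -> List[str]:
    """Build (but do not run) a gcloud command that creates a VM-to-VM test."""
    return [
        "gcloud", "network-management", "connectivity-tests", "create", name,
        f"--source-instance={source_instance}",
        f"--destination-instance={destination_instance}",
        f"--protocol={protocol}",
    ]


# Hypothetical project IDs, zone and VM names, for illustration only.
command = build_vm_to_vm_test(
    "hub-to-spoke-icmp",
    "projects/common-services/zones/us-central1-a/instances/hub-vm",
    "projects/service-project/zones/us-central1-a/instances/spoke-vm",
)

# Print the command so it can be reviewed before anyone runs it.
print(shlex.join(command))
```

Whichever way the test is created, running it produces the result view discussed next.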
This gave the following results:
Looking at the results above, a user can make out the following:
- Since the path is GREEN end to end, the destination is reachable from the source
- A user can see the various hops the packet has gone through
- Within a hop, a user can expand and see details. For example, if a packet passed through the VPC firewall, a user can see which exact rule the packet was inspected against.
- The user can make an informed decision about whether the traffic is taking the path dictated by their design. If not, the results of the connectivity test help to find where the deviation is.
An additional advantage of the connectivity test is that it helps with troubleshooting when connectivity is broken. If this connectivity is not established, the administrator may need to debug various areas, such as:
- Look into source side firewall rules
- Look into routing table of source VPC
- Look into VPN config
- Ensure that Host project VPC has required route
- Look into firewall of destination project
The results of the connectivity test help the network engineer find where the issue is.
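The "find where the issue is" step can also be done programmatically. The sketch below assumes a trace has been exported as JSON with a `steps` list whose entries carry `state`, `description` and an optional `causesDrop` flag, which is how the Network Management API describes a trace; treat those field names as an assumption to verify. The sample trace is hand-written for illustration and is not real tool output.

```python
from typing import Any, Dict, Optional


def first_blocking_step(trace: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the first step flagged as causing a drop, or None if there is none."""
    for step in trace.get("steps", []):
        if step.get("causesDrop", False):
            return step
    return None


# Hand-written sample trace (hypothetical), not real tool output.
sample_trace = {
    "steps": [
        {"state": "START_FROM_INSTANCE", "description": "packet leaves source VM"},
        {"state": "APPLY_EGRESS_FIREWALL_RULE", "description": "egress allowed"},
        {"state": "APPLY_ROUTE", "description": "route matched"},
        {"state": "APPLY_INGRESS_FIREWALL_RULE",
         "description": "ingress denied", "causesDrop": True},
        {"state": "DROP", "description": "packet dropped"},
    ]
}

blocking = first_blocking_step(sample_trace)
if blocking is not None:
    print(f"Blocked at {blocking['state']}: {blocking['description']}")
else:
    print("No blocking step found in this trace")
```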
Use Case 2: Source VM in GCP and destination on-premises
TASK
The cloud team in an organization is troubleshooting connectivity between GCP and on-premises workloads, and also between GCP and workloads in another cloud reachable via VPN. They are not sure whether the issues are on the GCP side or on the remote end (on-premises or the other cloud provider). They want a way to confirm that the GCP side of the configuration is correct before reaching out to the on-premises networking team or the other cloud provider’s connectivity team.
SOLUTION
The Connectivity Tests tool under Network Intelligence Center helps to check connectivity between Google Cloud and on-premises networks in both directions. Under the category of ‘hybrid connectivity’, it supports Cloud VPN and Cloud Interconnect, which is useful in this particular case.
DETAILS
In the example taken here, the customer has a VM in GCP that is attempting to connect to a web server hosted on-premises. The connectivity solution used between GCP and on-premises in this example is VPN.
The major touchpoints to check when VPN connectivity between on-premises and GCP is broken include:
- Tunnel configuration
- BGP route advertisement, ensuring that the required remote routes have made it into the routing table
- Firewall configuration
The results shown by connectivity tests make it clear which path the traffic from GCP to the on-premises web server takes and, if the traffic is dropped, where it is dropped.
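A test like this one could also be described as a request body for the Network Management API. The sketch below only builds and prints that body; the field names (`source.instance`, `destination.ipAddress`, `destination.port`, `protocol`) follow the API's ConnectivityTest resource as I understand it, and the VM path, on-premises IP address and port 80 are hypothetical values chosen for illustration.

```python
import json
from typing import Any, Dict


def build_hybrid_test_body(source_instance: str, onprem_ip: str,
                           port: int) -> Dict[str, Any]:
    """Describe a TCP test from a GCP VM to an on-premises IP address."""
    return {
        "source": {"instance": source_instance},
        "destination": {"ipAddress": onprem_ip, "port": port},
        "protocol": "TCP",
    }


# Hypothetical VM path and on-premises web server address.
body = build_hybrid_test_body(
    "projects/my-project/zones/us-central1-a/instances/app-vm",
    "10.200.0.10",
    80,
)

print(json.dumps(body, indent=2))
```

Keeping the body as plain data makes it easy to review with the on-premises team before anything is submitted.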
Use Case 3: Source VM in GCP and destination as internet endpoint
TASK
The networking and security team of a company wants to control who is allowed to talk to the Internet. After the deployment phase, they found that some GCP servers were not able to reach the Internet, and they want to see where the connectivity is broken.
SOLUTION
Connectivity Tests under GCP Network Intelligence Center lists “VM instances to and from the internet” as a supported configuration. The customer team can configure a connectivity test to troubleshoot Internet-bound traffic.
DETAILS
Connectivity from a GCP instance to an Internet location can be dropped for several reasons, including routing issues, NAT issues, and firewall issues.
For the purpose of demonstration, let’s take a couple of scenarios:
- Ping from a VM with public IP address in GCP to destination 8.8.8.8
- Ping from a VM with private IP address in GCP to destination 1.1.1.1
In the first test, the connectivity test reported the result as DROPPED. The reason shown in the test result revealed that the VPC routing table didn’t have a route towards the default internet gateway, which caused the connectivity failure.
Please note: Although 8.8.8.8 is a publicly available IP address, when an attempt is made to reach this destination from within GCP, the traffic is not sent to the Internet; it remains on the Google backbone. From a connectivity standpoint, however, an Internet-facing route in the routing table is still needed.
In the second scenario, the user checked whether a ping to 1.1.1.1 from a source VM would succeed, having first ensured that the route pointing to the Internet is present in the VPC routing table.
Let’s see the results of the Connectivity Test now:
In this example, the result is a DROP because, although the Internet-facing route existed in the VPC routing table, the Cloud NAT configuration required to translate the private VM address to a public IP address was not present, so the packet could not be routed to the Internet.
This is a clear indication that the networking team needs a proper NAT configuration to enable this connectivity.
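The two outcomes above suggest a simple lookup from drop cause to next action. The sketch below is illustrative only: the cause strings (`NO_ROUTE`, `NO_EXTERNAL_ADDRESS`, `FIREWALL_RULE`) are assumed to match the drop causes reported by the Network Management API and should be checked against real test output, and the hint texts are my own wording rather than anything the tool prints.

```python
from typing import Dict

# Assumed drop-cause strings mapped to hand-written hints (both are illustrative).
TRIAGE_HINTS: Dict[str, str] = {
    "NO_ROUTE": "Check that the VPC routing table has a route to the default internet gateway.",
    "NO_EXTERNAL_ADDRESS": "The VM has only a private IP address; check the Cloud NAT configuration.",
    "FIREWALL_RULE": "Review the firewall rule shown in the blocking step.",
}


def triage_hint(cause: str) -> str:
    """Return a first troubleshooting hint for a drop cause, with a safe fallback."""
    return TRIAGE_HINTS.get(
        cause, "No hint recorded for this cause; inspect the trace in the console."
    )


for cause in ("NO_ROUTE", "NO_EXTERNAL_ADDRESS", "SOMETHING_ELSE"):
    print(f"{cause}: {triage_hint(cause)}")
```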
Use Case 4: Source as VM on GCP and destination as Cloud SQL
TASK
The application team needs connectivity from VMs in a GCP VPC to Cloud SQL. They read the GCP documentation and attempted to set up the connectivity, but they aren’t sure they have done it the right way and need a visualization of the source-to-destination traffic path.
SOLUTION
For Google-managed services, the Connectivity Tests configuration analysis can still run a test and provide an overall reachability result.
DETAILS
The steps from source VM to Cloud SQL include checking the relevant egress firewall rules and matching the route. When a customer runs this test, Connectivity Tests configuration analysis provides details about these steps.
However, for the final logical step of analyzing the configuration in the Google-owned VPC network, the analysis provides only an overall reachability result. Connectivity Tests does not provide details for the resources in the Google-owned project because you do not have permission to view them. The following diagram explains what is analyzed in this situation
The following example shows the results of a connectivity test created by the customer from the source VM to Cloud SQL.
Looking at the results, the customer team can understand that when a user sets up a ‘private services connection’, a VPC peering is actually created behind the scenes, and this is shown clearly in the results of the connectivity test. In this example, the connectivity test therefore proved to be a great tool for the customer to validate their understanding of the network traffic flow.
Use Case 5: Cloud Run to VM in GCP
TASK
A GCP customer’s application team has deployed an application in Cloud Run. To self-triage network issues between Cloud Run and VMs hosted in a GCP VPC, the application team members are looking for a native GCP tool.
SOLUTION
Connectivity Tests can help verify connectivity from Cloud Run to a VM instance or to an IP address.
DETAILS
Connectivity Tests offers the option to select a serverless service as the source of the packet, as follows:
In the example discussed here, a user had a Cloud Run deployment, and a VPC connector was configured to enable communication from Cloud Run to the VPC. A connectivity test from ‘CLOUD RUN’ to a VM in a VPC on TCP port 80 gives the following results:
The customer application team can use connectivity tests to troubleshoot reachability from serverless services like Cloud Run to GCP-hosted VMs, or from Cloud Run to IP addresses reachable on the Internet.
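The Cloud Run case could be scripted in a similar way. The sketch below assembles, without running it, a gcloud command that uses a Cloud Run revision as the source; the `--source-cloud-run-revision` flag and the revision URI format are assumptions to confirm in the gcloud reference, and the project, region, revision and VM names are hypothetical.

```python
import shlex
from typing import List


def build_cloud_run_test(name: str, revision_uri: str,
                         destination_instance: str, port: int) -> List[str]:
    """Build (but do not run) a gcloud command for a Cloud Run to VM test over TCP."""
    return [
        "gcloud", "network-management", "connectivity-tests", "create", name,
        f"--source-cloud-run-revision={revision_uri}",
        f"--destination-instance={destination_instance}",
        f"--destination-port={port}",
        "--protocol=TCP",
    ]


# Hypothetical revision URI and VM path, for illustration only.
cloud_run_command = build_cloud_run_test(
    "cloud-run-to-vm-80",
    "projects/my-project/locations/us-central1/revisions/my-service-00001-abc",
    "projects/my-project/zones/us-central1-a/instances/backend-vm",
    80,
)

print(shlex.join(cloud_run_command))
```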
Closing Notes:
Network troubleshooting is a skill that is complemented by useful tools at your fingertips. When it comes to network troubleshooting on GCP, the native ‘Connectivity Tests’ service proves to be a useful tool that presents connectivity in a more visual way and helps quickly find where the connectivity is broken. It has broad coverage, helping to troubleshoot connectivity issues related to internal VMs, the Internet, on-premises private connectivity, connectivity to GCP services, and serverless services like Cloud Run and App Engine.
Connectivity tests also come in handy at each stage of your cloud journey. For example, during the migration phase, they may help you make an informed decision about whether a network change has affected existing working connectivity. In steady-state operations, they may help network operations engineers find what is broken and where, which helps keep the MTTR in check.
Disclaimer: This is to inform readers that the views, thoughts, and opinions expressed in the text belong solely to the author, and not necessarily to the author’s employer, organization, committee or other group or individual.