Proactive Network Monitoring with GCP Network Analyzer

Published in Google Cloud - Community · 6 min read · Dec 17, 2022

A proactive network health management program is essential to all cloud deployments and business processes. Cloud computing is powerful and dynamic, but it can feel complex when customers unintentionally deploy suboptimal or error-prone network configurations. For instance, organizations may make changes that unknowingly introduce misconfigurations, violate best practices, exceed IP address utilization quotas, or allocate IP addresses inefficiently. In some cases, such changes may even result in a service outage. A service that is unavailable to the organization’s internal or external users could have a devastating impact on its business and reputation. Teams often resort to reactive workflows after a service interruption, manually running time-consuming diagnostics to troubleshoot and resolve such issues.

One way to implement proactive network health management is through the use of network monitoring tools. These tools can continuously monitor the network for issues such as high network usage, low bandwidth, and latency problems. They can also alert IT staff to potential problems so that they can be addressed before they cause disruptions.

Another important aspect of proactive network health management is the implementation of best practices for network configuration and management. This can include following guidelines for IP address allocation, ensuring that security protocols are in place and up-to-date, and regularly testing and updating network configurations to ensure optimal performance.

Proactive network health management can also involve implementing processes for ongoing network maintenance and optimization. This can include regular system updates, performance tuning, and the use of analytics tools to identify and address bottlenecks in the network.

Overall, proactive network health management is crucial for ensuring the stability and reliability of a cloud deployment and preventing costly service interruptions. By continuously monitoring and optimizing the network, organizations can minimize the risk of disruptions and maintain a high level of performance for their users.

The Network Analyzer module of the Network Intelligence Center on Google Cloud Platform (GCP) helps you proactively identify and fix network issues before they cause service disruptions. This feature, which became generally available at Google Cloud Next’22, includes a range of analyzers that can detect and predict potential network problems and present them as insights. With Network Analyzer, you can set up alerts to receive instant notifications about any misconfigurations in your network.

In this article, we will explore the different insights provided by Network Analyzer and how to configure alerts for them.

What kind of Insights can I get from Network Analyzer?

Network Analyzer insights are grouped into five categories:

1. VPC Network Insights: This covers common issues related to your VPC network, such as:

Unused external IP addresses: You are charged for an external IP address that is reserved but not attached to any resource in your environment. When a reserved external IP address has been unassigned for more than 24 hours, an alert is shown in Network Analyzer.
IP address utilization: If IP address utilization for any subnet in the VPC goes over 75%, it is flagged.
Invalid VPC routes: If the next hop of a VPC route is invalid or not configured properly, the route is flagged in Network Analyzer. Several scenarios fall under this insight, for example when the next hop is a GCE instance that is stopped, deleted, or does not have the canIpForward property set to true, or when the next hop is a VPN tunnel that is down.
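To make the 75% utilization threshold concrete, here is a minimal sketch of the arithmetic for a single subnet. It is an illustration, not Network Analyzer's actual implementation: the function names are hypothetical, and how the analyzer counts allocated addresses is not described in this article. The sketch assumes an IPv4 primary range, where Google Cloud reserves four addresses per subnet.

```python
import ipaddress

# Assumption: four addresses are reserved in each primary IPv4 subnet range
# (network, default gateway, second-to-last, and broadcast).
RESERVED_ADDRESSES = 4
UTILIZATION_THRESHOLD = 0.75  # the 75% mark mentioned above


def subnet_utilization(cidr: str, allocated: int) -> float:
    """Return the fraction of usable addresses in `cidr` that are allocated."""
    network = ipaddress.ip_network(cidr)
    usable = network.num_addresses - RESERVED_ADDRESSES
    if usable <= 0:
        raise ValueError(f"{cidr} has no usable addresses")
    return allocated / usable


def is_flagged(cidr: str, allocated: int) -> bool:
    """Hypothetical check: True when utilization exceeds the threshold."""
    return subnet_utilization(cidr, allocated) > UTILIZATION_THRESHOLD


if __name__ == "__main__":
    # A /24 has 256 addresses, 252 of them usable under the assumption above.
    print(is_flagged("10.10.0.0/24", 200))  # 200 / 252 is above 75%
    print(is_flagged("10.10.0.0/24", 100))  # 100 / 252 is below 75%
```

A check like this can be handy when planning subnet sizes, before any insight is ever raised.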

2. Network Services Insights: This covers issues with misconfigured load balancers, such as:

Load balancer health checks are not allowed or only partially allowed through firewall rules.
Load balancer backend service uses different ports for health check and traffic.

3. Kubernetes Engine Insights: Insights related to GKE are grouped in this category.

GKE node to control plane connectivity is misconfigured (VPC routes) or blocked (firewall rules).
GKE control plane to node connectivity is misconfigured (VPC routes) or blocked (firewall rules).
High IP utilization of the allocated Pod/Service subnet range.
Access to Google APIs from a private GKE cluster.

4. Hybrid Connectivity Insights: Detects dynamic routes that are fully or partially shadowed by subnet or static routes. The shadowed dynamic route could be one learned by a Cloud Router on the VPC network or one imported through VPC Peering.
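The idea of a shadowed route can be sketched with plain CIDR arithmetic. The function below is a hypothetical illustration and not the analyzer's real logic. It assumes one reading of the terms: a dynamic route is "fully shadowed" when its destination range sits entirely inside a subnet range, and "partially shadowed" when the subnet range covers only part of the dynamic route's range.

```python
import ipaddress


def classify_shadowing(dynamic_route: str, subnet_range: str) -> str:
    """Classify how a subnet range shadows a dynamic route (assumed semantics)."""
    route = ipaddress.ip_network(dynamic_route)
    subnet = ipaddress.ip_network(subnet_range)
    if route.subnet_of(subnet):
        # Every destination in the dynamic route is already covered by the subnet.
        return "fully shadowed"
    if subnet.subnet_of(route):
        # Only the slice of the dynamic route that overlaps the subnet is covered.
        return "partially shadowed"
    return "not shadowed"


if __name__ == "__main__":
    print(classify_shadowing("10.0.1.0/24", "10.0.0.0/16"))
    print(classify_shadowing("10.0.0.0/8", "10.0.0.0/16"))
    print(classify_shadowing("192.168.0.0/24", "10.0.0.0/16"))
```

The same comparison could be applied to static routes, although route priority would then also matter and is not modeled here.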

5. Managed Services Insights: This insight group covers connectivity issues with Google-managed services.

Connectivity to a Cloud SQL instance is misconfigured (VPC routes) or blocked (firewall rules).

GCP keeps adding analyzers to this module, so refer to the official documentation for the current list.

How often are network insights generated?

Network Analyzer generates insights whenever relevant configuration changes are made, as well as periodically. Analyses are triggered approximately ten minutes after a related configuration change is made. Periodic analyses are performed at least once daily.

How to get real-time notifications?

You can view the insights in the Cloud Console on the Network Analyzer page. Network Analyzer is also exposed as an API, so you can consume the insights programmatically and include them in your existing workflows or build new workflows on top of them.

The insights from Network Analyzer are also logged in Cloud Logging as platform logs. You can configure custom logs-based metrics on these logs and create alerts using Cloud Monitoring.

Let’s look at how to configure real-time notifications. First, let’s create a logs-based metric for Network Analyzer insights that are prioritized as Critical or High and are of type Error.

gcloud logging metrics create criticalHighNetworkIssue \
      --description="Critical or High Impact insight from Network Analyzer" \
      --log-filter='LOG_ID("networkanalyzer.googleapis.com%2Fanalyzer_reports") AND
(jsonPayload.priority="CRITICAL" OR jsonPayload.priority="HIGH") AND
jsonPayload.type="ERROR"'

Network Analyzer insights are prioritized as Critical, High, Medium, or Low, depending on the severity of the issue. More details here.
The insight type can be info, warning, or error. More details here.
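To see which log entries the priority and type conditions of that filter would select, the same logic can be replayed locally. This is a minimal sketch: the sample entries are invented for illustration, and only the jsonPayload.priority and jsonPayload.type fields named in the filter are assumed to exist.

```python
ALERT_PRIORITIES = {"CRITICAL", "HIGH"}


def matches_alert_filter(entry: dict) -> bool:
    """Mirror the priority and type conditions of the logs-based metric filter."""
    payload = entry.get("jsonPayload", {})
    return (
        payload.get("priority") in ALERT_PRIORITIES
        and payload.get("type") == "ERROR"
    )


if __name__ == "__main__":
    # Hypothetical entries, reduced to the two fields the filter inspects.
    samples = [
        {"jsonPayload": {"priority": "CRITICAL", "type": "ERROR"}},
        {"jsonPayload": {"priority": "HIGH", "type": "WARNING"}},
        {"jsonPayload": {"priority": "LOW", "type": "ERROR"}},
    ]
    for entry in samples:
        print(entry["jsonPayload"], "->", matches_alert_filter(entry))
```

Only the first sample would count toward the metric, because the other two fail either the type or the priority condition.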

Once the metric is created, you can configure an alert policy to get notifications:

Creating Alerts from custom Network Analyzer logs-based metric

Go to Logs-based Metrics under Logging.
From the three-dot menu for the custom metric, choose the “Create alert from metric” option. This opens the Create alerting policy window.
Set Time series aggregation to none and click NEXT.
In the Configure alert trigger window, set Condition type to Threshold, Alert trigger to Any time series violates, Threshold position to Above threshold, and Threshold value to 0.
Give the condition a name and click NEXT.
In the Configure notifications and finalize alert window, turn on the Use notification channel toggle and select your configured notification channel from the drop-down.
Give the alert policy a name and click NEXT.
Review and click Create Policy.

You can create a notification channel in advance for the tool of your choice (email, Slack, PagerDuty, etc.) using this guide. I have used an email notification channel.

If you want to send notifications to an external ticketing or notification system, you can do that by adding Cloud Pub/Sub and a Cloud Function to the Logging setup. The figure below shows a sample reference architecture:

Sending notifications to external system
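The Cloud Function part of such a pipeline might look roughly like the sketch below. It assumes a log sink that routes Network Analyzer entries to a Pub/Sub topic and a Pub/Sub-triggered background function, which receives the log entry as base64-encoded JSON in event["data"]. The function names and the ticket fields are hypothetical, and the call to the external system is left as a placeholder.

```python
import base64
import json


def build_ticket(event: dict) -> dict:
    """Decode a Pub/Sub-delivered log entry into a hypothetical ticket payload."""
    log_entry = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    payload = log_entry.get("jsonPayload", {})
    priority = payload.get("priority", "UNKNOWN")
    return {
        "title": f"[{priority}] Network Analyzer insight",
        "severity": priority,
        "insight_type": payload.get("type", "UNKNOWN"),
        "log_name": log_entry.get("logName", ""),
    }


def handle_insight(event: dict, context=None) -> None:
    """Entry point for a Pub/Sub-triggered function (assumed deployment)."""
    ticket = build_ticket(event)
    # Placeholder: replace with a call to your ticketing or notification system.
    print(json.dumps(ticket))


if __name__ == "__main__":
    # Simulate the message a log sink would publish, using an invented entry.
    sample_entry = {"jsonPayload": {"priority": "HIGH", "type": "ERROR"}}
    sample_event = {"data": base64.b64encode(json.dumps(sample_entry).encode("utf-8"))}
    handle_insight(sample_event)
```

Keeping the decoding in build_ticket separate from the entry point makes the payload mapping easy to test without deploying anything.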

How to monitor multiple projects together?

Organisations often have complex network architectures with Shared VPC, VPC peering, hybrid connectivity over Interconnect and VPN, Private Service Connect, and so on. In such environments, monitoring the network as a whole is more important than monitoring individual projects.

As with Cloud Monitoring, you can view Network Analyzer insights for multiple projects by changing the metrics scope. Create a single scoping project and add the other projects to it as monitored projects. This gives you central visibility of Network Analyzer insights across all your projects. More details here.

Conclusion

Network Analyzer is a great tool for detecting network misconfigurations and problems. By combining its insights with Logging and Monitoring, as described in this blog, we can get real-time alerts and take preventive steps before issues impact business operations.