gCloud Genie Aces Network Observability

Tanmay Ravindra Joshi
Google Cloud - Community
7 min read · Dec 16, 2022

Recommended: Please read the Introduction here: Tech blogs with a twist!

“Good Morning Zach!” The loud, booming voice of Atkin Bainbridge, the head of Cloud Operations and DevSecOps, reverberated through Zach Kennedy’s corner office one morning, at Cygnos Imperial Bank. Zach looked up, as the tall, bestubbled man strode in, with his head of Cloud Networking, Dean Powell, in tow.

“Atkin! Hi! What brings you to these ‘ere parts, first thing in the morning!” Zach responded affably, a twinkle in his eye and a smile on his face, as the two shook hands warmly. “All well?”

They were two sides of the same coin: Atkin, a champion in the Infrastructure and Networking space, and Zach, acing the application layers above. Both were indispensable to Cygnos Imperial Bank in its ongoing cloud transformation journey. They had a great working equation and complemented each other effectively, ensuring the smooth functioning of the enterprise.

“Yeah, yeah, all’s well, don’t worry”, Atkin responded.

“But there are a few nagging issues that have been plaguing me for some time, and I thought I’d discuss them with you”.

“Sure. Shoot!” Zach responded.

Unknown to Atkin, in a tiny minimized window on Zach’s Laptop, the gCloud Genie also perked up!

“Sometimes I get the feeling that I’m flying blind! I just don’t feel that my network is fully visible to me! I find myself shocked looking at the egress bills! There are times when application owners have barged into my cabin complaining about latency! We’ve sometimes struggled to establish connectivity between two services. We’ve fought with ping drops. We’ve faced sub-optimal firewall configurations. Now, Dean here is a master at what he does, and most of the time we’ve been able to troubleshoot and identify issues the old-fashioned way. But I go back to the promise of the cloud: ease of use! Not logging into consoles and manually finding issues hop by hop!”

“Don’t worry brother, I got your back! I have just the tool for you! Let me just hook up the monitor and show it to you right now! You both are so gonna love this, I guarantee you, by the time I’m done, you’re gonna offer to take me out for beers tonight!”

“A’right, I’m game! Impress me!” Atkin enthused.

“Okay, here goes, the tool is called ‘Network Intelligence Center’. Folks at GCP created it specifically to make the lives of millions like Dean easy! It’s a single console for Google Cloud network observability, monitoring, and troubleshooting. It helps reduce the risk of outages and supports security and compliance.

Let me just fire up the first module ‘Network Topology’.”

That was a cue for gCloud Genie to fire up the requisite window.

“See this? Cool, ain’t it! An entire layout of your network, the LBs, the Internet Regions where the traffic goes out to, the On Premises Connectivity, all of it in one single map. And what’s more, it supports drill-down. So you can click on a particular region, and drill down from Region → Zone → Instance Groups → Individual Instances!

There! You wanna know the incoming and outgoing traffic bandwidth? Here you are! This shows it at a regional level, but you can drill down to a per-VM level to see the ingress and egress KB/s! Nifty, ain’t it!

I just switched the view to RTT, and I see latencies. I’ve also zoomed in to a VM level, to show you VM level metrics! It also fires up a cloud monitoring pane in-context! You can clearly see your specific VM talking to three regions over the Internet! How’s that!

Similarly, you can browse for packet loss just by selecting the radio button on the left!”

Atkin and Dean were speechless!

“Damn! That’s amazing!” Dean finally managed to find words! “Show me more!”

“Here you go! Now Atkin, I heard you complaining about egress bills, didn’t I? There’s a specific view that shows the top egressing instances, which is tailor-made for your problem.

Bingo! Top Egress Instances, all conveniently sorted for you!

What’s more, you can pick the type of egress from the dropdown: overall, cross-zonal, to the Internet, or hybrid (to on-prem).

Besides the obvious billing considerations, this is also a handy way to spot any unexpected egress, which could be data exfiltration resulting from an attack!

You can also scope Network Intelligence Center either to an individual project or to a ‘scoping project’, through which you can monitor the entire enterprise from a single pane.”
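For readers who like to see an idea in code, here is a minimal Python sketch of what a "top egress" view does conceptually: filter by egress type, sort by rate, and flag anything far above its usual level. It does not call any Google Cloud API. The `EgressSample` type, the instance names, the rates, and the 5x `factor` threshold are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class EgressSample:
    instance: str
    egress_type: str          # "cross-zonal", "internet", or "hybrid"
    kb_per_s: float           # observed egress rate (made-up numbers)
    baseline_kb_per_s: float  # hypothetical "normal" rate for this instance


def top_egress(samples, egress_type=None, limit=3):
    """Return the highest-egress samples, optionally filtered by egress type."""
    chosen = [s for s in samples if egress_type in (None, s.egress_type)]
    return sorted(chosen, key=lambda s: s.kb_per_s, reverse=True)[:limit]


def unexpected_egress(samples, factor=5.0):
    """Flag samples whose rate exceeds `factor` times their baseline."""
    return [s for s in samples if s.kb_per_s > factor * s.baseline_kb_per_s]


samples = [
    EgressSample("web-1", "internet", 900.0, 800.0),
    EgressSample("db-1", "internet", 4000.0, 50.0),
    EgressSample("batch-1", "cross-zonal", 2500.0, 2000.0),
    EgressSample("vpn-gw", "hybrid", 300.0, 280.0),
]

for s in top_egress(samples, egress_type="internet"):
    print(s.instance, s.kb_per_s)

for s in unexpected_egress(samples):
    print("investigate:", s.instance)
```

In this made-up data set, a database VM pushing far more traffic to the Internet than usual is exactly the kind of row that deserves a second look, both for the bill and for possible exfiltration.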

“Wow!” Atkin was impressed! “These blokes at GCP come up with some pretty handy stuff, man!”

“Of course! There’s a reason why we’re betting on GCP to go big on our cloud journey!

But wait, let me show you some more cool stuff!

You’re aware that we’re launching overseas ops in India next month, starting with Mumbai, right? How would you like it if you could predict the latency of an end user arriving over the Internet from Mumbai to our home regions in the US and Europe?”

“Huh?!” was all Atkin could manage to utter!

“Voila! There’s a module called ‘Performance Dashboard’ which shows you the average RTT from Internet endpoints located at various points around the world (see the number of dots there on the map) to a GCP region. For instance, if I click on Mumbai…

I can see the latencies from Mumbai’s Internet endpoints to our home regions! This helps with the placement of workloads for latency-sensitive applications. Also, the cool bit is that you can look back up to 6 weeks with this graph!”
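To make the workload-placement point concrete, here is a small Python sketch that ranks candidate regions by median RTT. The region names are real Google Cloud regions, but the RTT values are invented for illustration and are not measurements from Performance Dashboard. The `rank_regions` helper is likewise hypothetical.

```python
from statistics import median

# Hypothetical RTT samples (milliseconds) from Mumbai Internet endpoints
# to three candidate regions. The numbers are made up.
rtt_ms = {
    "us-central1": [245.0, 251.0, 248.0, 260.0],
    "us-east4": [210.0, 215.0, 209.0, 230.0],
    "europe-west1": [125.0, 131.0, 128.0, 140.0],
}


def rank_regions(samples_by_region):
    """Return (region, median RTT) pairs, lowest latency first."""
    ranked = [(region, median(values)) for region, values in samples_by_region.items()]
    return sorted(ranked, key=lambda pair: pair[1])


for region, rtt in rank_regions(rtt_ms):
    print(f"{region}: {rtt} ms")
```

The median is used here instead of the mean so that a single slow sample does not skew the ranking. That is a design choice of this sketch, not something the dashboard prescribes.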

“Here, let me show you the ‘Connectivity Tests’ module.

The source endpoint for a connectivity test can be a VM instance, an IP address, a 1st gen Cloud Function, a Cloud SQL instance, or a GKE cluster control plane (master). The destination endpoint can be a VM instance, an IP address, a Cloud SQL instance, or a GKE cluster control plane (master).

Connectivity Tests now also includes a feature that verifies connectivity from a VM or an IP address to a Private Service Connect endpoint.”

“Whoa, Whoa, hang on! Are you telling me that GCP logs into a VM and initiates traffic to the target VM to test connectivity, because I don’t want them doing that!” Atkin interjected.

“That’s a good question, and no, they don’t. As a CSP, they restrict themselves to the Cloud Data Plane only. They do not access any customer workload. There are two kinds of analysis. First, Connectivity Tests performs a reachability analysis that evaluates the Google Cloud resources in your testing path against an ideal configuration model. Second, this is augmented by the live data plane analysis feature, which sends packets to verify the state of the data plane and provide baseline information for supported configurations. The probing mechanism for live data plane analysis does not involve the guest OS and is fully transparent to the user. Probes are injected into the network on behalf of the source endpoint and are dropped just before being delivered to the destination endpoint. Probes are excluded from regular network billing, telemetry metrics, and flow logs.”
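The idea of deciding reachability from configuration alone, without sending a single packet, can be illustrated with a toy Python model. This is a sketch under stated assumptions, not how Connectivity Tests is implemented: it only looks at ingress firewall rules and ignores routes, load balancers, and everything else in a real path. It does follow the documented VPC firewall ordering, where a lower priority number wins, a deny beats an allow at the same priority, and ingress is denied when nothing matches. The `Rule` type, the rule names, and the IP ranges are hypothetical.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int      # lower number = higher priority
    action: str        # "allow" or "deny"
    source_range: str  # CIDR block the rule applies to
    protocol: str
    ports: range       # destination ports covered by the rule


def evaluate_ingress(rules, src_ip, protocol, port):
    """Return (verdict, rule name) for a packet, using configuration only."""
    src = ip_address(src_ip)
    # Sort by priority; at equal priority, deny rules are checked first.
    ordered = sorted(rules, key=lambda r: (r.priority, r.action != "deny"))
    for rule in ordered:
        if (
            src in ip_network(rule.source_range)
            and rule.protocol == protocol
            and port in rule.ports
        ):
            return rule.action, rule.name
    # No rule matched: ingress traffic is denied by default.
    return "deny", "implied-deny-ingress"


rules = [
    Rule("allow-web", 1000, "allow", "10.0.0.0/8", "tcp", range(443, 444)),
    Rule("deny-subnet", 900, "deny", "10.1.0.0/16", "tcp", range(0, 65536)),
]

print(evaluate_ingress(rules, "10.2.3.4", "tcp", 443))
print(evaluate_ingress(rules, "10.1.3.4", "tcp", 443))
print(evaluate_ingress(rules, "192.168.1.1", "tcp", 443))
```

The result names the rule that decided the verdict, which is the useful part when troubleshooting: you learn not just that traffic is blocked, but where.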

“There is also a ‘Network Analyzer’ module that automatically runs your networking configuration through a set of known best-practice filters, and shows you wherever you’re deviating.”

“Awesome!” Atkin beamed! “This is outstanding stuff!”

“And finally, last but not least, the ‘Firewall Insights’ module, which provides details about the firewall configuration.”

“This module is great for identifying firewall rules that overlap existing rules, rules with no hits, and unused firewall rule attributes such as IP address and port ranges. You can get insights into shadowed rules, overly permissive rules, allow rules with no hits, allow rules with unused attributes, allow rules with overly permissive IP address or port ranges, and deny rules with no hits during the observation period. This helps you spot misconfigurations in firewall rules that contain IPv4 or IPv6 address ranges. You can also optimize firewall rules and tighten security boundaries by identifying overly permissive allow rules and reviewing predictions about their future usage. You also get details on deny rules with hits, which can serve as first-level reconnaissance in case of a suspected attack (a high or rapidly growing hit count on a deny rule can be a sign of an attempted attack).”
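Two of these insights, shadowed rules and allow rules with no hits, are easy to picture with a short Python sketch. It is a deliberate simplification and not the Firewall Insights algorithm: here a rule counts as shadowed only when one single rule with strictly higher priority (a lower number) covers its entire source range, protocol, and port range, whereas the real insight also considers equal priority and combinations of rules. The `Rule` type, the rule names, and the hit counts are hypothetical.

```python
from dataclasses import dataclass
from ipaddress import ip_network


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int      # lower number = higher priority
    action: str        # "allow" or "deny"
    source_range: str  # IPv4 CIDR block
    protocol: str
    ports: range       # contiguous destination port range
    hits: int = 0      # made-up hit count for the observation period


def covers(outer, inner):
    """True if every packet matched by `inner` is also matched by `outer`."""
    return (
        ip_network(inner.source_range).subnet_of(ip_network(outer.source_range))
        and outer.protocol == inner.protocol
        and outer.ports.start <= inner.ports.start
        and inner.ports.stop <= outer.ports.stop
    )


def shadowed_rules(rules):
    """Return (shadowed rule, shadowing rule) name pairs."""
    pairs = []
    for rule in rules:
        for other in rules:
            if other.priority < rule.priority and covers(other, rule):
                pairs.append((rule.name, other.name))
                break  # one shadowing rule is enough to report
    return pairs


def allow_rules_without_hits(rules):
    """Return names of allow rules that never matched any traffic."""
    return [r.name for r in rules if r.action == "allow" and r.hits == 0]


rules = [
    Rule("allow-all-internal", 900, "allow", "10.0.0.0/8", "tcp", range(0, 65536), hits=120),
    Rule("allow-ssh-subnet", 1000, "allow", "10.1.0.0/16", "tcp", range(22, 23), hits=0),
    Rule("allow-https", 1000, "allow", "0.0.0.0/0", "tcp", range(443, 444), hits=5000),
]

print(shadowed_rules(rules))
print(allow_rules_without_hits(rules))
```

In this example the narrow SSH rule can never be the deciding rule, because the broader internal rule above it already matches the same traffic. That is also why it shows zero hits.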

“And, scene”, Zach bowed rather theatrically as he ended his demo and closed his browser.

Dean and Atkin both sat back in their seats, thoroughly impressed!

“Okay! Here’s how this is going to work. Dean here is going to scoot back to his laptop and switch on Network Intelligence Center. He’s going to spend the rest of the day knocking everybody’s socks off with his newfound Network Troubleshooting and Observability superpowers. In the evening, I’m going to bring him here to you, and the three of us are going to head out for beers!

You were right, Zach. After you’ve made our lives so much easier, we just can’t get by without offering you beers! So, 8 o’clock tonight, shall we say?”

“Deal!”

gCloud Genie beeped with pleasure on Zach’s laptop, beaming at the praise heaped on his master!


Just completed 18 years of my professional life. Techie by profession, traveler & blogger by passion