How Intricately Creates Cloud Footprints

Intricately
3 min read · Nov 7, 2017

Since 2013, Intricately has been building and refining our Global Sensor Network. The network includes thousands of physical deployments spread across 6 continents and 100+ countries. These sensors help us map the digital world.

At the heart of it are gateways, which we like to call “on ramps” to the Internet. Gateways are operated by Internet Service Providers like Verizon, Comcast and AT&T and give your device the ability to send and receive information online.

Intricately monitors millions of Internet Gateways

Intricately’s sensor network monitors the flow of information through these gateways, keeping track of changes as they occur. This helps our customers visualize the relationship between content owners (companies like Netflix, Facebook or Pinterest), the delivery platforms they choose (providers like AWS, Akamai or Google Cloud), and the volume of traffic each delivery platform serves (total spend on a given product).

Mapping a company’s digital footprint

Take Intricately, for example. When you type intricately.com into your browser, many things happen behind the scenes. First, our DNS provider, Amazon Route 53, connects you to our site by mapping the address you typed (intricately.com) to the IP address of our web server.

Next, our hosting provider, Heroku, makes sure our website is there when you visit it by storing all of its component parts on servers connected to the Internet. If you viewed an image or video while browsing, you interacted with our CDN provider, Amazon CloudFront.
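To make this chain concrete, here is a minimal Python sketch of how the names found in a site's DNS records can hint at the providers behind it. It is an illustration only, not a description of how our sensor network works: the `PROVIDER_MARKERS` table and the `infer_providers` helper are assumptions made for this example, and the sample record values are made up.

```python
# Illustrative markers only (an assumption for this sketch, not a complete
# or authoritative list): substrings that commonly appear in DNS record values.
PROVIDER_MARKERS = [
    ("DNS", "Amazon Route 53", ".awsdns-"),
    ("Hosting", "Heroku", ".herokuapp.com"),
    ("Hosting", "Heroku", ".herokudns.com"),
    ("CDN", "Amazon CloudFront", ".cloudfront.net"),
]


def infer_providers(record_values):
    """Return {category: set of provider names} inferred from DNS record values."""
    found = {}
    for value in record_values:
        # DNS names are case-insensitive and may carry a trailing dot.
        host = value.lower().rstrip(".")
        for category, provider, marker in PROVIDER_MARKERS:
            # Simple substring match; good enough for a sketch.
            if marker in host:
                found.setdefault(category, set()).add(provider)
    return found


# Made-up record values of the kind a lookup might return (NS and CNAME targets).
records = [
    "ns-123.awsdns-45.com.",
    "example-app.herokuapp.com",
    "d1234abcd.cloudfront.net",
]
footprint = infer_providers(records)
```

In a real lookup, the record values would come from a DNS query rather than a hard-coded list.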

Both hosting and CDN services (as well as many other cloud-based services) rely heavily on data centers to store the information they deliver across millions of websites and applications. These data centers can be either physical (if a company chooses to self-host its content) or cloud-based, depending on the provider or the region.

An overview of Intricately’s digital footprint

By monitoring the digital presence of 3.4M+ companies from website to data center, our sensor network can answer questions like:

  • Which companies use Amazon Web Services to host their online presence?
  • How much are these companies spending, and how has that changed over time?
  • Where are Facebook’s data centers located and who operates them?

Our Global Sensor Network vs Web Scraping

If you’ve purchased technology data before, you may be familiar with “web scraping,” a method commonly used to determine which SaaS applications run on a particular website. Many products have a unique tag or snippet that must be added to a site’s code for the software to run. By looking for these code snippets across millions of domains, companies like BuiltWith and SimilarTech are able to determine which products are installed on which websites.
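As a rough illustration of the snippet-matching idea, the Python sketch below scans a page's HTML for known script signatures. It is a simplified assumption of how such a scanner might look, not any vendor's actual implementation: the `SIGNATURES` table and `detect_products` helper are hypothetical, and the "Example Chat Widget" entry is invented for the example.

```python
import re

# Hypothetical signature catalog for this sketch; real tools maintain far
# larger ones. "Example Chat Widget" is an invented product.
SIGNATURES = {
    "Google Analytics": re.compile(
        r"google-analytics\.com/analytics\.js|googletagmanager\.com/gtag/js"
    ),
    "Example Chat Widget": re.compile(r"cdn\.chat-widget\.example/loader\.js"),
}


def detect_products(html):
    """Return the sorted names of products whose snippet appears in the HTML."""
    return sorted(
        name for name, pattern in SIGNATURES.items() if pattern.search(html)
    )


# A made-up page that includes one tracking snippet.
page = """
<html><head>
<script async src="https://www.googletagmanager.com/gtag/js?id=G-XXXX"></script>
</head><body>Hello</body></html>
"""
detected = detect_products(page)
```

The result is only a list of names: a snippet is either present or it is not.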

This methodology has several challenges. Let’s cover four big ones:

  1. Scraping is binary. From our previous example, web crawling may tell you that intricately.com uses Heroku for hosting, but it can’t tell you how much we’re spending or how the product is being used. Larger businesses use multiple hosting providers to power their websites, apps, APIs and more, so understanding this relationship is extremely important.
  2. Scraping leads to false positives. Adding code to a website is easy, and there is minimal performance cost to leaving it there. Many sites have code that is no longer active or code that has been added but never utilized. This means you have no insight into which SaaS applications a company is actually using.
  3. Scraping is domain-based. Take a company like Nike, which operates hundreds of domains around the world. Web scrapers will treat each of those domains as a distinct entity, inflating the count of deployments and giving a false sense of usage and breadth.
  4. Scraping misses many products. Web scraping is limited to products that can be observed from the website. Many providers like Google Cloud, Amazon Web Services and Neustar do not require code snippets, because they operate behind a web server. To identify these products AND how much a business is spending, you need to approach the problem from a whole new angle.
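The domain-counting problem in point 3 can be shown with a small Python sketch. Everything here is hypothetical: the `.example` domains, the `DOMAIN_OWNER` mapping, and the `rollup_by_company` helper are invented for illustration, and building an accurate domain-to-company mapping is the hard part in practice.

```python
from collections import defaultdict

# Hypothetical mapping from domain to owning company.
DOMAIN_OWNER = {
    "brand.example": "Brand Co",
    "brand-uk.example": "Brand Co",
    "brand-jp.example": "Brand Co",
}

# Made-up per-domain detections, as a domain-based scraper might report them.
detections = [
    ("brand.example", "Product A"),
    ("brand-uk.example", "Product A"),
    ("brand-jp.example", "Product A"),
    ("other.example", "Product A"),
]


def rollup_by_company(detections, owners):
    """Group detected products by company instead of by domain."""
    companies = defaultdict(set)
    for domain, product in detections:
        # Fall back to the domain itself when the owner is unknown.
        company = owners.get(domain, domain)
        companies[company].add(product)
    return dict(companies)


domain_level_count = len(detections)
company_level = rollup_by_company(detections, DOMAIN_OWNER)
company_level_count = sum(len(products) for products in company_level.values())
```

Counted by domain, Product A appears to have four deployments. Counted by company, it has two customers.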

Take our data with you!

Get a comprehensive view of any prospect’s digital presence directly from their website. Install our Chrome Extension today!

This post originally appeared on the Intricately Blog.

Intricately

Real-time data-driven market intelligence for Cloud, Mobile, and SaaS ecosystems