Automated Internet Speedtests for Distributed Networks

by Stuart Vassey and Colleen Hutson

The Problem

Have you ever been curious whether you're getting the Internet speeds you pay for? Do users complain that their Internet connection is slow? Do popular browser- and app-based speed tests really provide accurate measurements? We wanted to answer each of these questions.

With restaurant-based technology growing rapidly at Chick-fil-A, the distributed networks that feed our systems have become critical.

Availability monitoring is possible with a ping-based approach, but that doesn’t tell the whole story. Network devices provide additional SNMP metrics with glimpses into packet loss, latency, and real-time throughput, but this data doesn’t confirm maximum available bandwidth. In this post, we’ll describe how we monitor circuit capacity at over 2,000 locations daily.

Solution Requirements

As we set out to find a solution, there were some requirements:

  • No browser-based tests, which might fail to simulate real-world scenarios
  • No third-party software installs due to security risk
  • Minimal impact to business operations
  • Dynamic execution parameters for slow and fast circuits
  • Capable of saturating 100+ Mbps fiber circuits
  • Capable of running at 2,000+ sites simultaneously
  • Results must be viewable in a central location
  • Keep costs low by using existing tools and infrastructure

The Chick-fil-A Solution

Websites like speedof.me provide decent testing capabilities, but we wanted an automated solution that didn’t rely on a browser-based test. Software-based options are available, but we weren’t comfortable installing third-party software in our restaurant ecosystem. Instead, we built our own custom solution using a combination of AWS S3, a Content Delivery Network (CDN), PowerShell, Windows Management Instrumentation (WMI), Windows scheduled tasks, and LogicMonitor.

Rather than deploy additional testing infrastructure, we rely on our existing CDN and AWS S3 services to provide robust download and upload capabilities. Files ranging from 1.5MB up to 100MB are available to download through local points of presence (POPs) near each of our 2,000+ restaurant locations. Once the files are downloaded, they’re uploaded back to AWS S3. We schedule a job to clear the uploaded files daily.
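
As an aside, generating fixed-size test payloads like these is simple. Here's a minimal sketch (the output path is an assumption; the file names match the sizes the script below requests, and random bytes keep compression along the path from inflating results):

    # Hypothetical one-time generation of the fixed-size test files hosted at the CDN origin.
    # Random content prevents compression from skewing measured speeds.
    New-Item "C:\TestFiles" -ItemType Directory -Force | Out-Null
    foreach ($mb in 1.5, 2.5, 5, 10, 20, 25, 40, 50, 75, 100) {
        $bytes = New-Object byte[] ([int]($mb * 1MB))
        (New-Object System.Random).NextBytes($bytes)
        $name = "$($mb -replace '\.', '_')mb"    # e.g. "1_5mb", "100mb"
        [System.IO.File]::WriteAllBytes("C:\TestFiles\$name", $bytes)
    }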

Each restaurant has a Windows machine capable of running scheduled tasks, so we developed a PowerShell script that runs nightly with minimal risk of impacting our business operations. For 15 seconds, the script downloads and saves as much content from the CDN as possible; it then uploads the same files back to AWS S3 for 15 seconds. The timed results of these requests are averaged to produce estimated download and upload speeds. We found that it’s important to use appropriate file sizes to produce accurate results. Big files downloaded on a DSL circuit might fail completely and bog down the network, while small files on a fiber network can lead to inaccurate results. We dynamically increase the size of our download and upload files based on the network circuit’s capacity.
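
Scheduling the nightly run is straightforward with the built-in ScheduledTasks cmdlets. A minimal sketch, where the task name, script path, and 3:00 AM run time are illustrative assumptions:

    # Register the speedtest script as a nightly scheduled task (names, path, and time are assumptions)
    $action = New-ScheduledTaskAction -Execute "powershell.exe" -Argument "-NoProfile -ExecutionPolicy Bypass -File C:\Scripts\SpeedTest.ps1"
    $trigger = New-ScheduledTaskTrigger -Daily -At 3am
    Register-ScheduledTask -TaskName "NightlySpeedTest" -Action $action -Trigger $trigger -User "SYSTEM" -RunLevel Highest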

Once the Internet speeds have been calculated, we store the results and test status in a custom WMI namespace, which we query using an infrastructure monitoring platform, LogicMonitor.
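
Reading the results back is then a one-liner. In this sketch, "CFASpeedTest" and "SpeedTest" stand in for the custom namespace and class names, which are left as placeholders in the attached script:

    # Query the custom WMI class for the latest results (namespace and class names are placeholders)
    Get-WmiObject -Namespace "root\cimv2\CFASpeedTest" -Class "SpeedTest" |
        Select-Object Name, Download, Upload, SpeedTestFail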

From LogicMonitor, we’re able to:

  • Identify stores with the fastest/slowest circuits
  • Show long term chain-wide trends from circuit upgrades
  • Estimate circuit utilization by dividing interface throughput by circuit speed
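
The utilization estimate in that last bullet is simple arithmetic. With hypothetical numbers, an interface pushing 42 Mbps on a circuit the speedtest measured at 120 Mbps is running at roughly 35% utilization:

    # Hypothetical utilization estimate: SNMP interface throughput divided by measured circuit speed
    $throughputMbps = 42     # from SNMP interface counters
    $circuitMbps    = 120    # from the nightly speedtest
    $utilizationPct = $throughputMbps / $circuitMbps * 100    # => 35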

Challenges We Faced

  • PowerShell Overhead - At first, we were seeing unexpectedly slow speeds, which we attributed to slow hardware. Eventually we figured out that PowerShell needed to be optimized to run our I/O-intensive script efficiently. Working with Microsoft Support, we identified the visual progress bar as a major drag, even with the script running as a background task. Adding the option $progressPreference = 'silentlyContinue' resolved this issue and increased throughput tenfold at some stores.
  • Dynamic Test Scaling - A one-size-fits-all test wasn’t good at maximizing fast networks without bogging down slower circuits. We built some logic into our script to dynamically scale file sizes to provide a more flexible testing method.
  • Recording Results - By storing test results in a custom WMI namespace, we can easily query the data using our monitoring platform. We hadn’t worked with custom WMI namespaces before, so this took some trial and error. You can see our approach in the attached script.
  • Upload Site - Not many places want to receive 19GB of throw-away uploads daily. We chose AWS S3 and set up a task to regularly clear these files.
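
For that last challenge, a scheduled AWS CLI call is one simple way to clear the uploads; the bucket name below is a hypothetical stand-in:

    # Clear the day's throw-away uploads (bucket name is hypothetical)
    aws s3 rm s3://speedtest-uploads --recursive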

Results & Conclusion

With an automated speedtest solution, we’ve created a quick and easy way to understand our restaurants’ real-world network performance. Here are a few ways we’ve used this data:

  • Support staff no longer have to remotely connect to stores to run browser-based speedtests.
  • When users report slow Internet speeds, support staff can compare user experiences with historical results.
  • We can prioritize circuit upgrades for locations with slow connections.
  • We have independent and unbiased speed ratings.

In the future, we will likely migrate the speedtest task to our IoT/Edge infrastructure with some minor modifications.

Hopefully our guide helps you design your own automated speedtest solution that provides new insights for your business!

# Authors: Colleen Hutson, Geoffrey Cole, and Stuart Vassey
# -------------------------------------------------------------------------------------------------
# Purpose:
# - Script is designed to test the down/up rate of a particular device
# Process:
# - Download speed is determined by downloading files increasing in size from CDN
# - Upload speed is determined by uploading files increasing in size to an S3 bucket
# - Final up/down speed rates are saved to custom WMI paths
# -------------------------------------------------------------------------------------------------
# ------------------------------------------------ #
# Script Variables
# ------------------------------------------------ #
# Calculated download speed
$_avgDownload = 0
# Calculated upload speed
$_avgUpload = 0
# CDN address
$_cdnAddress = "<CDN ADDRESS>"
# S3 Bucket URL
$s3EndPoint = "<S3 BUCKET ENDPOINT>"
# Local file path to save downloads to
$_localDownloadPath = "<LOCAL PATH TO SAVE DOWNLOADS TO>"
# Track up/download failure (1=failed, 0=success)
$_spFailureTrack = 0
# WMI Namespace
$_wmiNamespace = "<CUSTOM WMI NAMESPACE>"
# WMI Class
$_wmiClass = "<CUSTOM WMI CLASS>"
# Silence the progress bar so it does not throttle the transfers
$progressPreference = 'silentlyContinue'
# ------------------------------------------------ #
# First calculate the average download speed
# Download is calculated first because the files that are downloaded will be uploaded to calculate upload speed
# For Loop Logic Explanation:
# - 10 files of known size located at the CDN endpoint
# - Download, in increasing file size order, all 10 files or as many as possible in 15 seconds
# ------------------------------------------------ #
# File path endings
$downloadPaths = "1_5mb", "2_5mb", "5mb", "10mb", "20mb", "25mb", "40mb", "50mb", "75mb", "100mb"
# Create the local download directory if it does not exist
If (-not (Test-Path -Path $_localDownloadPath)) {
    New-Item $_localDownloadPath -Type Directory -ErrorAction SilentlyContinue | Out-Null
}
# Create lists to store download times and downloaded file sizes
[System.Collections.ArrayList]$downloadTimes = @()
[System.Collections.ArrayList]$downloadSize = @()
try {
    $index = 0
    # While less than 15 seconds have elapsed and files remain, download a file larger than the preceding one
    For ($startTime = Get-Date; ((Get-Date) - $startTime).TotalSeconds -le 15 -and $index -lt 10; $index++) {
        # File download start time
        $downStart = Get-Date
        # Download file
        Invoke-WebRequest "$_cdnAddress/$($downloadPaths[$index])" -OutFile "$_localDownloadPath\$($downloadPaths[$index])"
        # Calculate time to download file and save to list (Out-Null discards the index ArrayList.Add returns)
        $downloadTimes.Add(((Get-Date) - $downStart).TotalSeconds) | Out-Null
        # Calculate size of the downloaded file in MB and save to list
        $downloadSize.Add((Get-Item -Path "$_localDownloadPath\$($downloadPaths[$index])").Length / 1024 / 1024) | Out-Null
    }
    # Calculate average download speed in megabits per second (MB / seconds * 8)
    $downloadSum = 0
    for ($i = 0; $i -lt $downloadTimes.Count; $i++) {
        $downloadSum = $downloadSum + ($downloadSize[$i] / $downloadTimes[$i] * 8)
    }
    $_avgDownload = $downloadSum / $downloadTimes.Count
}
catch {
    $_spFailureTrack = 1
}
# ------------------------------------------------ #
# Second calculate the average upload speed
# For Loop Logic Explanation:
# - Use files that were downloaded for calculating the average download speed
# - Upload, in increasing file size order, all files that were downloaded or as many as possible in 15 seconds
# ------------------------------------------------ #
try {
    # Get files from the download folder and sort from smallest to largest
    $uploadPaths = Get-ChildItem "$_localDownloadPath\*mb" | Sort-Object -Property Length
    # Create lists to store upload times and uploaded file sizes
    [System.Collections.ArrayList]$uploadSize = @()
    [System.Collections.ArrayList]$uploadTimes = @()

    $index = 0
    # While less than 15 seconds have elapsed and files remain, upload a file larger than the preceding one
    for ($startTime = Get-Date; ((Get-Date) - $startTime).TotalSeconds -le 15 -and $index -lt $uploadPaths.Count; $index++) {
        # Start time for upload
        $upStart = Get-Date
        # Create S3 URL and upload file
        $s3URL = "$s3EndPoint/$env:COMPUTERNAME/$($uploadPaths[$index].Name).txt"
        Invoke-RestMethod -Uri $s3URL -Method Put -InFile $uploadPaths[$index].FullName
        # Calculate time to upload file and save to list
        $uploadTimes.Add(((Get-Date) - $upStart).TotalSeconds) | Out-Null
        $uploadSize.Add($uploadPaths[$index].Length / 1024 / 1024) | Out-Null
    }

    # Calculate average upload speed in megabits per second (MB / seconds * 8)
    $uploadSum = 0
    for ($i = 0; $i -lt $uploadTimes.Count; $i++) {
        $uploadSum = $uploadSum + ($uploadSize[$i] / $uploadTimes[$i] * 8)
    }
    $_avgUpload = $uploadSum / $uploadTimes.Count
}
catch {
    $_spFailureTrack = 1
}
# ------------------------------------------------ #
# Third save the up/down speeds to custom WMI Objects
# ------------------------------------------------ #
# Check if the custom WMI namespace exists and create it if not
If (Get-WmiObject -Namespace "root\cimv2" -Class "__NAMESPACE" | Where-Object { $_.Name -eq $_wmiNamespace }) {
    Write-Host "The root\cimv2\$_wmiNamespace WMI namespace exists."
} Else {
    $wmi = [wmiclass]"root\cimv2:__Namespace"
    $newNamespace = $wmi.CreateInstance()
    $newNamespace.Name = $_wmiNamespace
    $newNamespace.Put()
}
# If the WMI class already exists, clean up any existing instances
If (Get-WmiObject -List -Namespace "root\cimv2\$_wmiNamespace" | Where-Object { $_.Name -eq $_wmiClass }) {
    $GetExistingInstances = Get-WmiObject -Namespace "root\cimv2\$_wmiNamespace" -Class $_wmiClass
    If ($Null -ne $GetExistingInstances) {
        Remove-WmiObject -Namespace "root\cimv2\$_wmiNamespace" -Class $_wmiClass
    }
}
# Publish WMI properties
$_subClass = New-Object System.Management.ManagementClass("root\cimv2\$_wmiNamespace", [String]::Empty, $null)
$_subClass["__CLASS"] = $_wmiClass
$_subClass.Qualifiers.Add("Static", $true)
$_subClass.Properties.Add("Name", [System.Management.CimType]::String, $false)
$_subClass.Properties["Name"].Qualifiers.Add("Key", $true) # A key qualifier must be defined to execute the 'Put' command
$_subClass.Properties.Add("Upload", [System.Management.CimType]::Real64, $false)
$_subClass.Properties.Add("Download", [System.Management.CimType]::Real64, $false)
$_subClass.Properties.Add("SpeedTestFail", [System.Management.CimType]::Real64, $false)
$_subClass.Put()
# Remove any existing 'Bandwidth' instance, then publish the latest results
$keyvalue = "Bandwidth"
$Filter = 'Name = "' + $keyvalue + '"'
$Inst = Get-WmiObject -Class $_wmiClass -Filter $Filter -Namespace "root\cimv2\$_wmiNamespace"
$Inst | Remove-WmiObject
$WMIURL = 'root\cimv2\' + $_wmiNamespace + ':' + $_wmiClass
$PushDataToWMI = ([wmiclass]$WMIURL).CreateInstance()
$PushDataToWMI.Name = $keyvalue
$PushDataToWMI.Upload = $_avgUpload
$PushDataToWMI.Download = $_avgDownload
$PushDataToWMI.SpeedTestFail = $_spFailureTrack
$PushDataToWMI.Put()