Bringing Internet Speed Data Online Part 1

Surveying the Current Data Landscape

Marc Richardson
City as a Service
Sep 21, 2020

Good internet speed data is hard to find. Using Stae’s Civic Intelligence Platform, we can make internet speed data more accessible to public officials, community advocates, and others working to correct digital inequities.

In the wake of COVID-19, local governments everywhere have scrambled to facilitate the transition to a remote lifestyle in which socializing, working, and learning increasingly occur online. In a rush to adjust to this new reality, some governments have run headlong into a hard truth: many communities, both rural and urban, continue to struggle with poor internet connectivity. Local decision-makers and researchers who want to understand why are hindered, in part, by a lack of reliable, transparent, up-to-date data.

As part of our mission to help civic leaders leverage their data into efficient workflows that save resources and drive insight, Stae is working with Measurement Lab (M-Lab), a non-profit that aims to advance Internet research by providing free, open Internet measurement data. Together we are exploring how to make reliable internet speed data more accessible, portable, and interoperable alongside other community indicators. The goal is to support public officials and other stakeholders who can use the data to bridge the digital divide at the local level.

This two-part blog highlights the work that Stae has begun with M-Lab thus far and looks to the horizon for how communities can partner with Stae to bring better internet to their residents. Part 1 introduces the problems, as well as the opportunities, inherent in the current data landscape, while Part 2 discusses how communities, partnered with Stae, can overcome their data obstacles on the path to better internet connectivity.

The Devil Is in the Data: Why Current Data Doesn’t Cut It

Where does a city or county government look to find internet speed data for their local community? The Federal Communications Commission (FCC) publishes national internet data, including some speed metrics, from its Form 477 survey, but that data is notoriously unreliable. For example, skeptics note that the survey allows internet service providers (ISPs) to self-report the speeds they offer in certain areas, which can significantly deviate from the actual speeds experienced by residents.¹

Map of FCC Form 477 Data.

The FCC also collects and publishes speed data through its Measuring Broadband America (MBA) program. Yet, some stakeholders argue that the program “does not fully meet the best practices for performance measurement … [because it] lacks comprehensive transparency that would allow a third party to replicate and independently verify MBA test results.”²

The unreliability of the FCC’s data is problematic. Decisions on where to allocate funds to support local internet initiatives, such as the deployment of high-speed broadband networks, are often based on maps created with Form 477 data.³ Stakeholders argue that, due to gaps in the data, some communities with poor connectivity, many of them low-income areas with marginalized populations, are given insufficient funding.⁴

Fundamentally, flawed FCC data makes local decision-makers’ jobs more difficult. Civic leaders who want to improve local internet might not be able to identify areas in their communities disadvantaged by poor connectivity. Local regulators might be unable to audit ISPs that have pledged to improve service. These obstacles have led to some clever data collection initiatives from state and local governments (more on that in Part 2). Yet communities that lack the resources or capability to orchestrate such initiatives might be left in the lurch.

Towards a Better Alternative: M-Lab’s Internet Connectivity Data

One promising alternative to FCC data is M-Lab. M-Lab is an open-source project that aims to, among other things, “provide an open, verifiable measurement platform for global network performance” and “host the largest open Internet performance dataset on the planet.” Its members design and maintain open-source internet performance tests, such as the Network Diagnostic Tool (NDT), which measures the download and upload speed of an internet connection. M-Lab’s speed test is the test that appears when you google “speed test.” Suffice it to say, it collects a lot of connectivity data globally.

M-Lab Speed Test Available Online.

What differentiates M-Lab’s NDT data from the FCC’s data is not only the sheer volume of data but also the robustness and transparency of its methods of measurement. As mentioned above, the performance metrics that ISPs self-report in the Form 477 survey can differ from the actual speeds experienced. The FCC’s MBA data is not self-reported but is collected only annually from a sample of households and measures connectivity in a way that might not provide a complete picture of the internet experience.⁵

M-Lab’s NDT data, collected through the use of multiple measurement methods, provides a fuller picture of a connection’s quality.⁶ M-Lab’s position is that open, transparent, and inspectable measurements and methods will produce more reliable and trustworthy data.

M-Lab makes its data freely available to the public, but not necessarily in a manner that is intuitive for the non-technical.⁷ In addition, along certain dimensions (such as geography), M-Lab data is not as granular as local decision-makers might need.

To improve the accessibility and utility of this valuable source of open data, Stae has begun working with M-Lab to make its data available as a “community source” on Stae’s Civic Intelligence Platform (via an API that M-Lab is developing). This source would be accessible to any local government or community user on Stae’s platform and could be automatically filtered to a user’s geography or jurisdiction. Stae is also exploring how we might support community use of M-Lab’s tools to produce and ingest hyperlocal, community-specific internet data (such as more precise location data and pricing data for local internet packages) in a standardized way.
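To make the idea of filtering a shared source to a jurisdiction concrete, here is a minimal sketch in Python. It is not Stae’s or M-Lab’s API: the record fields, the `BoundingBox` stand-in for a jurisdiction boundary, the `summarize` helper, and the sample values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class SpeedTest:
    """One hypothetical speed-test record (field names are illustrative)."""
    latitude: float
    longitude: float
    download_mbps: float
    upload_mbps: float


@dataclass(frozen=True)
class BoundingBox:
    """A rectangular stand-in for a jurisdiction boundary (an assumption)."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, test: SpeedTest) -> bool:
        return (self.south <= test.latitude <= self.north
                and self.west <= test.longitude <= self.east)


def summarize(tests, area):
    """Return the test count and median speeds for tests inside `area`."""
    local = [t for t in tests if area.contains(t)]
    if not local:
        return {"count": 0, "median_download_mbps": None,
                "median_upload_mbps": None}
    return {
        "count": len(local),
        "median_download_mbps": median(t.download_mbps for t in local),
        "median_upload_mbps": median(t.upload_mbps for t in local),
    }


# Made-up sample records, not real measurements.
SAMPLE_TESTS = [
    SpeedTest(33.75, -84.39, 48.0, 9.5),
    SpeedTest(33.76, -84.40, 12.0, 2.0),
    SpeedTest(33.77, -84.38, 95.0, 11.0),
    SpeedTest(40.71, -74.00, 210.0, 35.0),  # falls outside the box below
]
EXAMPLE_AREA = BoundingBox(south=33.6, west=-84.6, north=33.9, east=-84.2)

if __name__ == "__main__":
    print(summarize(SAMPLE_TESTS, EXAMPLE_AREA))
```

A real jurisdiction is rarely a rectangle, so a production version would likely test points against an actual boundary polygon. The median is used here rather than the mean so that a handful of very fast connections does not mask slow service elsewhere in the area.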

In Part 2 of this blog, the initial efforts toward these goals are discussed along with the next steps that Stae and M-Lab are taking to help communities provide better internet experiences. Read on!

If Stae can support your local connectivity initiative with granular connection quality data, or if you are facing challenges working on another data-driven public-sector problem, please reach out to us at we@stae.co — we may be able to help.

Endnotes:

  1. A study by the Institute for Local Self-Reliance noted how the survey incentivizes “[l]arge, de facto monopoly providers … to overstate their coverage and territory to hide the unreliable and slow nature of their service in many communities.” For a striking comparison between FCC data and other internet data sources, see this visualization produced by the Georgia Broadband Deployment Initiative.
  2. M-Lab donated resources to the FCC MBA program for many years, hosting the program’s “off-net” servers until late 2019.
  3. For example, in February 2020, the FCC approved a $20.4 billion Rural Digital Opportunity Fund that directs, based on the FCC’s data, $16 billion to Census blocks that have no internet service and $4.4 billion to Census blocks deemed “partially served.”
  4. See, e.g., S. Derek Turner, Digital Denied: The Impact of Systemic Racial Discrimination on Home-Internet Adoption (Free Press: December 2016), https://www.freepress.net/sites/default/files/legacy-policy/digital_denied_free_press_report_december_2016.pdf; Robert Fairlie, “Race and the Digital Divide,” UC Santa Cruz: Department of Economics, UCSC, September 11, 2014, https://escholarship.org/uc/item/48h8h99w.
  5. Specifically, MBA data measures connectivity only to servers within the last mile of the ISP’s network.
  6. Specifically, M-Lab’s NDT data provides measurements of the reliability of the transmission control protocol (TCP) and its connections to servers that reside outside of the last mile of the ISP’s network, whereas other popular tests seek to measure the total link capacity of a connection.
  7. Currently, M-Lab stores the raw data on Google Cloud and, in a more structured and accessible form, in Google’s BigQuery.
