Citations: Tracking the Flow of Policy

Anastassia
Published in FiscalNoteworthy
7 min read · Jul 30, 2020

FiscalNote’s Approach to Finding Connections in U.S. Federal Data

The recently enacted CARES Act created a 2.2 trillion-dollar stimulus package to support the US economy in light of COVID-19. To get this legislation passed, numerous individuals and organizations called, emailed and tweeted their legislators to share their opinion. The Act mentioned the United States Code — the primary body of US laws — more than 200 times.

Down the road, these changes will affect new regulations. But before these new rules can take effect, the public will have another chance to voice their opinions during the open comment period. While this process may seem opaque, it is as important as contacting your legislators; over the last few years, we’ve seen the public get increasingly involved in this step. At FiscalNote, we’ve found that effective organizations engage at both stages to shape policy. To track this process end to end, we built a Policy Knowledge Graph that connects bills, laws and regulations.

Background

Overview of policy-making steps

The United States Code (USC) is the main body of US federal laws. Bills propose modifications to the code by editing existing sections or adding new ones. If a bill is enacted, it becomes a Public Law (PL), an intermediate step before its changes are encoded into the USC.

Example of Modifying Language from HR 7147

While legislation sets budgets and provides high-level direction, it usually lacks technical details. These are specified by executive agencies in regulations and encoded in the Code of Federal Regulations (CFR). For the regulations to change, two things almost always must happen. First, the agency must specify which USC sections authorize the change. Second, it must hold an open comment period during which the public can comment on the proposed changes; a set of proposed changes and its related documents is grouped into a docket. The Administrative Procedure Act requires agencies to let commenters express their concerns before the final regulation is issued.

One of the more complex statements for 10 CFR 2

Connecting the pieces

If you care about a certain policy, you’ll likely need to track it from a proposed bill to a promulgated rule. To analyze this process, we created a representation within a knowledge graph that captures the steps of this process: the proposed modification between bills and the USC, the transformation from bills to Public Laws to the USC, the authority statements between bills and Public Laws and the CFR, and the modifications to the CFR proposed in Dockets. We modeled the data with the Neo4j graph database. The policy entities represented in the knowledge graph can be visualized as below:

Components of our Knowledge Graph

A graph is a way to gather and link information for the purpose of uncovering new insights. It is composed of nodes and edges: each node describes the entity it represents, while each edge represents a semantic relationship between two nodes.

The nodes in our graph represent documents, such as bills, public laws, statutes and individual sections of the USC and CFR. The edges represent the possible relationships between documents. For example, there can be an authorizes edge between a USC section and a docket, since the USC section provides the authority for the regulation. We use the authorizes relationship for both USC-CFR and USC-Docket: the Docket needs authorization to change the CFR, but once the Docket is finalized, we want to track the direct USC-CFR relationship. We’ll discuss how we designed our graph to support a policy knowledge graph in a future post.

The graph is a natural representation for this dataset because it reflects how policy flows through different parts of the government. It also allows us to flexibly query for relationships between different document types. With a simple query, we can answer a question like: “What dockets are 2–3 edges away from a given bill?” The same question would be more difficult to answer using a traditional database.
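To make the "2–3 edges away" question concrete, here is a minimal sketch in plain Python rather than in our Neo4j setup. It uses a breadth-first search over a toy edge list built from the S 2339 example discussed later in this post. The relationship names, the node-label prefixes, and the two USC placeholders (`section-A`, `section-B`) are assumptions made for illustration, not our production schema.

```python
from collections import deque

# Toy edge list. Relationship names and the two USC placeholders are
# hypothetical; only the bill, CFR part and docket ids come from the post.
EDGES = [
    ("Bill:S 2339", "modifies", "USC:section-A"),
    ("Bill:S 2339", "modifies", "USC:section-B"),
    ("USC:section-A", "authorizes", "CFR:34 CFR 668"),
    ("USC:section-B", "authorizes", "CFR:34 CFR 668"),
    ("Docket:ED-2015-OPE-0020", "modifies", "CFR:34 CFR 668"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map so paths can be followed either way."""
    adjacency = {}
    for source, _relation, target in edges:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)
    return adjacency


def nodes_within(adjacency, start, min_hops, max_hops, prefix):
    """Return nodes of one type whose shortest distance from start is in range."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return sorted(
        node
        for node, hops in distances.items()
        if min_hops <= hops <= max_hops and node.startswith(prefix)
    )


ADJ = build_adjacency(EDGES)
dockets = nodes_within(ADJ, "Bill:S 2339", 2, 3, "Docket:")
print(dockets)
```

In a graph database the same question is typically a single variable-length path query; the sketch only shows why the traversal is cheap once the data is stored as nodes and edges.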

To create the graph, we automatically collected the data from several sources, because it is not available in a readily accessible format like a CSV. We created custom scrapers for each data source — since each dataset uses a different format. After parsing several thousand documents, we had over 100,000 nodes (i.e. Bills, Dockets, PLs, USC and CFR sections) and over 750,000 connections between them. As far as we know, this is the first effort to connect all the data in this way. Using our graph, we conducted several types of analysis:

End-to-End Connections: Dockets to Bills

Our initial motivation for this project was to discover connections between upcoming legislation and regulations. For example, if you were interested in the 2017 regulatory docket ED-2015-OPE-0020, “Program Integrity and Improvement” (PII), shown on the left below, you could discover that the recent bill S 2339 (Higher Education Reform and Opportunity Act of 2019), shown on the right, modifies two sections of the USC that both authorize a CFR part (34 CFR 668, Student Assistance) that is, in turn, modified by PII.

Dockets to Bills

Or, we can follow this relationship in the opposite direction: if you were previously interested in HR 2192 from the 114th Congress (2015–2016), the Protections and Regulation for Our Students Act or PRO Students Act, you would discover PII in 2017, since HR 2192 modified many of the code sections that authorize what PII is modifying.

Bills to Dockets

Thus, the Knowledge Graph gives us a new mechanism for discovery. Using your past activity on legislation and regulations, we can recommend new policies that you should act on.

Discovering Important Sections

We can use the connections to analyze macro trends in the policy-making process. For example, we were interested in understanding which sections of the USC are frequently targeted by bills. To calculate this we counted the edges between bills and USC sections during the last two sessions. We found that the three most popular sections were:

  • 42 USC 1395x: General Provisions around Medicare (cited by 99 bills)
  • 18 USC 922: Unlawful Acts Regarding Firearms (cited by 92 bills)
  • 8 USC 1101: General Provisions around Immigration (cited by 88 bills)

These matches are not surprising as all three cover popular and contentious issues. We could also redo this analysis targeting bills sponsored by a particular legislator to understand which issues they care about the most.
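The counting step above can be sketched in a few lines of Python. The edge list below is invented for illustration (the bill ids are hypothetical and the counts do not reproduce the real numbers), and the optional `bills` filter stands in for restricting the analysis to one legislator's bills.

```python
from collections import Counter

# Hypothetical (bill, USC section) edges; bill ids are made up for the example.
BILL_USC_EDGES = [
    ("bill-1", "42 USC 1395x"),
    ("bill-2", "42 USC 1395x"),
    ("bill-2", "18 USC 922"),
    ("bill-3", "42 USC 1395x"),
    ("bill-3", "8 USC 1101"),
    ("bill-4", "18 USC 922"),
]


def most_cited_sections(edges, top_n=3, bills=None):
    """Rank USC sections by the number of distinct bills that cite them.

    If `bills` is given, only edges from those bills are counted, e.g. the
    bills sponsored by a particular legislator.
    """
    unique_edges = set(edges)  # a bill counts once per section
    counts = Counter(
        section
        for bill, section in unique_edges
        if bills is None or bill in bills
    )
    # Sort by descending count, then by section name for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:top_n]


ranking = most_cited_sections(BILL_USC_EDGES)
print(ranking)
```

Passing `bills={"bill-2", "bill-3"}` narrows the ranking to that subset, which is how the per-legislator variant would work under these assumptions.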

Dramatic view of current session bills proposing modifications to 42 USC 1395x

Issues That Go Together

We hear blog posts need fun graphics

Finally, we can use the graph to find correlations between which USC sections were edited together. We discovered some interesting examples:

42 USC 300gg-4: (Prohibiting discrimination against individual participants and beneficiaries based on health status), and 29 USC 1185: (Standards relating to benefits for mothers and newborns) are often modified together because they concern different aspects of (maternal) medical leave.

10 USC 2130a (Financial aid for Nurse Officer Candidates) and 37 USC 910 (Payment to Reserve members of the armed services) are a similar pair and part of a larger group of sections that cover military funding.

For other sections, the connections may be less obvious. For example, 49 USC 50101 and 23 USC 313 both talk about incentives for using steel and manufactured goods made in America, but the first deals with the Federal Aviation Administration and the latter with the Federal Highway Administration. While they appear separately a few times, most legislation seems to cover “transportation” topics together.

While some of these examples may seem obvious, they may be difficult to discover by hand (especially if the pairs come from very different titles). We can use this data to find the cases where one section is edited but the other isn’t, to help discover which aspects of policy work together in unexpected ways. In the future, we will expand the method to automatically find clusters of similar sections. It will also be interesting to complement this technique with text-based similarity.
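One way to score how often two sections are edited together is sketched below. It uses Jaccard similarity over the sets of bills that modify each section; this measure is an assumption chosen for illustration, not necessarily the correlation we computed, and the bill-to-section mapping is invented.

```python
from collections import Counter
from itertools import combinations

# Hypothetical mapping from bill id to the USC sections it modifies.
BILL_SECTIONS = {
    "bill-1": {"42 USC 300gg-4", "29 USC 1185"},
    "bill-2": {"42 USC 300gg-4", "29 USC 1185"},
    "bill-3": {"49 USC 50101", "23 USC 313"},
    "bill-4": {"49 USC 50101"},
}


def co_modification_scores(bill_sections):
    """Jaccard score per section pair: bills editing both / bills editing either."""
    section_counts = Counter()
    pair_counts = Counter()
    for sections in bill_sections.values():
        section_counts.update(sections)
        # Sort so each pair has one canonical key regardless of set order.
        pair_counts.update(combinations(sorted(sections), 2))
    scores = {}
    for (first, second), together in pair_counts.items():
        either = section_counts[first] + section_counts[second] - together
        scores[(first, second)] = together / either
    return scores


scores = co_modification_scores(BILL_SECTIONS)
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 2))
```

A score of 1.0 means the two sections are always edited together in this toy data. A score below 1.0 flags pairs where one section is sometimes edited without the other, which are the cases worth a closer look.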

This is only the beginning. These are just a few examples of the kind of insight we can infer from the connections between policies. Using our growing graph of over 100,000 nodes, we look forward to discovering other macro and micro trends in the world of legislation and regulations.
