AI in Government: Part Two — Data Mining

RS21
RS21 Blog

This is the second installment of a three-part series on how Artificial Intelligence (AI) can be leveraged to build mission-directed insights for government agencies.

If you haven’t read part one of the series, you should. It explains how we use AI as a tool to help government agencies become more efficient and effective. It also gives an overview of how cutting-edge Natural Language Processing AI can make the once-tedious task of contextual information extraction more approachable.

In this post, we continue our discussion of how AI enhances individual capabilities with the next concept of AI: Data Mining.

We use data mining techniques to extract the information you need from complex data to make better and faster decisions based on our best scientific understanding. The flexibility of our data mining toolbox allows us to address a variety of issues and support the missions and objectives of many federal agencies.

Why A Powerful Data Mining Toolkit is Essential

Data mining is a long-standing method, but in the last 10-15 years it has become an important subfield of computer science. This is in part due to the flood of data generated each day and the ability to quickly process it through new AI approaches.

We can think of data mining as the ability to semi-automatically or automatically detect patterns, correlations, and anomalies in data. A classic example comes from supermarkets, which mine sales data for associations between what shoppers buy and what is advertised, so that weekly fliers highlight the deals that bring customers into stores.
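To make the supermarket example concrete, here is a minimal sketch of association mining over item pairs. The baskets, item names, `pair_rules` function, and the 0.3 support threshold are all invented for illustration; this is not a description of any retailer's system or of our production tooling. It computes the three standard measures for a rule "X implies Y": support (how often the pair appears), confidence (how often Y appears when X does), and lift (how much more often than chance).

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets; every item name is made up.
baskets = [
    {"chips", "salsa", "soda"},
    {"chips", "salsa"},
    {"chips", "soda"},
    {"bread", "milk"},
    {"bread", "milk", "chips"},
    {"salsa", "chips", "bread"},
]

def pair_rules(baskets, min_support=0.3):
    """Return (antecedent, consequent, support, confidence, lift) tuples
    for every item pair whose support meets the threshold, best lift first."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # pair is too rare to act on
        # Each frequent pair yields two directed rules: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            lift = confidence / (item_counts[y] / n)
            rules.append((x, y, support, confidence, lift))
    return sorted(rules, key=lambda rule: rule[4], reverse=True)

for x, y, support, confidence, lift in pair_rules(baskets)[:4]:
    print(f"{x} -> {y}: support={support:.2f}, "
          f"confidence={confidence:.2f}, lift={lift:.2f}")
```

A lift above 1 suggests the two items are bought together more often than chance would predict, which is the kind of signal a flier designer could act on. Real rule miners (Apriori, FP-Growth) extend the same idea to item sets larger than pairs.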

Cyber security. © geralt/Pixabay

Banks use data mining to identify fraudulent behavior. Using factors like location and an account holder’s spending habits, a bank can estimate whether new transactions on a particular account are reasonable. For example, if the bank noticed that a college student was spending money like a CEO, it would flag the activity as a concern and investigate further.
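The idea of comparing a new transaction against an account's own history can be sketched with a simple z-score check. This is an assumption-laden toy: the amounts, the `flag_unusual` function, and the threshold of 3 standard deviations are hypothetical, and it looks only at transaction amount, while a real fraud model would also weigh location, merchant, timing, and more.

```python
from statistics import mean, stdev

def flag_unusual(history, new_amounts, threshold=3.0):
    """Flag new amounts that sit more than `threshold` standard
    deviations from the account's historical mean."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        raise ValueError("history has no variation; z-scores are undefined")
    flagged = []
    for amount in new_amounts:
        z = (amount - mu) / sigma
        if abs(z) > threshold:
            flagged.append((amount, round(z, 1)))
    return flagged

# Hypothetical spending history for a college student's account (USD).
student_history = [12.5, 8.0, 23.4, 15.0, 9.9, 31.0, 18.2, 11.3]
new_transactions = [14.0, 27.5, 950.0]

for amount, z in flag_unusual(student_history, new_transactions):
    print(f"Review transaction of ${amount:.2f} (z-score {z})")
```

Only the 950.00 purchase is flagged here, because it is far outside anything this account has done before. The two smaller purchases pass as ordinary.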

Because effective data mining requires many types of algorithms, our data mining toolkit is quite expansive. Our data science team uses regression, classification, ranking, clustering, association, anomaly detection, and dimension reduction techniques to identify and extract patterns in data that answer important questions, like which intelligence to prioritize in counter-terrorism activities or how responses to particular drugs differ based on genetics and environmental conditions.

RS21 Data Mining Toolkit

To give you a couple of straightforward examples, below are some questions and the approach each one requires.

Regression, Classification, and Anomaly Problems. © Alicia Frudakis/RS21

How It Works

There are more approaches we could consider, but those described above primarily drive our data mining methods at RS21. The beautiful thing about using this suite of techniques to detect signals in a data set is that each approach produces different insights, which can be woven together to tackle more complex questions and produce more holistic understandings.

Let’s take a single data set to illustrate how this works. Below, we consider youth violence in a community and apply our data science toolkit to illustrate how each approach leads to a different insight.

Each of these questions produces different information from a similar data set — that’s really powerful!

RS21 Data Mining Toolkit example data set. © RS21

How Data Mining Enhances Government Capabilities

“You can have data without information, but you cannot have information without data.”
— Daniel Keys Moran

Hiring and Retention

Due to the high costs of recruitment and hiring new personnel, it’s important for government agencies to not only hire qualified candidates, but also to hire people who will stay with the agency for extended periods of time.

Through data mining, we can help identify the recruits who are likely to stay and pinpoint the most important incentives for employee retention. Given the size of the workforce many government agencies manage, this information is critical to building a productive team with low turnover.

Mental Health

Members of the military work in high-stress environments, which can contribute to the development of mental health disorders. To support its troops, the Army regularly conducts assessments to gauge how personnel are handling their postings. These assessments generate a lot of data, but in most cases the data is not jointly analyzed. Using association rules, the Department of Defense could begin to mine this data for early indicators of mental health concerns and then provide early interventions to ensure personnel’s safety, wellbeing, and ability to serve.

Taking it a step further, assessments could also be used to identify which applicants are most resilient to the stresses of deployments and ensure the best post placements for each.

© Alicia Frudakis/RS21

Safety and Crime Prevention

The Bureau of Alcohol, Tobacco, Firearms, and Explosives (ATF) often deals with large amounts of dossier data on offenders, their crimes, and known associates. Each dossier contains massive amounts of straightforward information, but the patterns between dossiers are less intuitive.

When we string together information in the dossiers and apply anomaly detection algorithms, one of the techniques in our data mining toolkit, we can help ATF better identify how a group of people (e.g., a gang or a mob) might be involved in criminal activity.

Public Health and Nutrition

The Center for Nutrition Policy and Promotion (CNPP) provides dietary information and guidelines to improve the health and wellbeing of Americans. This is not an easy task as every person has unique dietary needs. Some people need more protein, while others need more carbohydrates; some people can’t have sugar, and others have allergies. There could almost be a custom diet for every person on the planet.

To help simplify and support CNPP’s work in promoting good health, we can use classification and clustering techniques to identify individuals’ dietary attributes (e.g., metabolic rate, glucose production, and cholesterol) and then suggest appropriate diets.

Similarly, classification and clustering techniques could help the Department of Health and Human Services (DHHS) identify the most appropriate medical services for each individual patient based on health histories and responses to medications. Clustering is especially helpful in analyzing disparate data, such as environment and lifestyle information, that may be pertinent to effectively treating chronic conditions, like diabetes or pulmonary disease.
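As a rough illustration of clustering, the sketch below groups people by two dietary attributes using a bare-bones k-means. Everything in it is an assumption made for the example: the `kmeans` function, the deterministic starting centroids, and the six data points (imagined as metabolic rate and glucose scores already scaled to comparable ranges). It is not CNPP or DHHS data, and a production analysis would use a tested library and many more attributes.

```python
import math

def kmeans(points, k, iterations=20):
    """Bare-bones k-means. Starts from evenly spaced points in the input
    (a simplification; real implementations use smarter initialization)."""
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids

# Hypothetical (metabolic_score, glucose_score) pairs, one per person.
people = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (4.0, 4.2), (4.3, 3.9), (3.8, 4.1)]

labels, centroids = kmeans(people, k=2)
for person, label in zip(people, labels):
    print(f"{person} -> dietary group {label}")
```

Each resulting group could then be matched to a diet or service profile. Note that clustering needs no labels up front, which is why it suits disparate data where the categories are not known in advance.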

© Kevin Ku/Unsplash

Labor and Employment

Data comparison allows users to determine the effect of policies or how certain parties rank within a group. Ranking techniques allow large data sets to be sorted by multiple variables to determine value. Using ranking to sort data allows an agency to direct its analytical efforts to the most successful policies and reinforce positive change.

Dealing with everything from occupational safety and wages to reemployment services and economic statistics, the Department of Labor (DOL) strives to promote employment opportunities and work-related benefits, rights, and safety. The DOL collects information that can be analyzed via ranking algorithms to identify occupations that are the safest, highest paying, and most likely to grow.

There is also an opportunity to dig deeper. For example, ranking algorithms can identify the types of reemployment services that best position job seekers for favorable outcomes, helping the DOL make faster, smarter policy and service decisions.
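One simple way to rank across multiple variables is a weighted score over normalized criteria, sketched below. The occupations, figures, field names, and weights are placeholders invented for this example and are not DOL statistics; the `min_max` and `rank` helpers are likewise hypothetical. Choosing the weights is a policy judgment, not something the algorithm decides.

```python
def min_max(values, higher_is_better=True):
    """Scale values to the 0-1 range, flipping them when lower is better."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5] * len(values)  # no spread, so every record ties
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1 - s for s in scaled]

def rank(records, criteria):
    """Rank records by a weighted sum of normalized criteria.
    criteria maps field name -> (weight, higher_is_better)."""
    scores = [0.0] * len(records)
    for field, (weight, higher_is_better) in criteria.items():
        column = min_max([r[field] for r in records], higher_is_better)
        scores = [s + weight * c for s, c in zip(scores, column)]
    ranked = sorted(zip(records, scores), key=lambda pair: pair[1], reverse=True)
    return [(record["name"], round(score, 3)) for record, score in ranked]

# Placeholder data: injuries per 100 workers, median wage (USD), growth (%).
occupations = [
    {"name": "Occupation A", "injury_rate": 1.0, "median_wage": 90000, "growth_pct": 10.0},
    {"name": "Occupation B", "injury_rate": 5.0, "median_wage": 50000, "growth_pct": 2.0},
    {"name": "Occupation C", "injury_rate": 3.0, "median_wage": 70000, "growth_pct": 12.0},
]

# Assumed weights: safety counts slightly more than wage or growth.
criteria = {
    "injury_rate": (0.4, False),  # lower is better
    "median_wage": (0.3, True),
    "growth_pct": (0.3, True),
}

for name, score in rank(occupations, criteria):
    print(f"{name}: {score}")
```

The same pattern would apply to reemployment services: swap in outcome measures such as time to rehire or wage recovery as the criteria, then compare how the ranking shifts as the weights change.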

Leveraging Data Mining for Your Agency

Data should be your ally. Don’t let it sit in silence. The government space is inundated with data waiting to be uncovered and used to help inform decisions. Whether it’s identifying intervention strategies to mitigate the onset of mental health disorders or enhancing health and nutrition in the United States, the data is there waiting. Let’s unleash it!

The next installment of this blog series focuses on AI signal processing. Take a look at how we process signals and use data science to identify meaningful patterns.

RS21 develops interactive data analytics and visualization products.
We blend an advanced computational capability with a network of world-class experts to provide actionable insights to government organizations including:

  • Department of Homeland Security (DHS)
  • Cybersecurity and Infrastructure Security Agency (CISA)
  • Federal Emergency Management Agency (FEMA)
  • Transportation Security Administration (TSA)
  • United States Coast Guard (USCG)
  • United States Agency for International Development (USAID)
  • National Laboratories: Argonne National Laboratory, Idaho National Laboratory, Los Alamos National Laboratory, and Sandia National Laboratories

RS21 is a HUBZone Certified Small Business + GSA Schedule 70 Company

RS21 is revolutionizing decision-making with data + AI. We believe the power of data can unleash human potential and make a better world. Visit www.rs21.io.