The Data Briefing: Can PhD APIs Meet the Challenge of Digitally Transforming Government?

Although I don’t agree with all of Mr. Shetler’s opinions, the former Australian government’s digital transformation lead has some good observations on the challenges facing government. Mr. Shetler recently resigned because of the controversy surrounding Centrelink’s automated compliance system, which he claims had an unacceptably high error rate. As Mr. Shetler argues, the system needs human oversight because of alleged errors in its algorithms and an over-reliance on the quality of the Australian government’s data. Mr. Shetler’s assertions have been challenged by other government officials, who state that the algorithms underwent rigorous testing.

I mention the Centrelink story as a cautionary example of the challenges of government digital transformation. As more governments digitally transform themselves, they will rely increasingly on Application Programming Interfaces (APIs) to build their digital platforms. APIs are a great way to access Federal government information and to quickly add features to applications. I’ve written before about how pairing Census data that has location information with a third-party mapping service makes a mobile app that is highly useful to entrepreneurs.
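
To make that concrete, here is a minimal sketch of pulling county-level population figures from the Census Bureau’s Data API, the kind of location-tagged data you could hand to a mapping service. The endpoint, year, and variable code (B01003_001E, total population) are illustrative and should be checked against the current Census API documentation.

```python
# A minimal sketch, assuming the Census Bureau's ACS 5-year Data API.
# The variable code B01003_001E (total population) and the endpoint
# should be verified against current Census documentation.
import requests

CENSUS_URL = "https://api.census.gov/data/2021/acs/acs5"

def county_populations(state_fips="24"):
    """Fetch the total population of every county in one state."""
    params = {
        "get": "NAME,B01003_001E",     # county name + total population
        "for": "county:*",             # every county...
        "in": f"state:{state_fips}",   # ...within this state (24 = Maryland)
        # "key": "YOUR_CENSUS_API_KEY",  # free key; optional for light use
    }
    resp = requests.get(CENSUS_URL, params=params, timeout=30)
    resp.raise_for_status()
    header, *rows = resp.json()        # first row is the column header
    return [dict(zip(header, row)) for row in rows]

if __name__ == "__main__":
    for county in county_populations()[:5]:
        print(county["NAME"], county["B01003_001E"])
```

Each record comes back with state and county FIPS codes, which a third-party mapping service can join against county boundaries to put the numbers on a map.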

However, most APIs just allow access to Federal government data assets that have already been configured for easy delivery. Federal government agencies have vast stores of data that need labor-intensive and expensive cleaning and structuring before the data is usable and reliable. What can Federal agencies do to increase the number of APIs offered to the public? One solution might be adopting “PhD APIs.”

PhD APIs are so named because they use artificial intelligence and deep learning to analyze unstructured raw data and transform it into API-friendly data streams. As one article explains, using PhD APIs is like hiring a team of data scientists to analyze and organize the data using the latest data science methods. More vendors now offer the ability to embed artificial intelligence into applications, such as understanding everyday spoken queries or analyzing images.
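
To illustrate the idea, here is a hedged sketch of what that transformation might look like, using the open-source spaCy library as a stand-in for a vendor’s machine-learning service: raw text goes in, and an API-friendly JSON payload of labeled entities comes out. The model name and the output schema are my own illustrative choices, not any particular vendor’s product.

```python
# A hedged illustration of the "PhD API" idea: unstructured text in,
# API-friendly JSON out. spaCy stands in for a vendor's ML service;
# install the model first: python -m spacy download en_core_web_sm
import json
import spacy

nlp = spacy.load("en_core_web_sm")

def extract_entities(raw_text: str) -> str:
    """Turn free text into a structured payload an API could serve."""
    doc = nlp(raw_text)
    payload = [{"text": ent.text, "label": ent.label_} for ent in doc.ents]
    return json.dumps(payload, indent=2)

print(extract_entities(
    "The U.S. Patent and Trademark Office opened a satellite office "
    "in Denver in 2014."
))
```

A production PhD API would wrap far more sophisticated models, but the contract is the same: messy input, structured output.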

Although PhD APIs offer great capabilities, there are still two fundamental issues to consider, as the Centrelink controversy illustrates. First, what type of artificial intelligence and deep learning algorithms does the PhD API use? Are those algorithms not only reliable but also appropriate to the data sets they are applied to? The second issue is the quality of the Federal data set itself. How was the data collected and verified? No matter how sophisticated the PhD API’s data science methods are, the classic “Garbage In, Garbage Out” principle still applies.
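
Neither issue requires exotic tooling to start addressing. As a hedged illustration of the “Garbage In, Garbage Out” point, the sketch below runs a few basic quality checks (missing values and duplicate records) before any algorithm ever sees the data; the column names and the 5% threshold are hypothetical.

```python
# A sketch of basic "Garbage In, Garbage Out" guards: flag obvious
# data-quality problems for a human reviewer before any algorithm runs.
# The column names and the 5% missing-data threshold are hypothetical.
import pandas as pd

def quality_report(df: pd.DataFrame, max_missing: float = 0.05) -> list[str]:
    """Return human-readable warnings instead of silently passing data on."""
    warnings = []
    for col, frac in df.isna().mean().items():   # fraction missing per column
        if frac > max_missing:
            warnings.append(f"{col}: {frac:.0%} missing (limit {max_missing:.0%})")
    dupes = int(df.duplicated().sum())           # exact duplicate records
    if dupes:
        warnings.append(f"{dupes} duplicate record(s)")
    return warnings

records = pd.DataFrame({
    "reported_income": [52000, None, 48000, 48000],
    "zip_code": ["20231", "20231", None, None],
})
for warning in quality_report(records):
    print("WARNING:", warning)
```

In a real pipeline, warnings like these would route a case to a person rather than straight to an automated decision, which is exactly the kind of human oversight Mr. Shetler calls for.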

PhD APIs could play a vital role in digitally transforming government. Even so, human oversight is just as vital to ensuring that digital transformation efforts succeed in making government more effective and efficient for the American public.


Each week, The Data Briefing showcases the latest federal data news and trends. Visit this blog every week to learn how data is transforming government and improving government services for the American people. If you have ideas for a topic or have questions about government data, please contact me via email.

Dr. William Brantley is the Training Administrator for the U.S. Patent and Trademark Office’s Global Intellectual Property Academy. You can find out more about his personal work in open data, analytics, and related topics at BillBrantley.com. All opinions are his own and do not reflect the opinions of the USPTO or GSA.