“The most exciting new frontier is charting what is already there”
Priyanka Oberoi, Data Scientist in the Commerce Data Service, digs into data to dig up valuable insights.
1. What is your current job title and what are your main responsibilities?
I am a Data Scientist in the Commerce Data Service, which is a public startup in the U.S. Department of Commerce. I talk to teams across DOC and try to see how data science can be used to answer a question, improve a process, or measure an outcome. Since we are a small team, my responsibilities span the data science pipeline. This is a pretty hands-on process where we try to understand a problem, leverage data, implement statistics, and build something that has an impact.
2. How do you use data science in your job? How has data transformed your agency?
The U.S. Department of Commerce is integrating data science in three major ways. We build data products for teams within the agency, we teach data science courses to DOC employees through the Data Academy, and we push for data usability so our open data gets used by researchers, the private sector and non-profits.
3. How many other data scientists do you work with?
There are two data scientists in the Commerce Data Service, under the Chief Data Scientist for the U.S. Department of Commerce.
“Bureaus and teams within the U.S. Department of Commerce often engage private sector companies.”
4. How have you seen data science improve outcomes in your department or team?
Bureaus and teams within the U.S. Department of Commerce often engage private sector companies. Therefore, one of the statistical models we built identifies businesses that have a high likelihood of engaging with DOC, in order to prioritize outreach efforts.
This model can be pivoted to target different kinds of businesses that the team in question is interested in. So far we have rolled this product out to two bureaus, one to identify businesses that are ready to begin exporting and one to identify manufacturers that could benefit from consulting services.
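As a rough illustration of this kind of likelihood scoring, the sketch below fits a small logistic regression and ranks candidate businesses by predicted probability of engagement. The interview does not say which model or features the team used. The logistic model, the two features, the business names, and all of the data here are hypothetical and exist only to show the idea.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Predicted probability of engagement for one business."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit logistic regression with plain batch gradient descent."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = predict(weights, bias, x) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def rank_for_outreach(candidates, weights, bias):
    """Return (name, probability) pairs, highest probability first."""
    scored = [(name, predict(weights, bias, x)) for name, x in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up history: [attended_an_event, has_website] -> engaged (1) or not (0).
history = [[1, 1], [1, 0], [0, 1], [0, 0], [1, 1], [0, 0]]
engaged = [1, 1, 0, 0, 1, 0]

weights, bias = fit_logistic(history, engaged)

# Made-up candidates to prioritize.
candidates = [("Acme", [1, 1]), ("Birch", [0, 0]), ("Cedar", [1, 0])]
ranked = rank_for_outreach(candidates, weights, bias)

for name, prob in ranked:
    print(f"{name}: {prob:.2f}")
```

Retargeting a model like this at a different kind of business would, under these assumptions, mean changing the label (what counts as "engaged") and the features, then refitting.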
5. What was your first job out of college?
Out of undergrad, my first job was at a healthcare nonprofit, where I managed and analyzed data from health insurance companies that provided plans for high-risk populations. After going back to school for my graduate degree, I became a consultant. My first project focused on integrating and surfacing the data collected by the DOD Suicide Prevention Office so it was easier to analyze.
“I got to see how making data available and easy to interpret by the people who use it can have a huge impact”
6. What were some key moments/jobs that led you to your current role?
As a data scientist at the U.S. Food and Drug Administration, I got to see how making data available and easy to interpret by the people who use it can have a huge impact and is often pivotal in finding the right points at which to implement a data science product.
This comment by Randall Munroe describes how I approach data: “The most exciting new frontier is charting what is already there.”
7. What are the 3 traits you would consider most important for a data scientist to possess?
Three traits I have found to be valuable as a Data Scientist are: First, being willing to go the long way around by finding the right statistical method, tuning your parameters, and iterating. Second, building something that answers the question being asked, rather than choosing a flashy methodology and working backwards from there to see what you can build. Finally, knowing when the data or the signal doesn’t support what you are trying to build.
“A strong skill set in statistics, math and coding is central to building good data science products.”
8. How would you recommend someone get into the field of data science?
There is a wide array of backgrounds and expertise that lead people into data science. I don’t know that there is one right way to enter the field, but a strong skill set in statistics, math, and coding is central to building good data science products.
9. What do you think is the future of open data?
Open methodology, especially for data science products that use statistical models and predictive analytics. Transparency around the cleaning, preprocessing, and modeling methods is valuable for the robustness of the end data product, and it also contributes to the evolution of that data product, which can involve the user and research community.
“We need to be critical of the methods and implementation of models to make sure we are building good products.”
10. Please write any additional thoughts or comments below that you would like to share with our readers.
The application of data science to problems we are trying to solve isn’t an inherently objective process. Statistical models are not objective just because they are driven by statistics and data. This is because models are built by people and built on data collected by people. We need to be critical of the methods and implementation of models to make sure we are building good products.