This Week in Data Preparation (10–1–2020)

Nikolaos Konstantinou
The Data Value Factory
3 min read · Jan 10, 2020

It would be reasonable to assume that not much interesting news would surface during the Christmas holidays. However, this week’s blog post includes several links to articles attempting to predict what 2020 will hold, interesting opinion articles, and news from the data preparation universe, the most notable of which is the California Consumer Privacy Act (CCPA), which took effect on January 1st.

Image by Gerd Altmann from Pixabay

Gary Read (CEO of import.io) shares his predictions on how industries and organizations will be utilizing alternative web data in 2020. In the article, Read looks at several themes, including the benefits of web data adoption for the real estate industry, travel organizations, and educational and political organizations.

More predictions on what 2020 will hold can be found in this article by Forbes, this article by the chief marketing officer at Denodo, and this article by CIODive.

This interesting article argues that data governance for self-service analytics should be prioritized to enable citizen data scientists to produce advanced analyses. To this end, it asserts, organizations need to focus on the following data management best practices:

  • Balancing data quality and timeliness. According to Emily Washington, executive vice president of product management at Infogix, “As organizations continue to push the limits with data storage and processing, we see data quality as the underlying theme to ensure they’re leveraging data they can trust.”
  • Privacy-aware data governance for self-service analytics. “Governance should be a high priority,” said Jen Underwood, an independent consultant and former senior director at DataRobot.
  • Data discovery and data prep with augmented analytics. “Acknowledged as important glue to enterprise software, delivery of a common catalog for finding, provisioning, securing and understanding data and other objects is important to customers,” said Todd Wright, senior product marketing manager of data management and data privacy solutions at SAS.

Explainable AI and the rising role of Knowledge Scientists are discussed in this interview with Andreas Blumauer (CEO and Founder of PoolParty Software) for Forbes. Blumauer makes the case for Semantic AI, addressing the concern that many AI solutions work like black boxes and seem to magically generate insights without explanation.

The California Consumer Privacy Act (CCPA) is a state law intended to enhance privacy rights and consumer protection for residents of California. It is similar to, but slightly different from, the European Union’s GDPR, and has been in effect since January 1st. The big price tag that comes with it seems to have created a wave of startups aiming to help ensure compliance with the law.

On a related note, BigID, a company focusing on privacy-oriented data discovery, raised $50 million in new funding less than four months after raising a $50M Series C.

Still in the legal domain, this article discusses the perspectives of practitioners and patent owners on patenting AI and ML applications. We noted that data preparation comes first among the aspects it covers.
