This Week in Data Preparation (Mar 13, 2020)

Nikolaos Konstantinou
The Data Value Factory
Mar 12, 2020

“This Week in Data Preparation” is a blog post series with links to news from the data preparation market. This is what has happened since last week’s blog post. In short, you will find twelve links in this week’s article: two commentaries on market survey results, three opinion articles, two announcements on the availability of new software, two announcements on partnerships, an announcement of capital raising, a brief history of data quality, and a video tutorial.

Image by xresch from Pixabay

According to a new joint study from Cleo and Dimensional Research, 64 Percent of Enterprises are Planning Data Integration Upgrades. “30% of businesses surveyed estimate losses of up to $1 million each year due to integration-related issues, and nearly half report partner onboarding time spanning a month or more”, said Frank Kenney, Director of Market Strategy at Cleo, in a statement to the Solutions Review magazine.

3 most common data preparation challenges — and how to solve them. Steve Philpotts, General Manager, Data Quality & Targeting, at Experian AUS, comments on the outcomes of the latest “2020 Global data management research” study by Experian. He mentions that the study “shows the negative impact [bad data] has on organizations by wasting resources and incurring additional costs, damaging the reliability of analytics and negatively affecting the customer experience”. “Reaching and maintaining 100 percent accuracy is somewhat unlikely, but it’s entirely possible to come close, and it’s something we should all be striving for”, he continues.

Nipun Agarwal, vice president of research at Oracle Labs, discusses how AI can take the drudgery out of tuning machine-learning models, in this Forbes article.

Alex Kwiatkowski, principal industry consultant, global banking practice, at SAS, discusses monetizing data while complying with regulations in finance, in this KMWorld article.

Data science vs. machine learning: What’s the difference? JP Baritugo, director at business transformation and outsourcing consultancy Pace Harmon, offers food for thought in this article for the Enterprisers Project news site.

Google launches Cloud AI Platform Pipelines in beta, to simplify machine learning development. The service is designed to deploy AI pipelines along with monitoring, auditing, version tracking, and reproducibility in the cloud.

Talend announced in a press release the availability of Talend Cloud in Microsoft Azure Marketplace.

iFarm, in cooperation with Poteha Labs and developers from Catalyst-Team, has launched a Telegram bot for identifying deviations in crop growth in vertical farms. “Data preparation is an important stage of every research and development in deep learning”, said Sergey Kolesnikov from Catalyst-Team, in this Horti Daily news article.

Greenbird, the IT & Services company, is joining forces with a new partner, Enersis. Greenbird’s data integration capabilities, combined with Enersis’ Digital Twin and Data Analytics expertise, are intended to accelerate digital transformation in the energy sector. “Data is fueling the energy transition. But it has to be good data. Real time data. Accurate, complete and actual data”, says Thorsten Heller, CEO of Greenbird.

BackboneAI, a startup providing a data automation platform for enterprises, raises $4.7 million to unify disparate enterprise data sets with AI. CEO Rob Bailey says the round will be used to scale the company’s product for data collection within and among organizations, which he believes could improve data science team productivity by leveraging AI to integrate data from a range of sources.

A Brief History of Data Quality. An interesting read from Dataversity, a producer of educational resources for business and Information Technology (IT) professionals on the uses and management of data.

This video shows data preparation and cleaning of a coronavirus dataset using Python.

Thank you for taking the time to read our weekly blog posts on data preparation.
