This Week in Data Preparation (September 7, 2020)

Nikolaos Konstantinou
The Data Value Factory
Sep 7, 2020

This weekly post with news items from the data preparation market is brought to you by The Data Value Factory, the company offering Data Preparer.

13 links in this week’s post: 4 articles (on modern data teams, DataOps, data analytics, and customer data, by Matillion, Objectivity, Exasol, and Data Ladder), 2 interviews (on explainable AI and creating data-driven applications, by Arize AI, Anaconda, and Intel), 1 tutorial on data prep for Machine Learning by Microsoft Research, 4 company updates (by Sisu, Rapid Insight, Siemens’ Mendix, and Zoopla), 1 partnership announcement (by Boomi and Solace), and 1 capital raise announcement (by Headset).

The Data Value Factory: This Week in Data Preparation, September 2020. Image by Tumisu from Pixabay.

What the Modern Data Team Looks Like and Where It’s Headed. “As business and IT data tasks increasingly overlap, the ultimate goal of the data team will be to create a strategic vision for data and provide the self-service access to insights that help the company achieve that vision,” explains Ed Thompson, CTO and co-founder of Matillion.

A guide to DataOps: Enabling the true speed of your data. “Data can be one of your most precious assets, therefore, it should always be processed and presented in line with the highest standards of quality. Applying a DataOps approach to your processes, tools, and employee competences can help you to achieve those standards. Moreover, with DataOps, you’ll gain access to crucial information quickly and efficiently, allowing you to use your data right when you need it, and before it becomes outdated,” says Julia Orłowska, BI Practice Leader with a strong technical background in Objectivity’s Data & AI department.

Going Beyond Data-Driven: The Three Pillars of Data Analytics. “The push for digital transformation is nothing new. Yet the accelerated adoption of digital transformation inspired by the coronavirus pandemic is unlike anything we have ever seen before. Companies have been forced to quickly ramp up their digital strategies in order to survive in our new world of virtual business. Those that could not quickly pivot and reset their business strategy did not survive,” says Eva Murray, Technology Evangelist for Exasol.

Do You Have Customer Data You Can Trust? “To stay ahead of the curve, to find hidden opportunities, to deliver on digital transformation initiatives, to hyper-personalize, to create unique experiences, you need accurate, complete, timely, unique, customer data,” says Farah Kim, Product Marketing Specialist at Data Ladder.

Arize AI Helps Us Understand How AI Works. The lack of explainability and observability prevents AI and ML from being trusted tools in critical areas of human activity, such as getting approved for a loan or undergoing procedures to diagnose a disease. Jason Lopatecki and Aparna Dhinakaran came to understand the importance of AI and ML explainability and observability through their prior professional experience, and created Arize AI as the tool they never had.

How to Deliver on Your Data Strategy. Increasingly, Python — by virtue of its ease of use and powerful automation capabilities — is being used to speed the creation and deployment of data-driven applications. However, a limiting factor is that such open-source tools may not meet the performance, security, and replicability demands of production business environments. To sort through these and other issues, RTInsights sat down with Stanley Seibert, Director, Community Innovation, at Anaconda; and Heidi Pan, Director, Data Analytics Software, at Intel, and discussed the challenges data scientists face, why they’re looking to Python for help, and the need for enterprise-class features and support.

Data Prep for Machine Learning: Splitting. Dr. James McCaffrey of Microsoft Research explains how to programmatically split a file of data into a training file and a test file for use with a machine learning neural network, in scenarios such as predicting voting behavior from a file of data about people (sex, age, income, and so on).
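
For readers who have not seen this kind of split before, here is a minimal sketch of the general idea in Python. It is not taken from the tutorial: the function names, the 80/20 ratio, the fixed random seed, and the assumption that the source file has one record per line with no header row are all illustrative choices.

```python
import random


def split_lines(lines, train_fraction=0.8, seed=0):
    """Shuffle a copy of the lines and return (train, test) lists.

    train_fraction and seed are illustrative defaults, not values
    taken from the tutorial.
    """
    shuffled = list(lines)
    # A seeded generator makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def split_file(src_path, train_path, test_path, train_fraction=0.8, seed=0):
    """Split a text file (one record per line, no header) into two files."""
    with open(src_path, encoding="utf-8") as src:
        # Skip blank lines and normalize line endings so that records
        # cannot run together after shuffling.
        lines = [ln.rstrip("\n") + "\n" for ln in src if ln.strip()]
    train, test = split_lines(lines, train_fraction, seed)
    with open(train_path, "w", encoding="utf-8") as out:
        out.writelines(train)
    with open(test_path, "w", encoding="utf-8") as out:
        out.writelines(test)
    return len(train), len(test)
```

Calling `split_file("people.txt", "people_train.txt", "people_test.txt")` (hypothetical file names) would write roughly 80% of the records to the training file and the rest to the test file. Shuffling before the cut matters when the source file is sorted, for example by age or income, because otherwise the test file would not be representative.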

Sisu Adds New Tools to Augment Data Analytics Workflows. Sisu Data has announced two new ways to augment data preparation: a shared query repository and an Athena connector for Amazon S3 data. This product expansion is part of Sisu’s focus on augmenting every part of the analytic workflow. The new capabilities were announced in a Sisu blog post by Davide Russo, product manager of Sisu.

Rapid Insight Launches Free “Classroom Edition” of its Analytics Software. “The Construct and Predict tools allow my students to gain hands-on experience in ETL, Data Mining and Predictive Analytics,” said John Whitehouse, Adjunct Instructor at Elizabethtown College. Speaking about the need for the Classroom Edition, Rapid Insight Founder and President Mike Laracy said, “Instructors tell us they need a free, easy-to-teach tool for their students to use when learning the practical side of data science. We’re happy to provide our software not just to institutions who are customers, but to any instructor who needs it.”

Siemens leverages Mendix low code to build personalized data solutions. Siemens Digital Industries Software has reportedly used the Mendix low-code application development platform to help clients develop personalized, contextual solutions that support data-driven decision making. According to Derek Roos, the CEO of Mendix, the company aims to give customers technology that lets them develop applications faster and more efficiently.

Zoopla drives KPIs with centralized data using Fivetran ELT for Amazon Redshift. “By centralizing data into the existing Amazon Redshift data warehouse, using Fivetran to automate data ingestion, and building dashboards with Power BI we’ve created a consistent and efficient analytics process. It’s saved our team time, and made sure we’re able to continue to deliver valuable insight to our stakeholders,” says Steven Collings, Senior Data Consultant at Zoopla.

Boomi and Solace partner to simplify enterprise integration modernization. Boomi, a Dell Technologies business and leading provider of cloud-based integration platform as a service (iPaaS), and Solace, a leading provider of event streaming and management capabilities, have announced PubSub+ Connector for Boomi. “In this era of uncertainty, it’s essential organizations, small or large, supplement their workforce with automated digital processes that empower them to make decisions and changes overnight, if not sooner,” said David Irecki, Director of Solutions Consulting Asia-Pacific and Japan (APJ) at Boomi.

Headset Raises $3.2M to Expand Its Leading Data Platform Into New Markets. Headset, the leading provider of data and analytics to the cannabis industry, announced it has raised $3.2 million in a bridge round from existing investors, led by Canopy Rivers with participation from Poseidon Asset Management and others. “We’re grateful to our investors for their continued confidence in our vision,” said Cy Scott, Founder and CEO of Headset. “Headset has been a true innovator in this space, working diligently to provide customers with a platform that is essential to staying competitive,” said Narbé Alexandrian, President and CEO at Canopy Rivers.

The Data Value Factory. A week’s worth of manual data preparation in minutes.

Thank you for reading our weekly post with news items from the data preparation market. Have you tried Data Preparer?
