Keeping up With Data — Week 8 Reading List
5 minutes for 5 hours’ worth of reading
This week, two of my friends asked me for feedback on presentations they are preparing. I’m always grateful when that happens (I take it as a compliment), and I try to give my undivided attention and honest, constructive feedback. Both of them are highly knowledgeable in their fields and so have plenty to talk about. In such situations, it’s often hard to self-edit and pull all the threads into a focused storyline. Yet it’s essential that listeners can follow easily. I know this from my own experience of trying to share valuable insights about data science in 60–90 second videos. It’s this ability that typically takes a domain expert from good to great. I’m very much looking forward to the final versions of their talks.
- Visualizing Data Timeliness at Airbnb: Data is used in decision-making processes as well as in automated operations. Therefore, it needs to be available on time. Many companies now implement SLAs on individual processing steps to ensure that everything runs smoothly. But what if it doesn’t? Are any datasets late? Why is a given dataset late? Did it start late, or run too long? Airbnb engineers built a monitoring system to answer these questions for the complex data flows of this data-heavy organisation. (Chris C Williams @ Medium)
- Why Is It So Hard to Become a Data-Driven Company? Companies need to be data-driven in order to succeed, so many companies, big and small, are investing heavily in data and AI initiatives. However, the leading metrics of success for data and AI seem to be declining. How to prevent that? The article provides three recommendations: (1) focus data initiatives on clearly identified high-impact business problems; (2) think about data as a business asset along the whole value chain; and (3) remember that data-driven transformation requires a culture shift and therefore takes time. (HBR)
- The Growing Importance of Metadata Management Systems: Data infrastructure is often treated as a collection of disposable components. This approach has many advantages, but it also creates a need to collect, unify, prepare, and manage data from diverse sources and formats. That’s why the metadata and governance stack has seen a lot of new solutions lately. The future is likely to belong to end-to-end data governance solutions that cover source systems, data warehouses, data management systems, and data pipelines. (Gradient Flow)
When reading the HBR article summarising the Big Data and AI Executive Survey 2021, I was shocked to learn that despite 65% of companies having a Chief Data Officer and 76% of the respondents being CDOs or Chief Analytics Officers, only 30% reported having developed a well-articulated data strategy. I find it strange to be responsible for data and not have a data strategy. Imagine a CEO without a business strategy!