Big data: 8 ideas to watch

A look at the major forces shaping the data world.

O'Reilly Radar
Nov 4, 2014 · 6 min read

Cognitive augmentation

The combination of big data, algorithms, and efficient user interfaces can be seen in consumer applications such as Waze or Google Now. Our interest in this topic stems from the many tools that democratize analytics and, in the process, empower domain experts and business analysts. In particular, novel visual interfaces are opening up new data sources and data types.

  • Palantir and Quid use a combination of visualization, search, and analytics that enables domain experts to discover patterns hidden in large data sets.
  • StitchFix provides product recommendations by combining proprietary algorithms and expert stylists.
  • “Moving dots” (e.g., tracking data from athletics) are being analyzed by companies that specialize in spatio-temporal pattern recognition. Startup Second Spectrum provides analytics to coaches and front offices for many professional basketball teams. In the near future, its technology and recommendations will be available in real time to coaching staffs during in-game situations.

Intelligence matters: Artificial intelligence and algorithms

Bring up the topic of algorithms, and a discussion on recent developments in artificial intelligence (AI) is sure to follow. AI is the subject of an ongoing series of posts on O’Reilly Radar. The “unreasonable effectiveness of data” notwithstanding, algorithms remain an important area of innovation. We’re excited about the broadening adoption of algorithms like deep learning, and topics like feature engineering, gradient boosting, and active learning. As intelligent systems become common, security and privacy become critical. We’re interested in efforts to make machine learning secure in adversarial environments.

The convergence of cheap sensors, fast networks, and distributed computation

The Internet of Things (IoT) will require systems that can process and unlock massive amounts of event data. These systems will draw from analytic platforms developed for monitoring IT operations. Beyond data management, we’re following recent developments in streaming analytics and the analysis of large numbers of time series.
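As a rough illustration of what streaming analytics over many time series can look like, here is a minimal sketch that keeps a sliding-window mean per sensor as events arrive. The names `SlidingWindowMean` and `process`, the `(sensor_id, value)` event shape, and the window size are all assumptions made for this example; production systems typically rely on a dedicated stream-processing platform rather than in-memory code like this.

```python
from collections import defaultdict, deque


class SlidingWindowMean:
    """Keeps the last `window` readings per sensor and reports their mean.

    Hypothetical helper for illustration only.
    """

    def __init__(self, window=3):
        self.window = window
        # One bounded buffer per sensor; old readings fall off automatically.
        self.readings = defaultdict(lambda: deque(maxlen=window))

    def update(self, sensor_id, value):
        """Record one reading and return the current windowed mean."""
        buf = self.readings[sensor_id]
        buf.append(value)
        return sum(buf) / len(buf)


def process(events, window=3):
    """Consume (sensor_id, value) events in arrival order.

    Returns a list of (sensor_id, windowed_mean) pairs, one per event.
    """
    agg = SlidingWindowMean(window)
    return [(sensor_id, agg.update(sensor_id, value)) for sensor_id, value in events]
```

Each sensor gets its own bounded buffer, so memory stays proportional to the number of sensors times the window size, regardless of how long the stream runs.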

Data (science) pipelines

Analytic projects involve a series of steps that often require different tools. A growing number of companies and open source projects integrate a variety of analytic tools into coherent user interfaces and packages. Many of these integrated tools enable replication, collaboration, and deployment. This remains an active area, as specialized tools rush to broaden their coverage of analytic pipelines.
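To make the idea of a pipeline concrete, the sketch below chains three steps (cleaning, feature creation, summarizing) behind a single entry point. The step names, the `run_pipeline` helper, and the record layout are hypothetical and chosen only for illustration; real projects would usually hand each step to a specialized tool.

```python
from functools import reduce


def clean(rows):
    """Drop records with a missing value."""
    return [r for r in rows if r.get("value") is not None]


def add_features(rows):
    """Add a derived column to each record."""
    return [{**r, "value_squared": r["value"] ** 2} for r in rows]


def summarize(rows):
    """Reduce the records to a small summary."""
    values = [r["value"] for r in rows]
    mean = sum(values) / len(values) if values else None
    return {"count": len(values), "mean": mean}


def run_pipeline(data, steps):
    """Feed the output of each step into the next, in order."""
    return reduce(lambda acc, step: step(acc), steps, data)


if __name__ == "__main__":
    rows = [{"value": 2}, {"value": None}, {"value": 4}]
    print(run_pipeline(rows, [clean, add_features, summarize]))
```

Because every step takes and returns plain data, the same list of steps can be rerun on new input, which is the property that makes replication and collaboration easier.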

Evolving, maturing marketplace of big data components

Many popular components in the big data ecosystem are open source. As such, many companies build their data infrastructure and products by assembling components like Spark, Kafka, Cassandra, and ElasticSearch, among others. Contrast that with a few years ago, when many of these components weren’t ready (or didn’t exist) and companies built similar technologies from scratch. But companies are interested in applications and analytic platforms, not individual components. To that end, demand is high for data engineers and architects who are skilled in assembling these components and in maintaining robust data flows and data storage.

Data scientists, design, and social science

To be clear, data analysts have always drawn from social science (e.g., surveys, psychometrics) and design. We are, however, noticing that many more data scientists are expanding their collaborations with product designers and social scientists.

Building a data culture

“Data-driven” organizations excel at using data to improve decision-making. It all starts with instrumentation. “If you can’t measure it, you can’t fix it,” says DJ Patil, VP of product at RelateIQ. In addition, developments in distributed computing over the past decade have given rise to a group of (mostly technology) companies that excel in building data products. In many instances, data products evolve in stages (starting with a “minimum viable product”) and are built by cross-functional teams that embrace alternative analysis techniques.

  • Just Enough Math is a video series that introduces mathematical concepts using business cases.
  • Lean Analytics: Acquire a data-driven mindset through 30 case studies.
  • Data Jujitsu: A primer on organizing teams and building data products.

Perils of big data

Every few months, there seems to be an article criticizing the hype surrounding big data. Dig deeper and you find that many of the criticisms point to poor analysis and highlight issues known to experienced data analysts. Our perspective is that issues such as privacy and the cultural impact of models are much more significant.

Best of O’Reilly Radar

Emerging technology insights and analysis from O’Reilly Media.
