2016 Spark Summit West

Joshua David Lickteig
{“Below”: “The_bit”}
Jun 9, 2016

This year’s Spark Summit West was held earlier this week in San Francisco, near Union Square and the Financial District. Apache Spark, which originated at UC Berkeley’s AMPLab, is a distributed computing and analysis framework; its 1.0 version was released two years ago.

The summit’s ballroom sessions converged on a set of rapidly emerging themes in data science and data processing infrastructure:

  • Focus on multimodal and multimodel systems
  • Scalable deep learning
  • Graph, metadata, & topology inference
  • Convergence of systems of record and systems of insights
  • Complex event processing (CEP)

Talks by key engineers relayed lessons on building simultaneity into solutions and on balancing speed against structure.

Alongside varied thinking on design patterns for distributed, low-latency data stores, several talks framed the concept of “continuous applications.”

Strategies for multi-format and SQL-like aggregation of data, as well as testable approaches to dynamically partitioning skewed data, have matured.
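For readers unfamiliar with the skew problem, here is a minimal sketch in plain Python (not Spark code, and not taken from any particular talk). It assumes one common technique, key salting: keys that dominate the data get a random suffix so their records spread over several partitions, and a second pass merges the partial results. The names `partition_of` and `salted_sum`, and the thresholds used, are hypothetical and chosen only for illustration.

```python
import random
import zlib
from collections import Counter, defaultdict


def partition_of(key, num_partitions):
    """Stable hash partitioner (a stand-in for a cluster's shuffle partitioner)."""
    return zlib.crc32(repr(key).encode("utf-8")) % num_partitions


def salted_sum(records, num_partitions=4, hot_fraction=0.25, num_salts=4, seed=0):
    """Sum values per key, spreading hot keys across partitions via salting."""
    rng = random.Random(seed)
    records = list(records)
    counts = Counter(key for key, _ in records)
    # Assumption: a key is "hot" if it holds more than hot_fraction of all records.
    hot = {k for k, c in counts.items() if c > hot_fraction * len(records)}

    # Stage 1: partial sums per (key, salt); only hot keys get a random salt.
    partials = [defaultdict(int) for _ in range(num_partitions)]
    loads = [0] * num_partitions  # records handled by each partition
    for key, value in records:
        salt = rng.randrange(num_salts) if key in hot else 0
        salted = (key, salt)
        p = partition_of(salted, num_partitions)
        partials[p][salted] += value
        loads[p] += 1

    # Stage 2: drop the salt and merge the partial sums into final totals.
    totals = defaultdict(int)
    for part in partials:
        for (key, _salt), partial in part.items():
            totals[key] += partial
    return dict(totals), loads


# One key ("a") accounts for 90% of the records.
records = [("a", 1)] * 90 + [("b", 1)] * 5 + [("c", 1)] * 5
totals, loads = salted_sum(records)
print(totals)  # per-key sums are unchanged by salting
print(loads)   # how many records each partition processed
```

The same sketch is easy to test: the totals must equal those of an unsalted aggregation, and the per-partition loads must add up to the number of input records.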

The next Spark Summit is in Brussels in October.

Spark 2.0 is in preview release. Also, the book ‘Spark GraphX in Action’ will be published this month by Manning.
