Data Over Matter — Innovating the Next Generation of Data Products
Feb 9, 2017
In this talk, you’ll learn about data science at Netflix. Specifically:
- data infrastructure and tools at Netflix
- A/B testing at Netflix
- Netflix’s culture of innovation
Talk Structure:
- Evolution of Netflix
- Data at Netflix
- Data Over Matter
Meet the Speaker: Eli Bressert
“Love big ideas and putting them into action by connecting the impossible. Innovating at Netflix through all things data. Singularity University mentor. If you want to see what I’m up to check out astrobiased.com or follow me on Twitter.” — Eli
Notes:
*Note: Eli’s Slides Are Here.
- Netflix Timeline:
1999 DVD distribution => 2007 Streaming => 2012 Originals => 2016 Global
- Netflix Culture Slide Deck
- Data Tools / Ecosystem:
- Python and some R
- Hive
- Spark
- Presto
- Teradata
- Sting
- Cassandra
- Aegisthus, https://github.com/Netflix/aegisthus
- Druid
- Kafka
- Amazon S3
- Experimentation, not opinion
- Evolution of user experience in the UI (design, image optimization,
- A/B testing at Netflix:
- Members (discovery algorithms, design/ui, product functionality, messaging, streaming quality)
- Non-members (sign-up flow, messaging, marketing)
- Architecture Overview:
- ABlaze (evaluating allocations for AB tests) => Allocations Services => Kafka & Hadoop => Presto & Druid => Ignite
- Batch Allocation:
- Analyst manually creates a table with a list of users ahead of time
- The A/B system simply allocates from this pool of users
- Real Time Allocation:
- User participation criteria are defined in UI
- Allocate on the fly if a user meets those criteria
- Maximizing Value of Data for Product
- Follow: @netflixData
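
To make the batch and real-time allocation notes above more concrete, here is a minimal Python sketch of the two modes. It is an illustration only, not Netflix's implementation: the class names, the `assign_cell` helper, and the hash-based cell assignment are all assumptions introduced for this example, since the talk notes do not say how users are mapped to test cells.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Optional


def assign_cell(test_id: str, user_id: str, num_cells: int) -> int:
    """Map a user to a test cell deterministically (assumed hashing scheme)."""
    digest = hashlib.sha256(f"{test_id}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % num_cells


@dataclass
class BatchAllocator:
    """Batch mode: allocate only from a user list prepared ahead of time."""
    test_id: str
    num_cells: int
    pool: set  # user ids the analyst put in the table beforehand

    def allocate(self, user_id: str) -> Optional[int]:
        # Users outside the pre-built pool are never allocated.
        if user_id not in self.pool:
            return None
        return assign_cell(self.test_id, user_id, self.num_cells)


@dataclass
class RealTimeAllocator:
    """Real-time mode: allocate on the fly when a user meets the criteria."""
    test_id: str
    num_cells: int
    criteria: Callable[[dict], bool]  # hypothetical participation rule
    allocations: dict = field(default_factory=dict)

    def allocate(self, user: dict) -> Optional[int]:
        user_id = user["id"]
        # Keep earlier allocations stable even if the user's attributes change.
        if user_id in self.allocations:
            return self.allocations[user_id]
        if not self.criteria(user):
            return None
        cell = assign_cell(self.test_id, user_id, self.num_cells)
        self.allocations[user_id] = cell
        return cell
```

In this sketch the batch allocator needs the pool to exist before the test starts, while the real-time allocator only needs a predicate (for example `lambda u: u.get("country") == "US"`) and evaluates it whenever a user shows up.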