Imhotep: Large Scale Analytics and Machine Learning at Indeed

Indeed Engineering
Apr 3, 2014

This talk was held on Wednesday, March 26, 2014.

To scale the building of decision trees on large amounts of Indeed job search data, we created a system called Imhotep. In addition to being a crucial tool for building these machine learning models, Imhotep has proven to be applicable to many different analytics problems. The core of Imhotep is a distributed system that manages the parallel execution of queries across a set of time-sharded inverted indices.
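To make the idea of time-sharded inverted indices concrete, here is a minimal sketch in Python. It is not Imhotep's actual code or API: the `Shard` class, the `count_over_range` function, the field names, and the sample documents are all hypothetical, and a thread pool in one process stands in for the machines of a real distributed system. Each shard holds an inverted index for one day of documents, and a query fans out to the shards in the requested time range and merges their partial counts.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from datetime import date


class Shard:
    """Toy inverted index over the documents from a single day (hypothetical)."""

    def __init__(self, day, docs):
        self.day = day
        self.num_docs = len(docs)
        # (field, term) -> list of local document ids containing that term
        self.postings = defaultdict(list)
        for doc_id, doc in enumerate(docs):
            for field, term in doc.items():
                self.postings[(field, term)].append(doc_id)

    def count(self, field, term):
        """Number of documents in this shard where field == term."""
        return len(self.postings.get((field, term), []))


def count_over_range(shards, start, end, field, term):
    """Sum per-shard counts for shards whose day falls in [start, end]."""
    selected = [s for s in shards if start <= s.day <= end]
    if not selected:
        return 0
    # Threads stand in for remote workers; each shard is queried independently.
    with ThreadPoolExecutor(max_workers=len(selected)) as pool:
        return sum(pool.map(lambda s: s.count(field, term), selected))


# Made-up sample data, one shard per day.
shards = [
    Shard(date(2014, 3, 24), [{"country": "us", "query": "java"},
                              {"country": "gb", "query": "nurse"}]),
    Shard(date(2014, 3, 25), [{"country": "us", "query": "nurse"}]),
    Shard(date(2014, 3, 26), [{"country": "us", "query": "java"},
                              {"country": "us", "query": "python"}]),
]

total = count_over_range(shards, date(2014, 3, 25), date(2014, 3, 26), "country", "us")
print(total)  # 3
```

Because each shard answers on its own, a time-range query only touches the shards it needs, and the per-shard work can run in parallel before a cheap merge step.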

This talk covers Imhotep’s primitive operations that allow us to build decision trees, drill into data, build graphs, and even execute SQL-like queries in IQL (Imhotep Query Language). We discuss what makes Imhotep fast, highly available, and fault tolerant.

Audio Description

The following video includes a descriptive audio track for this talk.

Speaker

Jeff Plaisance is a senior software engineer at Indeed.

Originally published at Indeed Engineering Blog.
