💡Embedding learning for DLRM
What is embedding learning?
Jan 27, 2022
- Embedding learning is a technique used in deep learning recommendation models (DLRMs) to map categorical features to dense vectors.
- DLRMs often require extremely large embedding tables, which in turn demand high memory capacity and memory bandwidth. To tackle this problem, distributed training solutions partition the embedding tables across multiple devices.
- However, because the embedding tables have diverse characteristics, careless partitioning easily leads to load imbalance across devices and thus to poor training efficiency.
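The core mapping from categorical features to dense vectors is a table lookup. A minimal NumPy sketch (vocabulary size, embedding dimension, and the `embed` helper are illustrative, not from any particular DLRM implementation):

```python
import numpy as np

# Hypothetical sizes for one sparse feature (illustrative values only).
NUM_CATEGORIES = 1000   # number of distinct categorical values
EMBED_DIM = 16          # dense vector size per category

rng = np.random.default_rng(0)
# The embedding table: one trainable row per categorical value.
embedding_table = rng.normal(size=(NUM_CATEGORIES, EMBED_DIM)).astype(np.float32)

def embed(ids):
    """Map a batch of categorical IDs to dense vectors via row lookup."""
    return embedding_table[ids]

batch = np.array([3, 42, 999])   # a batch of categorical feature values
dense = embed(batch)
print(dense.shape)               # one 16-dim vector per input ID
```

At industrial scale, tables like this hold hundreds of millions of rows each, which is why their combined footprint dominates model capacity.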
What are the challenges?
- Embedding memory footprints are on the order of terabytes and account for over 99% of total model capacity. Embedding memory capacity and bandwidth demands for DLRMs have been growing super-linearly, exceeding the memory available on typical training hardware.
- Embeddings act as hash tables for DLRMs. As a result, they exhibit distinctive access patterns, such as the birthday paradox.
- The challenge is to devise a sharding strategy for a set of embeddings based on model training data distributions and underlying memory characteristics.
- An embedding planner places embedding tables across devices to maximize training throughput (QPS) and balance load across the embedding shards.
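A sharding planner of the kind described above can be sketched as greedy bin-packing: place the most expensive tables first, always onto the least-loaded device, with a cost that mixes memory footprint and lookup rate. This is a hypothetical cost model for illustration, not RecShard's actual algorithm, and the table statistics below are invented:

```python
import heapq

# Hypothetical per-table statistics: (name, size_gb, lookups_per_batch).
# In practice these would come from profiling the training data distribution.
tables = [
    ("user_id", 120.0, 1.0),
    ("item_id", 80.0, 30.0),   # multi-hot feature: many lookups per sample
    ("category", 0.5, 30.0),
    ("geo", 2.0, 1.0),
    ("ad_id", 60.0, 5.0),
]

def plan(tables, num_devices, bw_weight=1.0):
    """Greedily assign each table to the currently least-loaded device.

    The per-table cost combines memory footprint and lookup rate so that
    neither capacity nor bandwidth demand concentrates on one device.
    """
    # Place the most expensive tables first (standard greedy heuristic).
    order = sorted(tables, key=lambda t: -(t[1] + bw_weight * t[2]))
    heap = [(0.0, d, []) for d in range(num_devices)]  # (load, device, tables)
    heapq.heapify(heap)
    for name, size, lookups in order:
        load, dev, assigned = heapq.heappop(heap)  # least-loaded device
        assigned.append(name)
        heapq.heappush(heap, (load + size + bw_weight * lookups, dev, assigned))
    return {dev: assigned for _, dev, assigned in heap}

placement = plan(tables, num_devices=2)
print(placement)
```

A real planner such as RecShard goes further, e.g. splitting individual tables by row-access frequency across memory tiers, but the load-balancing objective is the same.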
References:
RecShard: Statistical Feature-Based Memory Optimization for Industry-Scale Neural Recommendation