Open Sourcing Transformer Embeddings

Photo by Jr Korpa on Unsplash

You should use this if you want to…

  • Automatically apply tokenization (with the model defaults) before your model’s forward pass.
  • Stack outputs from the model into a single, iterable array that maps 1:1 to your inputs.
  • Simplify interactions with any transformer model available on the HuggingFace Model Hub for exploration and inference.
  • Easily apply and compare the impact of different pooling strategies (mean, max, min, pooler) on your downstream tasks.
  • Use your model on CPUs or GPUs, without worrying about whether you asked PyTorch to use the right device.
  • Export the model and additional artifacts (custom scikit-learn / tree-based models, model cards, etc.) to S3.
  • Customize batch sizes for different models as you play with them.
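
To make the pooling and 1:1 mapping ideas above concrete, here is a minimal sketch in plain Python. It is not the library's API: the function names `pool` and `pool_batch`, the toy vectors, and the masks are all made up for illustration. It assumes each input has already been turned into per-token vectors plus an attention mask (1 for real tokens, 0 for padding). The "pooler" strategy is left out because it relies on a model-specific output layer rather than simple arithmetic over token vectors.

```python
from typing import List, Sequence


def pool(token_vectors: Sequence[Sequence[float]],
         attention_mask: Sequence[int],
         strategy: str = "mean") -> List[float]:
    """Collapse per-token vectors into one vector, ignoring padded positions.

    Hypothetical helper for illustration only, not part of any library.
    """
    # Keep only the vectors whose mask entry is 1 (real tokens, not padding).
    kept = [vec for vec, m in zip(token_vectors, attention_mask) if m == 1]
    if not kept:
        raise ValueError("attention_mask excludes every token")

    # Transpose so each tuple holds one embedding dimension across tokens.
    columns = list(zip(*kept))

    if strategy == "mean":
        return [sum(col) / len(col) for col in columns]
    if strategy == "max":
        return [max(col) for col in columns]
    if strategy == "min":
        return [min(col) for col in columns]
    raise ValueError(f"unknown pooling strategy: {strategy!r}")


def pool_batch(batch_vectors, batch_masks, strategy="mean"):
    """Pool each sequence in order, so output i corresponds to input i."""
    return [pool(v, m, strategy) for v, m in zip(batch_vectors, batch_masks)]


# Toy batch (made-up numbers): two sequences, three token slots,
# 2-dimensional embeddings. The last slot of the first sequence is padding.
vectors = [
    [[1.0, 4.0], [3.0, 0.0], [9.0, 9.0]],
    [[2.0, 2.0], [4.0, 6.0], [6.0, 1.0]],
]
masks = [[1, 1, 0], [1, 1, 1]]

mean_pooled = pool_batch(vectors, masks, "mean")
max_pooled = pool_batch(vectors, masks, "max")
min_pooled = pool_batch(vectors, masks, "min")
```

Each result list has one vector per input, in the same order, which is the 1:1 mapping described above. Masking matters most for mean pooling: without it, the padded slot in the first sequence would pull the average toward values that never belonged to the text.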

You should not use this if you want to…

  • Fine-tune the underlying embedding models or train new models. (We recommend HF transformers or sentence-transformers as alternatives.)
  • Use TensorFlow / JAX for your deep learning models.

Questions or suggestions?

Email us at


