Our Investment in Hugging Face
Fostering the largest open source community in Natural Language Processing
A seminal moment in machine learning took place on Sept 30, 2012, when a convolutional neural network called AlexNet achieved groundbreaking results in the ImageNet competition. This kicked off a race of rapidly improving computer vision models, to the point where the technology outperformed humans on many tasks. These breakthroughs accelerated industries such as autonomous vehicles and consumer mobile applications, and created new multibillion-dollar opportunities around computing architectures for machine learning training and inference.
Natural Language Processing (NLP), another discipline of machine learning, has seemed to lag behind computer vision in progress. Recently, NLP may have had its “ImageNet” moment, with new transformer models (e.g., GPT-2 and BERT) shattering performance benchmarks. Weeks ago, Google announced its biggest update to search in years: the implementation of BERT neural networks to improve results. The market for bringing NLP to enterprises and consumers is incredibly exciting — very few professions deal with images (and consequently computer vision), but nearly all jobs work with text and language (and thus NLP).
Hugging Face was founded in 2016 with offices in New York and Paris. Just over a year ago, they created Transformers, which quickly became the most popular open-source library for developers and scientists building state-of-the-art natural language processing technologies. More than 1,000 companies are using Hugging Face’s technology in production across applications such as text classification, information extraction, summarization, sentiment analysis, text generation, and conversational artificial intelligence. The models and APIs are used by everyone from startups to some of the largest tech companies, with open source contributors spanning researchers from Google, Microsoft, Facebook, and many more.
The ethos of the AI community has always been similar to that of the open source community, but we had yet to see a startup at this intersection really stand out until we talked to the Hugging Face team. They’ve won over the open source community with more than 1 million installs, ~19,000 GitHub stars, and many use cases in production. Hugging Face is one of the fastest-growing open-source projects we have ever seen, and we are thrilled to lead their $15M Series A round, where I will be joining their board. We are excited to partner alongside co-investors including A.Capital, Betaworks, Richard Socher (chief scientist at Salesforce), Greg Brockman (co-founder & CTO of OpenAI), Kevin Durant, and many other great angels.
Models are changing quickly, becoming more complicated, and an increasing burden to manage and deploy. Hugging Face provides production-ready pre-trained Transformer models, saving days to weeks of time and hundreds of thousands of dollars in compute resources; a programming interface is provided in both PyTorch and TensorFlow to fine-tune models for different applications in production (e.g., text classification and generation). Today there are 30+ pre-trained models in 100+ languages, growing by the week.
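To give a flavor of what this looks like in practice, here is a minimal sketch of loading one of the library’s pre-trained models for sentiment analysis. It assumes the `transformers` package is installed; the example sentence is illustrative, and the first call downloads model weights over the network.

```python
from transformers import pipeline  # assumes the transformers package is installed

# Load a pre-trained sentiment-analysis model behind a simple, high-level interface.
# The underlying model weights are downloaded and cached on first use.
classifier = pipeline("sentiment-analysis")

# Run inference on an example sentence; each result carries a label and a confidence score.
result = classifier("Pre-trained models saved us weeks of training time.")[0]
print(result["label"], round(result["score"], 3))
```

The same pre-trained weights can instead be loaded into raw PyTorch or TensorFlow model classes for fine-tuning on a downstream task, which is the workflow described above.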
The industry will continue to evolve, and we believe Hugging Face will become the de facto technology and API that every developer uses for understanding and generating natural language. For anyone interested in working at the intersection of NLP and open source, the team is hiring across all roles in New York and Paris!