AutoDistill: An End-to-End Fully Automated Distillation Framework for Hardware-Efficient Large-Scale NLP Models

As AI-powered language models continue to grow in size, reducing serving cost has become an important research area. Knowledge distillation has emerged as a promising and effective method for model compression, but existing distillation methods can struggle with model serving in today’s massive…




