Sam Sucik in Rasa Blog
Learn how to use pruning to speed up BERT
Let’s compress BERT by removing its weight connections and neurons in TensorFlow. We make BERT smaller, faster, and get insights into its…
Sep 5, 2019

Sam Sucik in Rasa Blog
Learn how to make BERT smaller and faster
Let’s look at compression methods for neural networks, such as quantization and pruning. Then, we apply one to BERT using TensorFlow Lite.
Aug 8, 2019