People following Natural Language Processing (NLP) research closely will know that latterly introduced language models like BERT [7], GPT are radically changing the NLP field. In NLP, unsupervised pre-training of language models improved many tasks such as text classification, semantic textual similarity, machine translation. Have you ever wondered how these…