T5 : Machine Learning Model to Generate Text From Text

David Cochard
Published in axinc-ai
3 min read · Dec 13, 2023

This is an introduction to "T5", a machine learning model that can be used with ailia SDK. You can easily use this model to create AI applications using ailia SDK as well as many other ready-to-use ailia MODELS.

Overview

T5 stands for Text-to-Text Transfer Transformer, a machine learning model that generates text from text, released by Google in October 2019.

While the traditional BERT model predicts a single masked token in a sequence of tokens, T5 generates the text for multiple masked spans of the sequence at once. This enables fine-tuning for text generation tasks such as summarization.

Architecture

T5 considers natural language processing to be a text-to-text task, taking text as input and generating text as output, inspired by other similar tasks such as Question Answering, Language Modeling, and Span Extraction.

T5 overview (Source: https://arxiv.org/pdf/1910.10683.pdf)

During training, specific words are masked, and the text that predicts these masked words is used as the training data.

T5 training (Source: https://arxiv.org/pdf/1910.10683.pdf)
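This span-corruption training format can be sketched in plain Python. The sketch below is an illustrative toy: the span positions are fixed by hand, whereas T5 samples spans at random (covering roughly 15% of tokens), but the input/target format follows the example given in the paper.

```python
def span_corrupt(tokens, spans):
    """Mask the given (start, end) token spans with T5-style sentinel
    tokens. Returns the corrupted input text and the target text that
    the model is trained to predict."""
    inp, tgt = [], []
    prev = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        inp += tokens[prev:start] + [sentinel]   # replace span with a sentinel
        tgt += [sentinel] + tokens[start:end]    # target lists each masked span
        prev = end
    inp += tokens[prev:]
    return " ".join(inp), " ".join(tgt)

tokens = "Thank you for inviting me to your party last week .".split()
inp, tgt = span_corrupt(tokens, [(2, 4), (8, 9)])
print(inp)  # Thank you <extra_id_0> me to your party <extra_id_1> week .
print(tgt)  # <extra_id_0> for inviting <extra_id_1> last
```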

T5 has a general transformer encoder and decoder structure.

T5 structure (Source: http://jalammar.github.io/illustrated-transformer/)
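As a rough mental model of the encoder-decoder loop: the encoder reads the whole input once, then the decoder emits one token per step, each step conditioned on the encoder output and the tokens generated so far. The sketch below substitutes a stub step function for the actual transformer, purely to show the control flow.

```python
def greedy_decode(encoder_states, step_fn, bos, eos, max_len=20):
    """Sketch of seq2seq generation: call step_fn repeatedly, feeding it
    the encoder output plus everything decoded so far, until it emits
    the end-of-sequence token."""
    output = [bos]
    for _ in range(max_len):
        next_token = step_fn(encoder_states, output)
        if next_token == eos:
            break
        output.append(next_token)
    return output[1:]  # drop the start-of-sequence token

# Stub standing in for the transformer: it simply echoes the encoder
# states one by one, then signals end-of-sequence.
def echo_step(enc, out):
    i = len(out) - 1
    return enc[i] if i < len(enc) else "<eos>"

print(greedy_decode(["bonjour", "monde"], echo_step, "<bos>", "<eos>"))
# ['bonjour', 'monde']
```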

T5 tokenizer

The tokenizer for T5 uses SentencePiece: the text is first split on special tokens, and each remaining piece is then encoded with SentencePieceProcessor.
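As a rough illustration of that pre-split step: special tokens such as the T5 sentinels must not be broken apart by the subword model, so the text is split around them first and only the plain-text pieces go to the subword encoder. The regex pattern below is a simplified assumption for illustration, not the exact rule used by the T5 tokenizer.

```python
import re

# Simplified pattern covering T5 sentinel tokens and the end-of-sequence
# marker (assumption for illustration).
SPECIAL = re.compile(r"(<extra_id_\d+>|</s>)")

def split_on_special_tokens(text):
    """Split text so that special tokens become standalone pieces;
    the non-special pieces would then be encoded by SentencePiece."""
    return [piece for piece in SPECIAL.split(text) if piece]

print(split_on_special_tokens("Thank you <extra_id_0> me</s>"))
# ['Thank you ', '<extra_id_0>', ' me', '</s>']
```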

T5 accuracy evaluation

For the accuracy evaluation of the T5 summarization model, ROUGE (Recall-Oriented Understudy for Gisting Evaluation) can be used. ROUGE is an accuracy evaluation based on Precision and Recall: Precision is the proportion of generated words that appear in the reference summary, Recall is the proportion of reference-summary words that appear in the generated summary, and the two are combined into an F1 score. The concept is similar to the mAP (mean Average Precision) used in Object Detection.
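The ROUGE-1 variant of this metric (unigram overlap) is simple enough to write out directly. The sketch below computes Precision, Recall, and their F1 score for a generated summary against a single reference.

```python
from collections import Counter

def rouge1_f1(generated, reference):
    """ROUGE-1 F1: unigram overlap between generated and reference text.
    Precision = overlap / generated length, Recall = overlap / reference
    length, combined into the F1 score."""
    gen, ref = generated.split(), reference.split()
    # Clipped overlap: each word counts at most as often as it appears
    # in the reference.
    overlap = sum((Counter(gen) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(gen)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

print(rouge1_f1("the cat sat on the mat", "the cat lay on the mat"))
# ≈ 0.833 (5 of 6 unigrams overlap in each direction)
```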

Examples of T5 execution

T5 can be used for many text-generation tasks. For example, let's run it on the text of the Medium article you're currently reading, up to this point, and ask T5 to generate a title for it.

T5 — A Machine Learning Model for AI Applications

Pretty straightforward. Now let's ask for a summary of the article.

T5 is a machine learning model that can be used with ailia SDK to create AI applications. T5 considers natural language processing to be a text-to-text task, taking text as input and generating text as output. The tokenizer for T5 uses SentencePiece. For the accuracy evaluation of the T5 summarization model, ROUGE is used.

T5 with Japanese language support

A well-known T5 that supports Japanese is sonoisa/t5-base-japanese, whose base model is available on Hugging Face.

In May 2023, small, base, large, and xl variants were also released.

It is said that training T5 on a TPU v3-128 pod takes about 3 to 4 days and costs approximately $10,000 in server expenses.

Use T5 in ailia SDK

The T5 models available in ailia SDK are versions fine-tuned for Japanese. For example, you can run the title generation model, similarly to what we did earlier, on an input file input.txt with the following command.

$ python3 t5_base_japanese_title_generation.py -i input.txt

ax Inc. has developed ailia SDK, which enables cross-platform, GPU-based rapid inference.

ax Inc. provides a wide range of services from consulting and model creation, to the development of AI-based applications and SDKs. Feel free to contact us for any inquiry.
