Released ailia Tokenizer 1.3

David Cochard
Published in axinc-ai
Aug 19, 2024

We have released ailia Tokenizer 1.3, which enables mutual conversion between text and tokens. We have also introduced a new Python API and applied it to ailia MODELS.

Overview

ailia Tokenizer is a library that converts text to tokens and vice versa. Natural language processing with AI requires a tokenizer to convert text into tokens that the model can process. Traditionally, this task was handled by Transformers, but because Transformers offers only a Python API, it was difficult to use from C++, Unity, or Flutter. ailia Tokenizer addresses this issue by providing a tokenizer that is available across multiple platforms.
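To make "mutual conversion between text and tokens" concrete, here is a minimal toy sketch. It is not the ailia Tokenizer API: the ToyTokenizer class, its vocabulary, and the whitespace splitting are all hypothetical and exist only to show the encode/decode round trip.

```python
class ToyTokenizer:
    """Whitespace tokenizer with a fixed vocabulary (illustration only)."""

    def __init__(self, vocab):
        # Map each token string to an integer ID and back.
        self.token_to_id = {token: i for i, token in enumerate(vocab)}
        self.id_to_token = {i: token for token, i in self.token_to_id.items()}
        self.unk_id = self.token_to_id["[UNK]"]

    def encode(self, text):
        # Text -> token IDs; words outside the vocabulary map to [UNK].
        return [self.token_to_id.get(word, self.unk_id) for word in text.lower().split()]

    def decode(self, ids):
        # Token IDs -> text.
        return " ".join(self.id_to_token[i] for i in ids)


tokenizer = ToyTokenizer(["[UNK]", "hello", "world"])
ids = tokenizer.encode("Hello world")
print(ids)                    # [1, 2]
print(tokenizer.decode(ids))  # hello world
```

Real tokenizers split text into subwords rather than whole words, but the interface has the same shape: text in, integer IDs out, and back again.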

New features

Addition of new tokenizers

We have added support for new tokenizers, including GPT2 and LLAMA. GPT2 is utilized in GPT2, MSCLAP, and BLIP2, while LLAMA is used in llava.

Performance optimization

We have optimized the BPE logic for Whisper and CLIP, resulting in a significant gain in processing speed.
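For readers unfamiliar with BPE (byte pair encoding), the sketch below shows the textbook merge loop for a single word. It is an illustration only and not the ailia Tokenizer implementation, and this article does not describe which part of the logic was optimized. The function name bpe_encode and the tiny merge table are hypothetical.

```python
def bpe_encode(word, merge_ranks):
    """Apply BPE merges to one word, lowest rank (highest priority) first."""
    symbols = list(word)
    while len(symbols) > 1:
        # Find the adjacent pair with the best (lowest) merge rank.
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        best = min(pairs, key=lambda pair: merge_ranks.get(pair, float("inf")))
        if best not in merge_ranks:
            break  # No learned merge applies any more.
        # Merge every occurrence of that pair, scanning left to right.
        merged, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


# Hypothetical merge table: earlier entries have higher priority.
merges = [("l", "o"), ("lo", "w"), ("e", "r")]
merge_ranks = {pair: rank for rank, pair in enumerate(merges)}
print(bpe_encode("lower", merge_ranks))  # ['low', 'er']
```

This naive version rescans every adjacent pair after each merge, so its cost grows quickly with word length. That is one reason the BPE step tends to be a natural target for optimization.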

Python API support

A Transformers-compatible API has been added, so ailia Tokenizer can be called directly from Python. Because Transformers uses TensorFlow and Torch as backends, it presents the following challenges:

  • Loading the libraries takes considerable time.
  • When used in Docker, the image size becomes large.
  • The cuDNN version used by Torch may conflict with the cuDNN versions used by ailia SDK or ONNX Runtime.
  • Changes in the Transformers API specifications can cause models to stop working, even with the same arguments.

ailia Tokenizer resolves these issues by providing a stable tokenizer with minimal dependencies.
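Because the API is Transformers-compatible, swapping backends can in principle be reduced to changing which module a tokenizer class is imported from. The helper below is a minimal sketch of that idea. It assumes the Python module is named ailia_tokenizer and that it exposes the same class names as Transformers (for example BertTokenizer); treat both as assumptions to check against the official documentation. The function load_tokenizer_class is hypothetical.

```python
import importlib


def load_tokenizer_class(class_name, module_names=("ailia_tokenizer", "transformers")):
    """Return (module_name, cls) from the first module that provides class_name."""
    for module_name in module_names:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # Backend not installed; try the next one.
        cls = getattr(module, class_name, None)
        if cls is not None:
            return module_name, cls
    raise ImportError(f"{class_name} not found in any of: {', '.join(module_names)}")


# Hypothetical usage, assuming one of the backends is installed and that
# "./tokenizer/" (a made-up path) contains the tokenizer files:
# backend, BertTokenizer = load_tokenizer_class("BertTokenizer")
# tokenizer = BertTokenizer.from_pretrained("./tokenizer/")
```

Importing lazily like this means a script loads only the backend it actually uses, so the heavier dependency is not pulled in unless it is needed.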

Usage with ailia MODELS

With the provision of the ailia Tokenizer Python API, all models in ailia MODELS that require a tokenizer now use ailia Tokenizer by default instead of Transformers.

Out of the 336 models in ailia MODELS at the time of writing, the following 39 models utilize ailia Tokenizer.

audio_processing/clap
audio_processing/distil-whisper
audio_processing/msclap
audio_processing/kotoba-whisper
diffusion/latent-diffusion-txt2img
diffusion/stable-diffusion-txt2img
diffusion/control_net
diffusion/riffusion
diffusion/marigold
image_captioning/blip2
image_classification/japanese-stable-clip-vit-l-16
image_classification/japanese-clip
large_language_model/llava
natural_language_processing/bert
natural_language_processing/bert_insert_punctuation
natural_language_processing/bert_maskedlm
natural_language_processing/bert_ner
natural_language_processing/bert_sentiment_analysis
natural_language_processing/bert_tweet_sentiment
natural_language_processing/bertjsc
natural_language_processing/cross_encoder_mmarco
natural_language_processing/fugumt-en-ja
natural_language_processing/fugumt-ja-en
natural_language_processing/multilingual-e5
natural_language_processing/sentence_transformers_japanese
natural_language_processing/t5_base_japanese_title_generation
natural_language_processing/bert_sum_ext
natural_language_processing/bert_zero_shot_classification
natural_language_processing/t5_base_japanese_summarization
natural_language_processing/t5_whisper_medical
natural_language_processing/gpt2
natural_language_processing/rinna
natural_language_processing/bert_question_answering
natural_language_processing/glucose
natural_language_processing/bert_maskedlm_proofreeding
natural_language_processing/soundchoice-g2p
network_intrucation_detection/bert-network-packet-flow-header-payload
network_intrucation_detection/falcon-adapter-network-packet
object_detection/glip

You can still use Transformers as before by specifying the option below.

--disable_ailia_tokenizer
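As a rough sketch of how a script might wire up this option, the example below parses the flag with argparse and maps it to a backend module name. Only the flag name comes from this article. The helper functions, the help text, and the module name ailia_tokenizer are assumptions, and the actual ailia MODELS scripts may be structured differently.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Tokenizer backend selection (sketch)")
    parser.add_argument(
        "--disable_ailia_tokenizer",
        action="store_true",
        help="use Transformers instead of ailia Tokenizer",
    )
    return parser


def tokenizer_module_name(args):
    # ailia Tokenizer is the default; the flag opts back into Transformers.
    return "transformers" if args.disable_ailia_tokenizer else "ailia_tokenizer"


default_args = build_parser().parse_args([])
legacy_args = build_parser().parse_args(["--disable_ailia_tokenizer"])
print(tokenizer_module_name(default_args))  # ailia_tokenizer
print(tokenizer_module_name(legacy_args))   # transformers
```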

More info on ailia Tokenizer

For more information on ailia Tokenizer, please refer to the article below.

ax Inc. has developed ailia SDK, which enables cross-platform, GPU-based rapid inference.

ax Inc. provides a wide range of services from consulting and model creation, to the development of AI-based applications and SDKs. Feel free to contact us for any inquiry.
