Introducing ThirdAI’s Universal Deep Transformers Toolkit

A General-Purpose, CPU-Based AutoML Interface for a Wide Range of Machine Learning Problems

Vihan Lakshman
ThirdAI Blog
Dec 20, 2022

As we introduced in our previous post, ThirdAI is a startup dedicated to democratizing artificial intelligence by enabling all developers to train and deploy large-scale neural networks on commodity CPU hardware. Through algorithmic and software innovations, we aim to reduce the cost of developing state-of-the-art machine learning solutions by orders of magnitude.

In this post, we are excited to introduce ThirdAI’s latest product offering: Universal Deep Transformers (UDT), an AutoML interface for tackling a wide variety of machine learning tasks with only a few lines of Python code and no manual hyper-parameter tuning. With UDT, any developer or business organization can get up and running with cutting-edge deep learning capabilities on a standard CPU workstation, without needing machine learning expertise.

In the remainder of this post, we will summarize the benefits of UDT and walk through several use cases, which are also available for you to try out as Google Colab notebooks.

UDT Capabilities

  1. Universal Interface: UDT can tackle a broad range of machine learning tasks and data modalities, from natural language processing, to search, to recommendations, to tabular data analytics, to time series, to text reformulation, and more, all through the same API. This universal API simplifies the workflow, especially as customers look to apply UDT to multiple business problems.
  2. Automated Parameter Tuning: ThirdAI’s proprietary algorithms for neural network training and inference are based on years of research in applying hashing and probabilistic data structures towards machine learning. These tools, while effective, involve a lot of custom hyper-parameters that require considerable domain expertise to tune correctly for a given dataset and workload. However, we have eliminated this bottleneck by developing new mathematical techniques for automatically selecting the optimal hyper-parameters associated with our algorithms. As a customer, you only need to invoke the UDT interface on your dataset and then sit back and relax knowing ThirdAI’s technology will select the optimal hyper-parameters and perform feature engineering automatically without any additional computational overhead.
  3. Billion-Scale Training: Thanks to ThirdAI’s software engineering innovations, as well as the larger memory available to CPUs compared to GPUs, UDT can seamlessly scale to datasets with billions of entries and to extreme classification tasks with hundreds of millions of output labels.
  4. Sub-Millisecond Inference Latency: By utilizing ThirdAI’s proprietary sparse inference algorithms, UDT can achieve inference latencies under 1 millisecond on standard CPUs regardless of the overall model size.
  5. Immediately Production-Ready: We designed UDT with production deployment at the forefront. After training a UDT model, customers can immediately save the network in a serialized format that can be loaded in a variety of runtime environments with no additional engineering effort.
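
A latency figure like the one in item 4 is easiest to trust when it is measured on your own hardware. The sketch below is one possible way to time single-sample predictions using only the Python standard library. It is not part of UDT: `measure_latency_ms` and `dummy_predict` are hypothetical names introduced here, and `dummy_predict` merely stands in for whatever prediction call you want to time.

```python
import statistics
import time


def measure_latency_ms(predict_fn, samples, warmup=10):
    """Return (median, p99) single-sample latency in milliseconds."""
    # Warm-up calls so that one-time costs (lazy initialization, cold caches)
    # are not counted in the measurements.
    for sample in samples[:warmup]:
        predict_fn(sample)

    timings = []
    for sample in samples:
        start = time.perf_counter()
        predict_fn(sample)
        timings.append((time.perf_counter() - start) * 1000.0)

    timings.sort()
    p99 = timings[min(len(timings) - 1, int(0.99 * len(timings)))]
    return statistics.median(timings), p99


# Hypothetical stand-in for a model's single-sample prediction call.
def dummy_predict(sample):
    return sum(sample)


median_ms, p99_ms = measure_latency_ms(dummy_predict, [[1.0, 2.0, 3.0]] * 1000)
print(f"median: {median_ms:.4f} ms, p99: {p99_ms:.4f} ms")
```

Reporting a tail percentile alongside the median is a deliberate choice: production latency budgets are usually set by the slowest requests rather than the typical one.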

UDT Case Studies

In this section, we highlight several practical use cases for UDT. All of these examples are also available as Google Colab notebooks. We encourage you to run these notebooks and see UDT in action for yourself!

Census Income Prediction

Our first example involves a classification task on tabular data. In particular, the objective of the Kaggle Census Income challenge is to predict whether a given individual’s income exceeds $50,000 annually. Once we have the tabular dataset downloaded, we can define our UDT model as follows:

from thirdai import bolt
from thirdai.demos import download_census_income

train_filename, test_filename, inference_batch = download_census_income()

model = bolt.UniversalDeepTransformer(
    data_types={
        "age": bolt.types.numerical(range=(17, 90)),
        "workclass": bolt.types.categorical(),
        "fnlwgt": bolt.types.numerical(range=(12285, 1484705)),
        "education": bolt.types.categorical(),
        "education-num": bolt.types.categorical(),
        "marital-status": bolt.types.categorical(),
        "occupation": bolt.types.categorical(),
        "relationship": bolt.types.categorical(),
        "race": bolt.types.categorical(),
        "sex": bolt.types.categorical(),
        "capital-gain": bolt.types.numerical(range=(0, 99999)),
        "capital-loss": bolt.types.numerical(range=(0, 4356)),
        "hours-per-week": bolt.types.numerical(range=(1, 99)),
        "native-country": bolt.types.categorical(),
        "label": bolt.types.categorical(),
    },
    target="label",
    n_target_classes=2,
)

And then we can train, evaluate, save, and load the model in just four lines of code. Note that aside from the learning rate and number of epochs, all other training parameters are automatically selected.

# Training the model
model.train(train_filename, epochs=5, learning_rate=0.01, metrics=["categorical_accuracy"])

# Evaluating the model
model.evaluate(test_filename, metrics=["categorical_accuracy"])

# Saving
model.save("income_prediction.model")

# Loading
model = bolt.UniversalDeepTransformer.load("income_prediction.model")
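
The model definition in this example passes explicit `range=(min, max)` bounds for each numerical column. For a new dataset those bounds have to be computed first. The following is a minimal sketch of one way to do that with the standard library, assuming the data is a CSV file with a header row. The sample rows are made up for illustration, and `column_range` is a hypothetical helper rather than part of the `thirdai` package.

```python
import csv
import io

# Tiny in-memory stand-in for a training CSV; these rows are illustrative only.
csv_text = """age,workclass,hours-per-week,label
39,State-gov,40,<=50K
50,Self-emp,13,<=50K
28,Private,60,>50K
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))


def column_range(rows, column):
    """Return (min, max) if every value in the column parses as a number, else None."""
    try:
        values = [float(row[column]) for row in rows]
    except ValueError:
        return None  # at least one non-numeric value: treat the column as categorical
    return min(values), max(values)


# Collect bounds for every column that is entirely numeric.
numeric_ranges = {}
for column in rows[0].keys():
    bounds = column_range(rows, column)
    if bounds is not None:
        numeric_ranges[column] = bounds

print(numeric_ranges)
```

Each resulting pair could then be supplied as the `range` argument for the matching column. Note that a purely numeric column may still be better treated as categorical, as the post does with `education-num`, so the output is a starting point rather than a final schema.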

Intent Classification

For our second example, we will turn our attention to the natural language processing task of intent classification, which involves predicting an intent from a fixed set of possible labels for a given text input. For this demonstration, we use the Clinc150 benchmark dataset. As shown below, the interface is the same as the previous tabular example despite the significant differences in the data format.

from thirdai import bolt
from thirdai.demos import download_clinc_dataset

train_filename, test_filename, inference_batch = download_clinc_dataset()

model = bolt.UniversalDeepTransformer(
    data_types={
        "text": bolt.types.text(),
        "category": bolt.types.categorical(),
    },
    target="category",
    n_target_classes=150,
)

model.train(train_filename, epochs=5, learning_rate=0.01, metrics=["categorical_accuracy"])

model.evaluate(test_filename, metrics=["categorical_accuracy"])

save_location = "intent_classification.model"

# Saving
model.save(save_location)

# Loading
model = bolt.UniversalDeepTransformer.load(save_location)
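
To make the data format concrete, the sketch below builds a tiny CSV with the two columns named in `data_types` above, `text` and `category`. The rows are hypothetical and written only for illustration. The actual file produced by `download_clinc_dataset` may encode categories differently (for example, as integer ids), so treat this as an assumption about the layout rather than a description of it.

```python
import csv
import io

# Hypothetical examples in the two-column layout implied by data_types.
rows = [
    {"text": "what is my checking account balance", "category": "balance"},
    {"text": "set a timer for ten minutes", "category": "timer"},
    {"text": "how do you say hello in french", "category": "translate"},
]

# Write the rows as CSV with a header line naming the columns.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["text", "category"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())

# Read the data back and count the distinct labels in the target column.
parsed = list(csv.DictReader(io.StringIO(buffer.getvalue())))
n_classes = len({row["category"] for row in parsed})
print(n_classes)
```

The distinct-label count is the quantity that `n_target_classes` describes, which is why the Clinc150 example sets it to 150.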

Query Reformulation

For our final example, we will look at the problem of query reformulation, which is a critical task for many search engines. In short, query reformulation involves rewriting a never-before-seen user query, such as one with typos or non-standard descriptions, into a canonical version for which the search engine has already cached high-quality results. Although this task appears very different from the supervised learning problems we have considered so far in this post, the UDT interface remains almost identical:

from thirdai import bolt
from thirdai.demos import prepare_query_reformulation_data

import pandas

train_filename, test_filename, inference_batch = prepare_query_reformulation_data()

model = bolt.UniversalDeepTransformer(
    source_column="source_queries",
    target_column="target_queries",
    dataset_size="medium",
)

model.train(filename=train_filename)

query_reformulations = model.evaluate(filename=test_filename, top_k=5)

model_location = "query_reformulation.model"

# Saving
model.save(filename=model_location)

# Loading
model = bolt.UniversalDeepTransformer.load(model_location)
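
To illustrate what the task itself asks for, here is a toy sketch that maps a misspelled query onto a small set of canonical queries. It is not how UDT works: UDT learns the mapping from source and target query pairs, whereas this example swaps in plain string similarity from Python's standard `difflib` module purely to show the input and output of the problem. The canonical queries and the `reformulate` helper are hypothetical.

```python
import difflib

# Hypothetical canonical queries for which results are assumed to be cached.
canonical_queries = [
    "wireless noise cancelling headphones",
    "stainless steel water bottle",
    "mechanical keyboard",
]


def reformulate(query, candidates, top_k=5):
    """Return up to top_k candidates ranked by string similarity to the query."""
    return difflib.get_close_matches(query.lower(), candidates, n=top_k, cutoff=0.5)


# A never-before-seen query containing several typos.
suggestions = reformulate("wirless noise canceling headphons", canonical_queries)
print(suggestions)
```

Character-level similarity handles simple typos but not paraphrases or non-standard descriptions, which is where a learned model becomes useful.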

In future posts, we will dive deeper into more of the details behind these various applications. For now, the key takeaway is that we can solve a diverse array of technical challenges with the same UDT interface.

Conclusion

In this post, we introduced ThirdAI’s latest product offering: Universal Deep Transformers (UDT). With UDT, customers have access to a state-of-the-art AutoML interface for machine learning that operates efficiently on CPUs without the need for tedious manual parameter tuning or domain expertise in deep learning. We encourage you to try out UDT for yourself through our demo notebooks and visit our website. To use UDT for your business needs, please reach out to us by requesting a trial license for our software.
