Introducing Texar-PyTorch: An ML Library Integrating the Best of TensorFlow into PyTorch

  • Data: Best practices for easy data processing, batching, and iteration, all made efficient by buffered shuffling, caching, and lazy loading. We also replicate TFRecord to ingest arbitrarily complex data types and large files.
  • Modeling: Abundant functions and excellent modularization of ML models, such as the principled design of sequence models, including text generation decoders, attention mechanisms, and RNNs.
  • Training: We replicate high-level APIs of TF Estimator and keras.Model but with much greater flexibility, for turnkey model training, evaluation, prediction, TensorBoard visualization, and seamless combination with external hyperparameter tuning tools.

What Texar-PyTorch Provides

  • State-of-the-Art Model Building Blocks: building an ML model is like assembling Lego bricks. Plug in and swap out modules as you like.
  • Easy and Efficient Data Processing: rich built-in processors for common types of datasets, and simple but powerful interfaces for arbitrary custom ones. Best practices are integrated, so there is no need to worry about efficiency.
  • Turnkey and Flexible Model Training with Executors: free yourself from boilerplate code for training and evaluation loops, while keeping the flexibility to customize for your specialized needs.
Code Example 1: Building and training a conditional GPT-2 model (e.g., for text summarization) with Texar-PyTorch.

Why Choose Texar?

  • Supports both TensorFlow & PyTorch. Sometimes the choice of underlying framework isn’t yours to make, and learning a new higher-level framework is probably just as time-consuming as writing the parts yourself. Now with Texar, you can use the same interfaces with minimal changes in both frameworks. The two versions can even share pre-trained model weights that you’ve downloaded.
  • Provides Natural Language Processing, All in One Kit. Texar has comprehensive coverage of neural models for natural language processing tasks, especially text generation. Figure 1 gives a snapshot of Texar modules. With Texar, not only will you have access to a complete range of state-of-the-art pre-trained models, but you’ll also find all the utilities you need, from data processing to modeling to training and evaluation. We’ve got you covered.
  • Is Novice- and Expert-Friendly. Whether you’ve just picked up deep learning, or you’re an experienced researcher, you’ll find Texar easy to use. Texar provides state-of-the-art built-in components but remains flexible enough for customization.
Figure 1: Texar provides a comprehensive set of modules for data processing, model architectures, loss functions, training, and evaluation, as well as a range of state-of-the-art pre-trained ML/NLP models (e.g., BERT and GPT-2).


  • It’s straightforward to invoke a common inference method, e.g., teacher-forcing decoding, by simply setting the decoder argument `decoding_strategy='train_greedy'`.
  • On the other hand, to perform advanced inference, e.g., Gumbel-softmax decoding for adversarial learning, users can use a GumbelSoftmaxHelper. Expert users can further define new Helpers to implement any custom decoding strategy.
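To make the Gumbel-softmax idea concrete, here is a minimal, framework-free sketch of a single decoding step. It is not Texar's GumbelSoftmaxHelper; the function names `gumbel_softmax` and `greedy` and the `tau` parameter are assumptions made for illustration. The point is that a greedy step picks one hard token, while the Gumbel-softmax step returns a soft, noise-perturbed distribution, which is what makes it usable inside a differentiable adversarial objective.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Return a relaxed one-hot sample over `logits` at temperature `tau`."""
    # Gumbel(0, 1) noise: g = -log(-log(u)), with u ~ Uniform(0, 1).
    noisy = []
    for logit in logits:
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep log() finite
        noisy.append((logit - math.log(-math.log(u))) / tau)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def greedy(logits):
    """Greedy decoding step: index of the highest-scoring token."""
    return max(range(len(logits)), key=logits.__getitem__)


logits = [2.0, 0.5, -1.0]  # made-up scores for a 3-token vocabulary
soft_sample = gumbel_softmax(logits, tau=0.5, rng=random.Random(0))
hard_choice = greedy(logits)
```

Lowering `tau` pushes the soft sample toward a one-hot vector; raising it flattens the sample toward uniform.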
Code Example 2: Building a pre-trained GPT-2 language model, fine-tuning with maximum-likelihood learning and adversarial learning (using BERT as the discriminator).
  • Excellent modularization — switching between different learning contexts is enabled by simply plugging in/swapping out a couple of modules.
  • Multi-level interfaces — high-level intuitive interfaces for novice users and low-level highly-customizable ones for expert users.
  • Built-in state-of-the-art pre-trained models — BERT, GPT-2, RoBERTa, XLNet and more, for tasks of text encoding, classification, sequence tagging, and generation.


  • Decoupling single instance processing and batching — for clearer program logic and easier customization
  • Buffer-based shuffling, caching, and lazy-loading — for greater efficiency
  • Extensive dataset iterators — no extra user configuration needed
  • More intuitive APIs — no expertise needed to get the best practices in your project
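As a rough illustration of the buffer-based shuffling and lazy loading mentioned above, the sketch below shuffles a stream using a fixed-size buffer, so the full dataset never has to sit in memory. This is a generic version of the technique, not Texar's implementation; `buffered_shuffle`, `lazy_records`, and their parameters are hypothetical names.

```python
import random


def buffered_shuffle(items, buffer_size, seed=None):
    """Yield `items` in approximately random order using a bounded buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:  # `items` may be a lazy iterator over a large file
        buffer.append(item)
        if len(buffer) >= buffer_size:
            # Swap a random element to the end and emit it.
            idx = rng.randrange(len(buffer))
            buffer[idx], buffer[-1] = buffer[-1], buffer[idx]
            yield buffer.pop()
    # Input exhausted: drain whatever is left in random order.
    rng.shuffle(buffer)
    yield from buffer


def lazy_records(n):
    """Stand-in for lazy loading: records are produced only when requested."""
    for i in range(n):
        yield {"id": i}


shuffled_ids = [r["id"] for r in buffered_shuffle(lazy_records(10), 4, seed=0)]
```

A larger buffer gives a more thorough shuffle at the cost of memory; a buffer of size 1 leaves the order unchanged.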
Figure 2: Texar-PyTorch built-in datasets for a majority of ML and NLP tasks.
Code Example 3: Loading complex image captioning data with Texar-PyTorch RecordData.
Code Example 4: A customized dataset that performs BPE tokenization for input text.
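The decoupling of single-instance processing from batching can be sketched without any library as follows. This is not the original Code Example 4 and not Texar's dataset API: `ToyTextDataset` and its method names are hypothetical, and plain whitespace splitting stands in for a real BPE tokenizer.

```python
class ToyTextDataset:
    """Hypothetical dataset: per-example `process`, per-batch `collate`."""

    def __init__(self, lines, vocab, pad_id=0, unk_id=1):
        self.lines = lines
        self.vocab = vocab
        self.pad_id = pad_id
        self.unk_id = unk_id

    def process(self, raw_line):
        # Single-instance step: tokenize and map tokens to ids.
        # Whitespace splitting is a placeholder for BPE tokenization.
        return [self.vocab.get(tok, self.unk_id) for tok in raw_line.split()]

    def collate(self, examples):
        # Batch step: pad every example to the longest one in the batch.
        max_len = max(len(ex) for ex in examples)
        ids = [ex + [self.pad_id] * (max_len - len(ex)) for ex in examples]
        lengths = [len(ex) for ex in examples]
        return {"ids": ids, "lengths": lengths}

    def batches(self, batch_size):
        # Process examples one at a time, then hand each group to `collate`.
        for start in range(0, len(self.lines), batch_size):
            chunk = self.lines[start:start + batch_size]
            yield self.collate([self.process(line) for line in chunk])


vocab = {"hello": 2, "world": 3}
dataset = ToyTextDataset(["hello world", "hello", "foo"], vocab)
all_batches = list(dataset.batches(batch_size=2))
```

Because tokenization lives in `process` and padding lives in `collate`, either one can be changed without touching the other.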


  • Print logs every `logging_steps` iterations to the console, a log file, and TensorBoard.
  • Perform validation every `validate_steps` iterations, evaluating the model output with the BLEU metric.
  • If validation results improve, save the current checkpoint. If results fail to improve for `patience` consecutive trials, load the previous checkpoint and scale the learning rate.
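The checkpoint-and-patience rule in the last bullet is the part that is easiest to get wrong by hand, so here is a small sketch of just that logic. It is a hypothetical helper, not part of Texar: the class name `PlateauTracker`, the `lr_scale` parameter, and the dictionary used as a stand-in for model state are all assumptions.

```python
import copy


class PlateauTracker:
    """Tracks validation scores; hypothetical helper, not a Texar class."""

    def __init__(self, patience, lr, lr_scale=0.5):
        self.patience = patience
        self.lr = lr
        self.lr_scale = lr_scale
        self.best_score = float("-inf")
        self.best_state = None
        self.bad_rounds = 0

    def update(self, score, state):
        """Return the state to continue training from after one validation."""
        if score > self.best_score:
            # Improvement: checkpoint the current state and reset the counter.
            self.best_score = score
            self.best_state = copy.deepcopy(state)
            self.bad_rounds = 0
            return state
        self.bad_rounds += 1
        if self.bad_rounds >= self.patience:
            # Plateau: roll back to the saved checkpoint and scale the LR.
            self.bad_rounds = 0
            self.lr *= self.lr_scale
            return copy.deepcopy(self.best_state)
        return state


tracker = PlateauTracker(patience=2, lr=1.0)
state = {"w": 0}  # stand-in for model parameters
for step, bleu in enumerate([10.0, 12.0, 11.0, 11.5, 13.0]):  # made-up scores
    state["w"] = step  # pretend a training round updated the parameters
    state = tracker.update(bleu, state)
```

In this run the two non-improving scores (11.0 and 11.5) trigger one rollback and one learning-rate halving before the final improvement to 13.0.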
Code Example 5: A typical hand-written train-eval loop.
Code Example 6: The same train-eval loop with Executor.
  • Q: What if we also want to do validation after each epoch?
    A: Simply change `validate_every` to:
  • Q: What if we want to perform early stopping after we’ve scaled the learning rate `early_stop_patience` times?
    A: Simply change `action_on_plateau` to:
  • Q: What if we also want to measure the word-level loss?
    A: Simply add a new metric to `valid_metrics`:
  • Q: What if we want to do hyperparameter tuning and train the model multiple times?
    A: Simply create an Executor for each set of hyperparameters that you want to test. Since Executor takes care of everything besides model creation, you don’t need to worry about consuming extra memory or accidentally retaining objects from previous runs. Here’s an example of using Executor with hyperopt.
  • Q: What if, at the end of each epoch, we want to upload the current checkpoint to the server, send an email containing the training progress, and take the dog out for a walk?
    A: Weird, but okay. Simply register a custom action on a condition of your choice, and do whatever you wish:
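The general pattern behind these answers, pairing a condition with an action, can be sketched in a few lines. The code below is a toy illustration only and does not reproduce the Executor API: `MiniLoop`, its `on` method, and the event names are hypothetical.

```python
class MiniLoop:
    """Hypothetical training loop with pluggable (condition, action) hooks."""

    def __init__(self, num_epochs, steps_per_epoch):
        self.num_epochs = num_epochs
        self.steps_per_epoch = steps_per_epoch
        self.hooks = []  # (event_name, period, action) triples

    def on(self, event_name, period, action):
        """Run `action(loop_state)` every `period` occurrences of the event."""
        self.hooks.append((event_name, period, action))

    def _fire(self, event_name, count, state):
        for name, period, action in self.hooks:
            if name == event_name and count % period == 0:
                action(state)

    def run(self):
        step = 0
        for epoch in range(1, self.num_epochs + 1):
            for _ in range(self.steps_per_epoch):
                step += 1  # a real loop would do forward/backward here
                self._fire("iteration", step, {"epoch": epoch, "step": step})
            self._fire("epoch", epoch, {"epoch": epoch, "step": step})


events = []
loop = MiniLoop(num_epochs=2, steps_per_epoch=3)
loop.on("epoch", 1, lambda s: events.append(("upload_checkpoint", s["epoch"])))
loop.on("epoch", 1, lambda s: events.append(("walk_dog", s["epoch"])))
loop.on("iteration", 2, lambda s: events.append(("log", s["step"])))
loop.run()
```

The loop itself never changes; new behavior is added by registering another hook, which is the property the questions above rely on.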

Switching from Texar-TF to Texar-PyTorch

Getting Started

  • Documentation: We have detailed documentation for every module and function.
  • Examples: We strongly encourage you to check out our examples to get a basic idea of how Texar is used in practice. The examples are clearly documented and cover rich use cases.
  • ASYML Library: Find quick links to all Texar resources in one place.




