My Research at Tufts

Dan Pechi
3 min read · Aug 15, 2018

I’m particularly interested in the intersection of psycholinguistics and multi-agent, deep-learning-based NLP, because I believe the best way to advance the state of AI and overcome the challenges of language is to look to ourselves for inspiration. Deep learning opens our eyes to what goes on inside of us by reverse-engineering phenomena like vision and language, so it makes sense to raise language generators and translators under the same ‘natural’ conditions in which humans acquire language.

* Meta-Turing Test

This project grew out of work for my NLP class in Fall 2017 with Joe Howarth. The user interface can be found here. We initially aimed to create a traditional Turing test, using a long short-term memory (LSTM) recurrent neural network to generate language. We thought it would be fun to make the project more meta by having an AI judge, alongside the human judge, try to distinguish human-generated text from AI-generated text. We hoped our generator would be good enough to trick users into thinking it was human, while the discriminator would still make the correct determination between human- and AI-generated text. Success on both counts would suggest there are dimensions of human language the AI discriminator can identify that humans cannot.
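Roughly, the two pieces look something like the sketch below. The class names, single-layer LSTMs, and hyperparameters are illustrative assumptions for this post, not the code we actually turned in.

```python
import torch
import torch.nn as nn

class LSTMGenerator(nn.Module):
    """Word-level language model that generates text token by token."""
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, state=None):
        x = self.embed(tokens)                # (batch, seq, embed_dim)
        h, state = self.lstm(x, state)        # (batch, seq, hidden_dim)
        return self.out(h), state             # logits over the next token

class AIDiscriminator(nn.Module):
    """AI judge that scores a text as human- or AI-generated."""
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.classify = nn.Linear(hidden_dim, 1)

    def forward(self, tokens):
        x = self.embed(tokens)
        _, (h_n, _) = self.lstm(x)            # final hidden state
        return torch.sigmoid(self.classify(h_n[-1]))  # P(text is human)
```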

The project we turned in didn’t have the generator’s loss function take the discriminator’s determinations of human-ness into account, as it would in a Generative Adversarial Network; we’re currently working on modifying this component. We’re also updating the generator with features like skip connections and attention to reduce perplexity and, hopefully, produce a better model overall. Instead of producing dialogue responses as in Li et al. 2017, our work uses standard language models; the motivation is that the dialogues presented in that paper seemed far more human than language produced by most state-of-the-art language models, which makes any improvement easier to establish empirically.
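Concretely, the GAN-style coupling we’re aiming for might look something like the sketch below, reusing the illustrative generator and discriminator interfaces from the previous snippet. The REINFORCE-style handling of the non-differentiable sampling step and the mixing weight `lam` are assumptions about how we might wire this up, not settled design decisions.

```python
import torch
import torch.nn.functional as F

def generator_step(generator, discriminator, real_tokens, lam=0.5):
    # Standard language-model loss: predict each next token of real text.
    logits, _ = generator(real_tokens[:, :-1])
    lm_loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        real_tokens[:, 1:].reshape(-1),
    )

    # Adversarial term: sample a continuation, score it with the
    # discriminator, and use P(human) as a REINFORCE-style reward,
    # since sampling discrete tokens is not differentiable.
    probs = F.softmax(logits, dim=-1)
    sampled = torch.multinomial(probs.reshape(-1, probs.size(-1)), 1)
    sampled = sampled.reshape(real_tokens.size(0), -1)
    log_p = F.log_softmax(logits, dim=-1)
    log_p = log_p.gather(-1, sampled.unsqueeze(-1)).squeeze(-1)
    with torch.no_grad():
        reward = discriminator(sampled)       # (batch, 1), P(human)
    adv_loss = -(reward * log_p.mean(dim=1, keepdim=True)).mean()

    # The generator is now penalized both for poor next-token prediction
    # and for producing text the discriminator flags as machine-generated.
    return lm_loss + lam * adv_loss
```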

* Dual Inference for Universal Transformer-Based Machine Translation

This is the working title of my senior thesis. I really liked Google Brain’s Universal Transformer architecture because of its incorporation of translation theory. In addition to having constant path lengths between tokens, the model is capable of revising its translations to improve their quality, just as a human translator does. The model also eschews both the CNN and RNN paradigms; as such, it is not confined to hierarchical or sequential processing of language. In languages like Russian, where word order is largely free, processing language strictly sequentially doesn’t make much sense. The same principle applies to the variation in Subject-Verb-Object order across the world’s languages. The Universal Transformer’s efficiency is also notable: traditional RNN-based architectures can require thousands of dollars’ worth of GPUs to train, while the Universal Transformer requires just one. The computational efficiency of the brain suggests these simpler architectures might hold promise for actually replicating humans’ language skills.
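The core recurrence is simple enough to sketch: one shared block of self-attention and feed-forward layers applied repeatedly, with a step embedding telling the model which revision pass it is on. The version below assumes a fixed number of steps and learned (rather than sinusoidal) coordinate embeddings, so it is only an approximation of the published architecture.

```python
import torch
import torch.nn as nn

class UniversalTransformerEncoder(nn.Module):
    """Applies one weight-shared Transformer layer repeatedly, letting
    every position revise its representation at each step."""
    def __init__(self, d_model=512, n_heads=8, n_steps=6, max_len=512):
        super().__init__()
        self.shared_layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=2048, batch_first=True
        )
        self.n_steps = n_steps
        # Position and recurrence-step embeddings, re-added before every
        # application of the shared layer.
        self.pos_embed = nn.Embedding(max_len, d_model)
        self.step_embed = nn.Embedding(n_steps, d_model)

    def forward(self, x):                     # x: (batch, seq, d_model)
        positions = torch.arange(x.size(1), device=x.device)
        for t in range(self.n_steps):
            step = torch.tensor(t, device=x.device)
            x = x + self.pos_embed(positions) + self.step_embed(step)
            x = self.shared_layer(x)          # same weights at every step
        return x
```

Recurring over one shared layer, rather than stacking distinct layers, is what lets the model ‘revise’ each token’s representation, and it also keeps the parameter count far smaller than a comparably deep stack.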

I aim to integrate this novel architecture into a reinforcement-learning-inspired framework that incorporates other language models and a secondary, reverse translator, as proposed in Dual Learning for Machine Translation. Essentially, this ‘socializes’ the translator by grounding its understanding of language in how well other language models understand its output. Integrating this framework is critical, as humans cannot develop natural language faculties without linguistic feedback from their social interactions.
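In rough terms, the feedback signal for the forward translator combines a fluency reward from a target-language model with a reconstruction reward from the reverse translator. The sketch below follows the reward structure of the dual learning paper referenced above, but the `.sample` and `.log_prob` interfaces are placeholders for whatever the eventual models expose.

```python
def dual_learning_reward(src_sentence, forward_mt, backward_mt, tgt_lm, alpha=0.5):
    # 1. Translate the source sentence into the target language.
    translation = forward_mt.sample(src_sentence)

    # 2. Fluency ("communication") reward: how likely does a
    #    target-language model find the translation?
    r_lm = tgt_lm.log_prob(translation)

    # 3. Reconstruction reward: how well can the reverse translator
    #    recover the original sentence from that translation?
    r_rec = backward_mt.log_prob(src_sentence, given=translation)

    # The combined reward drives policy-gradient updates to the
    # forward translator (and, symmetrically, to the reverse one).
    return alpha * r_lm + (1 - alpha) * r_rec
```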

Currently, I’m working on building a simplified version of the underlying Transformer translators in PyTorch.
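As a starting point, something like the snippet below, built on PyTorch’s stock nn.Transformer module, covers the basic encoder-decoder translator before any Universal Transformer recurrence or dual-learning machinery is added; the vocabulary sizes and dimensions are placeholders.

```python
import torch
import torch.nn as nn

class SimpleTranslator(nn.Module):
    """Bare-bones sequence-to-sequence Transformer for translation."""
    def __init__(self, src_vocab, tgt_vocab, d_model=512):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, d_model)
        self.tgt_embed = nn.Embedding(tgt_vocab, d_model)
        # NOTE: positional encodings are omitted for brevity; a real
        # model needs to add them to both embeddings.
        self.transformer = nn.Transformer(d_model=d_model, batch_first=True)
        self.out = nn.Linear(d_model, tgt_vocab)

    def forward(self, src_tokens, tgt_tokens):
        # Causal mask so each target position attends only to earlier ones.
        tgt_mask = nn.Transformer.generate_square_subsequent_mask(
            tgt_tokens.size(1)
        ).to(tgt_tokens.device)
        h = self.transformer(
            self.src_embed(src_tokens),
            self.tgt_embed(tgt_tokens),
            tgt_mask=tgt_mask,
        )
        return self.out(h)                    # logits over the target vocab
```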
