AI Diary #3 — BERT, MELD, DepecheMood++, Future of AI and NLP, Anthropomorphism in AI,…

Topics for this entry include state-of-the-art NLP models, the future of AI and NLP, anthropomorphism in AI, Deep Learning Indaba course material, and much more.

elvis
DAIR.AI
Oct 19, 2018


BERT — State of the art model for a variety of natural language tasks

So far this year we have witnessed many breakthroughs in the field of natural language processing, from models that learn deep contextualized features (ELMo) to models that leverage fine-tuned language models to efficiently perform downstream tasks such as sentiment analysis (ULMFiT). More recently, a new system called BERT has been proposed, which changes how contextualized representations are learned: rather than the bidirectional LSTM layers used by ELMo, it pre-trains a deep bidirectional Transformer as the language representation model. The results are impressive, to say the least. I even tweeted about it and commented that the one thing I expect to see more of in future NLP models is the ability to address more complex natural language tasks. See the tweet below (I even got some love from Richard Socher, who retweeted it):
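While we are on the subject, here is a rough sketch of what fine-tuning a pre-trained BERT encoder for a downstream task like sentiment classification can look like. This is not from the paper itself; it uses the Hugging Face transformers library as a convenience, and the sentences, labels, and hyperparameters are just placeholders:

```python
# Minimal sketch: fine-tuning BERT for binary sentiment classification.
# Sentences, labels, and hyperparameters are illustrative placeholders.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

sentences = ["I loved this movie.", "The plot was a complete mess."]
labels = torch.tensor([1, 0])  # 1 = positive, 0 = negative

# Tokenize and pad the batch, then take one optimization step.
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
outputs = model(**inputs, labels=labels)
outputs.loss.backward()
optimizer.step()

print(float(outputs.loss))
```

In practice you would loop over mini-batches of a real labeled dataset for a few epochs, but the shape of the code stays the same.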

Anthropomorphism in AI

I have been thinking lately about releasing a set of illustrations and visuals that can be useful for the AI community, especially science communicators. Writers and content creators can incorporate the illustrations into their blog articles, code tutorials, educational videos, and other types of content. My hope is that we can all be a little more responsible in how we communicate about AI technologies and avoid misleading the general public about the true impact of AI on society and what the technology actually involves. I strongly believe journalists, researchers, developers, and everyone else involved in AI share this responsibility, so I would like to contribute by reducing the spread of misleading images such as the following illustration:

Image source

No offence intended to the author of the illustration above, but is this really how we want to depict the concept of deep learning: a human brain embedded on top of what looks like an electronic chip? Don't you think the concept could be depicted more accurately and carefully, without distorting what it really does and what it really is? It would help if we began to see more accurate imagery used to describe AI technologies, and to that end, I am working on a set of guides and examples on how we can achieve this more carefully through illustrations. This will be announced later in this publication.

I hope to continue this conversation and help others become more aware of the harm this form of communication does to the field. The whole idea of attributing human characteristics (e.g., human-like abilities) to machine learning models is called anthropomorphism, and it is discussed in more detail in a recent paper by Zachary Lipton and Jacob Steinhardt entitled “Troubling Trends in Machine Learning Scholarship.” I wrote a summary of the paper here if you don't want to read the whole thing.

Animated LSTMs

Recently, I found this amazing illustrated guide on how LSTMs work. I think this is the type of thing that is truly impressive about our community: a lot of people invest so much of their time teaching others about the beauty of neural networks. Check out more of the excellent work by Michael (the author of the illustrations), whose illustrated guides turn complex neural network concepts into very simple ones. Below you can see a preview of one of the beautiful animated illustrations made by Michael.

source
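If you also like to see the mechanics in code, here is a tiny NumPy sketch of a single LSTM step, just to pin down what the gates in those animations actually compute. The dimensions and weights are random placeholders:

```python
# A single LSTM step in plain NumPy: forget gate, input gate, output gate, cell update.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    # W, U, b hold the stacked parameters for the four gates (f, i, o, g).
    z = W @ x + U @ h_prev + b                 # shape: (4 * hidden,)
    hidden = h_prev.shape[0]
    f = sigmoid(z[0:hidden])                   # forget gate
    i = sigmoid(z[hidden:2 * hidden])          # input gate
    o = sigmoid(z[2 * hidden:3 * hidden])      # output gate
    g = np.tanh(z[3 * hidden:])                # candidate cell state
    c = f * c_prev + i * g                     # new cell state
    h = o * np.tanh(c)                         # new hidden state
    return h, c

# Toy dimensions and random placeholder weights.
input_dim, hidden_dim = 3, 4
rng = np.random.default_rng(0)
W = rng.normal(size=(4 * hidden_dim, input_dim))
U = rng.normal(size=(4 * hidden_dim, hidden_dim))
b = np.zeros(4 * hidden_dim)

h, c = np.zeros(hidden_dim), np.zeros(hidden_dim)
for x in rng.normal(size=(5, input_dim)):      # a toy sequence of 5 inputs
    h, c = lstm_step(x, h, c, W, U, b)
print(h)
```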

What’s Next in AI and NLP?

Since we are on the topic of NLP, here is a presentation by William Wang (Assistant Professor at UCSB) on where he believes NLP is headed. I particularly like the discussion about empathetic conversational agents. On that note, you can catch up with his recent ACL paper entitled MojiTalk, inspired by DeepMoji and a whole body of other work on emoji prediction and emoji-related language tasks. Together with his colleagues, Wang discusses the potential of using reinforcement learning and the latest advancements in deep learning to teach machines how to carry out dialogue with empathy awareness. I personally like this paper even though it only scratches the surface of what's possible with deep learning and NLP.

Here is Wang’s talk:

NLP in the distant future

In a recent interview, ACM Fellow Professor Qiang Yang was asked the following question: “Where do you see NLP in 5 to 10 years?” This was his response:

“I believe that natural language processing (NLP) is a more difficult AI problem to solve than its computer vision counterpart, as humans are used to expressing a vast range of meanings using a few words. Natural-language dialog will be a challenging and useful area in which to do research. If you have watched the HBO TV series “Westworld,” you will recall that scientists who built the robots believed in a theory of intelligence known as “bicameral mind” — that intelligence emerges from the dialog of two faculties in the brain. In our research, if we program a computer system to have a conversation with another system or human, it is likely to spark something different. I believe that a dialog is like playing chess: the two parties who talk to each other negotiate their thoughts through a series of sentences following a “policy,” which can be learned using a reinforcement-learning algorithm. Recently, we have worked on the problem of transferring a general dialog policy to a personalized policy that can learn an individual’s preferences”

I like his strong opinion that NLP is a difficult problem and that, in order to understand it better, we may have to build very complex reinforcement learning algorithms that are able to engage in various types of dialogue and conversational settings. Check out the rest of the interview here to learn more about what he thinks of the future of NLP and where the field is headed.
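To make the "dialogue as a policy learned with reinforcement learning" idea a bit more tangible, here is a toy REINFORCE-style sketch where a policy picks one of a few canned responses. The responses and the reward signal are made up purely to show the shape of the learning loop, not anything from Yang's work:

```python
# Toy REINFORCE update for a "dialogue policy" choosing among canned responses.
# The responses and the reward are made-up placeholders for illustration only.
import numpy as np

responses = ["I'm sorry to hear that.", "That's great news!", "Tell me more."]
theta = np.zeros(len(responses))  # one logit per response

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def fake_reward(action):
    # Stand-in for a learned empathy/engagement score of the chosen response.
    return 1.0 if action == 0 else 0.0

rng = np.random.default_rng(0)
lr = 0.1
for _ in range(500):
    probs = softmax(theta)
    action = rng.choice(len(responses), p=probs)
    reward = fake_reward(action)
    grad = -probs              # REINFORCE gradient of log pi(action)
    grad[action] += 1.0
    theta += lr * reward * grad

print(responses[int(np.argmax(theta))])  # the policy converges to the rewarded reply
```

A real dialogue policy would of course condition on the conversation history and generate text rather than pick from a fixed list, but the policy-gradient loop looks the same.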

Under-resourced languages

Lately, I have been noticing that a lot of papers report methods that could be useful for addressing problems in under-resourced languages. Not only do I strongly believe that this is an important research direction in machine learning and NLP, but I also believe it has merit for building technologies that help social groups better communicate with each other. Talk about the impact of AI technologies on society. If you are looking for a topic to research, you don't have to look very far. Focus on an under-resourced language, download a pre-trained model, and try to perform some task like sentiment classification. Start small and build up.
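As one way to start small, even before reaching for a pre-trained model, you could put together a tiny bag-of-words baseline for the language you care about. The file name and columns below are hypothetical; swap in whatever labeled data you can gather:

```python
# A "start small" sentiment baseline for an under-resourced language:
# TF-IDF features + logistic regression. The CSV path and columns are hypothetical.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("sentiment_data.csv")  # hypothetical columns: "text" and "label"
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, random_state=42
)

vectorizer = TfidfVectorizer(ngram_range=(1, 2), min_df=2)
clf = LogisticRegression(max_iter=1000)
clf.fit(vectorizer.fit_transform(X_train), y_train)

preds = clf.predict(vectorizer.transform(X_test))
print("macro F1:", f1_score(y_test, preds, average="macro"))
```

Once a baseline like this is in place, you have something concrete to beat with cross-lingual embeddings or a pre-trained model.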

Accelerating Data Science and ML

NVIDIA has released RAPIDS (rapids.ai) to accelerate machine learning and data science with pandas-like dataframes. These dataframe operations traditionally run on the CPU, but with RAPIDS it is now possible to load and operate on dataframes directly on the GPU. There are so many use cases for this technology that I had to feature it here. Go check it out!
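Here is a quick sketch of what the pandas-like workflow looks like with RAPIDS' cuDF library. The file and column names are placeholders; the point is that the familiar dataframe idioms run on the GPU:

```python
# A pandas-like workflow running on the GPU with RAPIDS cuDF.
# The file and column names below are placeholders.
import cudf

df = cudf.read_csv("transactions.csv")           # loaded straight into GPU memory
df["total"] = df["price"] * df["quantity"]       # element-wise ops on the GPU
summary = df.groupby("store_id")["total"].sum()  # GPU group-by aggregation
print(summary.sort_values(ascending=False).head())
```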

NLP Resources

Araque et al. released a new state-of-the-art emotion lexicon, known as DepecheMood++, for emotion recognition. I like the work the authors have put into this resource because they show that you can achieve remarkable results with simple yet powerful NLP techniques.
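To give a feel for how far a lexicon can take you, here is a sketch of naive lexicon-based emotion scoring: average the per-word emotion scores over a sentence. The file name and its layout below are my own assumptions for illustration, not the actual DepecheMood++ release format:

```python
# Naive lexicon-based emotion scoring: average per-word emotion scores.
# The lexicon file name and its column layout are assumptions for illustration.
import pandas as pd

# Assumed layout: first column is the word, remaining columns are emotion scores.
lexicon = pd.read_csv("emotion_lexicon.tsv", sep="\t", index_col=0)

def emotion_scores(sentence):
    tokens = [t.lower() for t in sentence.split()]
    known = [t for t in tokens if t in lexicon.index]
    if not known:
        return None
    return lexicon.loc[known].mean()  # average each emotion column over the words

print(emotion_scores("what a sad and terrible accident"))
```

Simple as it is, this kind of averaging is a surprisingly strong starting point for emotion recognition, which is exactly the spirit of the paper.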

Another great emotion-related dataset released this month is known as MELD. The dataset contains multimodal, multi-party conversations with more than two speakers in a dialogue. The dataset could be used to build powerful empathy-aware conversational agents. See the diagram below to get a feel for what type of data is contained in MELD.
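Once you download the data files, here is a small sketch of how you might poke at the text side of MELD with pandas. The column names are my assumption about the released CSVs; check the actual files for the exact schema:

```python
# Exploring the text annotations in MELD with pandas.
# Column names below are assumptions; check the released CSVs for the exact schema.
import pandas as pd

meld = pd.read_csv("train_sent_emo.csv")

# Emotion label distribution across utterances.
print(meld["Emotion"].value_counts())

# How many distinct speakers take part in each dialogue (multi-party conversations).
speakers_per_dialogue = meld.groupby("Dialogue_ID")["Speaker"].nunique()
print(speakers_per_dialogue.describe())
```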

Principles of AI?

Interested in AI ethics? Rachel Thomas published a primer on how to get started with ethics in AI. It basically contains a list of resources, curricula, courses, etc. I am working on something similar, but with a focus on categorizing resources by topic. I will announce this later on the dair.ai publication.

Eloquent.ai has also recently published a nice little blog post detailing some of the principles they follow in building their conversational AI technology. These include the following (a quick sketch of the first and fourth principles in action follows the list):

1. Evaluate AI systems on unseen data
2. More data leads to better models
3. An ounce of clean data is worth a pound of dirty data
4. Start with stupid baselines
5. AI isn’t magic
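The first and fourth principles are easy to act on from day one. Here is a minimal sketch of holding out unseen data and checking a "stupid" majority-class baseline before trying anything fancier; the 20 Newsgroups data is only a stand-in for whatever task you actually care about:

```python
# Principle 1: evaluate on data the model has never seen.
# Principle 4: always check a "stupid" baseline first.
# The dataset below is a stand-in for whatever task you care about.
from sklearn.datasets import fetch_20newsgroups
from sklearn.dummy import DummyClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

data = fetch_20newsgroups(subset="train", categories=["sci.space", "rec.autos"])
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=0
)

vec = TfidfVectorizer()
Xtr = vec.fit_transform(X_train)
Xte = vec.transform(X_test)

baseline = DummyClassifier(strategy="most_frequent").fit(Xtr, y_train)
print("majority-class baseline:", baseline.score(Xte, y_test))

model = LogisticRegression(max_iter=1000).fit(Xtr, y_train)
print("simple model, unseen data:", model.score(Xte, y_test))
```

If your fancy model cannot clearly beat the dummy baseline on held-out data, that tells you something important before you ship anything.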

Fast.ai’s new machine learning course

If you missed it, Jeremy Howard from fast.ai is now offering a new course in applied machine learning. Be sure to check that out here.

Comparing models that learn different representations

Sometimes researchers, or even curious developers, build a model to understand something and then compare it with other models. For instance, we might build an LSTM-based model and then compare it against a CNN-based model, or something of that nature. Since these two models learn very different representations, I sometimes wonder if this is the best setup for a fair comparison. For now, we just compare them directly without any consideration of the intrinsic ways these models learn. Perhaps there is a paper that discusses this that I am not aware of; I would appreciate it a lot if someone could point me to such a read.

I know that there are various papers that compare different learning mechanisms across different types of tasks, and I totally get why they do so. I would take it a step further and find better ways to compare models that learn representations with similar learning mechanisms. I know that I am focusing heavily on the learning mechanism and not on the actual task, but I think that, as responsible researchers, we should pay closer attention to how we evaluate the results of our neural-based models.

Explainable and interpretable AI for healthcare

If we are letting computers make decisions about things related to healthcare, we had better understand how these algorithms arrive at those decisions. In this webinar, several experts explain how to choose the right interpretable machine learning algorithm for healthcare problems. They also teach how to properly deploy these types of machine learning models. Even if you are not interested in the healthcare aspect of the presentation, you can still learn a lot about explainable AI and why it is an important area of research in both academia and industry.
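As a simple illustration of what "interpretable" can mean in practice, here is a sketch where a logistic regression's coefficients show which features push a prediction towards the positive class. The built-in breast cancer dataset is used only as a convenient stand-in for a real clinical task, and this is just one of many interpretable options the webinar covers in more depth:

```python
# An interpretable model for a clinical-style task: logistic regression whose
# coefficients can be inspected feature by feature. The built-in breast cancer
# dataset is only a stand-in for a real healthcare problem.
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=0
)

scaler = StandardScaler().fit(X_train)
clf = LogisticRegression(max_iter=1000).fit(scaler.transform(X_train), y_train)

print("held-out accuracy:", clf.score(scaler.transform(X_test), y_test))

# Which (standardized) features most influence the prediction, and in which direction.
coefs = pd.Series(clf.coef_[0], index=data.feature_names).sort_values()
print(coefs.head(5))   # strongest negative contributions
print(coefs.tail(5))   # strongest positive contributions
```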

Motivation scenarios do matter in ML papers

I have been reading a ton lately and have also been reviewing a ton of papers (just one of the perks of being a PhD student). What I have noticed, in general, is that I instantly appreciate papers that properly lay out a clear motivation with social impact. In the past, I have been telling students that motivation is not just about advancing the state of the art; it is also about innovation and creativity and how the work helps to inspire growth in society, whether that means helping disabled people or helping the elderly. We need to think about these things when we are researching topics related to AI. It also helps to inspire you to become a better and more responsible researcher.

A skeptic’s guide to thinking about AI

In an article published in Fast Company, several pieces of advice are shared on how to study AI and cut through the hype. Insights include:

  • AI is not neutral
  • “AI” usually relies a lot on low-paid human labor
  • Don’t just talk about ethics, think about human rights

Best practices in machine learning

There is a lot of content out there that you can use to learn about machine learning, from Andrew Ng's classic machine learning course to the new fast.ai machine learning course for developers. But something that has always been important, and not so well discussed, is best practices for machine learning. This short report does an amazing job of laying out some of the best practices you should consider when building machine learning models, including what to check before shipping machine learning models to production.

Machine learning code at Deep Learning Indaba

In case you missed it, here is a repository containing all of the practical machine learning material taught at this year's Deep Learning Indaba, held at Stellenbosch University, South Africa. It teaches how to implement CNNs, LSTMs, and reinforcement learning algorithms using TensorFlow.

The Importance of NIPS for Diverse Researchers

Check out this great article published by Sebastian Anaya on the topic of diversity and inclusion in AI and why conferences like NIPS are leading the way in this effort.

CVPR 2018 GAN tutorials

In this video, you can find a list of tutorials on generative adversarial networks (GANs) given at CVPR 2018. Notable speakers include Ian Goodfellow and Alexei A. Efros. I mostly liked the part about generating art with GANs.

Requesting volunteers for dair.ai

I am hoping to convince some of you of the importance of the AI Diary, as I discussed in the first entry of the series. I would like to invite you to volunteer as a co-editor of AI Diary entries. The idea is that we can put together our different perspectives on AI-related topics and issues. I think this is important for our community. Plus, we can leverage platforms like Medium that already give us the tools we need to collaborate on ideas regardless of where we are in the world. Please reach out to me on Twitter or comment below if you are interested in being part of this initiative.

#100DaysofNLP

I have noticed a lot of machine learning students and young researchers using the tag #100DaysofMLCode on Twitter. Inspired by their cause, I have started my own hashtag for NLP: #100DaysofNLP. Challenge yourself every day to do something different related to NLP. I would suggest reading a paper or coding something, even if it is tiny, and talking about your experience on Twitter with the hashtag #100DaysofNLP. I hope to see some of the ideas and research you are working on.

One last quick thing: any sort of engagement (like follows, shares, 👏👏, and feedback) will make a huge difference for the future and sustainability of the dair.ai publication. So I will deeply appreciate any of that in advance.

— with 💚 from Elvis
