MLaaS, DON, The Joy of AI, Jupytext, AI Next, MTNT, GAN Lab, What-If Tool, Dataset Search,…

elvis
Published in DAIR.AI
5 min read · Sep 11, 2018

Great day awesome people and welcome to the 29th Issue of the NLP Newsletter! I am Elvis from Belize, Editor of DAIR.ai, and a PhD researcher in AI and NLP. Here is this week’s notable NLP news: a new machine translation benchmark dataset; AI in health, ethics, and contextual reasoning; reducing gender bias in datasets; several efforts to improve image captioning systems; how and why language models are rapidly advancing text analysis, and much more.

On People…

The organ donation process is now being heavily automated using AI technologies. A recent article reported that thousands of living kidney donors are already being identified by these algorithms. Another process that could benefit from AI decision-making tools is deciding the waiting list for kidney transplants — link

DARPA is said to be investing up to $2 billion in the AI Next initiative, which aims to give machines contextual reasoning and problem-solving capabilities — link

A group of researchers from the University of California has developed a novel method that preserves gender information in word vectors while “compelling other dimensions to be free of gender influence.” Their paper is entitled “Learning Gender-Neutral Word Embeddings” and aims to reduce gender bias present in language datasets — link

The BBC recently hosted a program in which AI ethics and morality were discussed with AI experts from all over the world. Contributors to the program include Mustafa Suleyman and Nick Bostrom, among others — link

A NY Times article discusses the potential of AI to help people communicate and connect again — link

On Education and Research…

Text analysis is advancing rapidly thanks to recent unsupervised methods used to train language models on unlabeled language data. fast.ai, OpenAI, and the Allen Institute for AI are at the forefront of this type of technology, which is already capable of performing complex NLP tasks such as language comprehension and sentiment analysis — link

Luis Serrano releases a video lecture on matrix factorization and how this technique can be used for Netflix movie recommendations — link

A new paper presents an NLP method to combine disparate resources and acquire accurate information about health providers — link

MIT’s CSAIL team has developed a system named Dense Object Nets (DON), which is able to generate 3D visualizations and descriptions for objects it has never seen before — link

Lecture material is available for Sebastian Raschka’s new ML course given at the University of Wisconsin-Madison — link

A paper accepted at EMNLP 2018 presents a new benchmark dataset for machine translation of noisy text (MTNT), consisting of Reddit comments and professionally sourced translations. It differs from previous datasets, which are mostly synthetically generated — link

Learn more about tools, such as iris.ai and Dimensions.ai, that want to make it possible to search for scholarly text using modern NLP and ML techniques — link

A new paper proposes “Hierarchical CVAE for Fine-Grained Hate Speech Classification”, a method for understanding hate speech across 40 hate groups and 13 different hate categories — link

Here is the line-up for this year’s Bayesian deep learning workshop hosted at the NIPS conference. The theme for this year is “deep learning uncertainty in real-world application” — link

Google AI releases the What-If Tool, a new TensorBoard feature to help users better understand their machine learning models without writing code — link

On Code and Data…

Google releases Dataset Search, a platform for quickly finding open datasets that have been uploaded to public sites such as personal websites and university profiles — link

Yandex School of Data Analysis (YSDA) releases material for their new NLP course (GitHub repo) — link

Jupytext is a Jupyter plugin that reads and writes notebooks as plain text files. It supports languages such as R, Julia, Python, and Markdown, among others — link

Google releases a new image captioning dataset called Conceptual Captions. This dataset was released as part of a research paper presented at ACL 2018 — link

Microsoft releases a speech corpus dataset for Indian languages to help researchers build better speech-based technologies — link

Learn everything you need to know about Google Colaboratory and how to get started in this guide written by dair.ai — link

In an effort to build more representative ML models and promote inclusiveness in AI, Google AI announces the Inclusive Images Competition on Kaggle. The challenge is to use the Open Images dataset to build robust image captioning tools that work even for images containing underrepresented groups — link

GAN Lab is a visualization tool built on TensorFlow.js that teaches how GANs work and learn — link

On Industry…

The HR Technology conference will host a number of AI-based human resources (HR) technologies from big companies such as Google and IBM. Technologies that can automate hiring have been in great demand recently, and the race is on to build the most powerful and intelligent conversational bots that handle HR tasks using machine learning and NLP — link

Learn about the data science behind the recommendation system used by Feedly — link

Learn why machine learning as a service (MLaaS) is the next phase of machine learning and how cloud providers are aiming to integrate ML services into their infrastructure and make it easier for other companies and services to adopt ML in their own businesses — link

A recent article discusses why AI-assisted image and video search is the next frontier. Find out how companies like Panopto and Google are using NLP and machine learning to build applications with powerful cataloging and search capabilities — link

Worthy Mentions…

The Joy of AI is an episode aired by BBC where AI experts discuss how AI is changing our world and challenging our ideas of intelligence and consciousness — link

dair.ai releases a new article on how to take your data science writing to the next level — link

In this article you can learn the differences between virtual assistants and chatbots — link

An online TensorFlow handbook is available in both English and Chinese; it is based on eager execution to help developers get started as quickly as possible with TensorFlow — link

If you spot any errors or inaccuracies in this newsletter please comment below.
