15 curated AI reads for July 2018

July 3rd, 2018

Enrique Herreros
xplore.ai
7 min read · Jul 2, 2018

News about Machine Learning (ML), Artificial Intelligence (AI), Data Science (DS) and related advanced analytics areas.

Photo by Crew on Unsplash

Welcome to xplore.ai’s first post of the 15 curated AI reads monthly series. The objective of this series is to provide readers with a curated list of the most interesting news, publications and tools that our team ran into during the previous month.

15. ✏️ TensorEditor

TensorEditor Beta screenshot

TensorEditor is a visual tool that allows even non-expert users to create their own TensorFlow models with just a few clicks. The user can tweak any parameter of the neural network. Once the model is ready, the tool generates the corresponding Python code, so you can integrate it anywhere and run predictions with it.

14. 📈 Probability Paradox Notebook

Monty Hall and its paradox

Although this notebook is already more than two years old, we ran into it this month and felt we had to share it with you. “Probability, Paradox, and the Reasonable Person Principle” holds a collection of 10 neatly explained solutions in Python.

http://nbviewer.jupyter.org/url/norvig.com/ipython/ProbabilityParadox.ipynb
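To give a flavor of the kind of problem the notebook works through, here is a minimal sketch of the Monty Hall paradox mentioned in the caption above. It is our own illustration, not code from the notebook, and the function name `win_probability` is hypothetical. It enumerates every equally likely combination of car position and first pick, and splits the weight evenly when the host has two doors to choose from.

```python
from fractions import Fraction


def win_probability(switch: bool) -> Fraction:
    """Exact probability of winning the car with a 'switch' or 'stay' strategy."""
    doors = range(3)
    total = Fraction(0)
    for car in doors:
        for pick in doors:
            # The host opens a door that hides a goat and is not the player's pick.
            openable = [d for d in doors if d != pick and d != car]
            for opened in openable:
                # Each (car, pick) pair has probability 1/9; the host picks
                # uniformly among the doors he is allowed to open.
                weight = Fraction(1, 9) / len(openable)
                if switch:
                    final = next(d for d in doors if d not in (pick, opened))
                else:
                    final = pick
                if final == car:
                    total += weight
    return total


print("stay:  ", win_probability(switch=False))
print("switch:", win_probability(switch=True))
```

Using exact fractions instead of a random simulation keeps the result reproducible and makes the two strategies directly comparable.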

13. 🔎 A unified Python approach to explain the output of any machine learning model, rendered in JS

Lately we see a growing trend of putting more effort into explaining ML and DL models. This is a good market sign, since it means these models have made it to production and many stakeholders need to understand a model’s output to make further decisions. We believe that gathering insights at every step of the Data Science pipeline is vital. Hence, at xplore.ai, we are trying out this module in our current projects.

12. 🤼 Dense Human Pose Estimation In The Wild

Facebook’s Research department has open-sourced a real-time approach for mapping all human pixels of 2D RGB images to a 3D surface-based model of the body. They have also open-sourced the training data used: 50K manually annotated COCO images. The paper also discusses how region-based models outperform “classic” Convolutional Neural Networks. FB used its own Deep Learning framework, Caffe2.

11. 🗣 Deep Learning for Conversational AI

PolyAI has shared an impressive 187-page slide deck that presents the current status of Conversational AI. It analyzes recent research trends in deep learning for conversational AI and also offers an industry-based perspective on current deep conversational AI. It seems to be a good moment to explore new scalable ways to generate impact in this huge industry.

https://www.poly-ai.com/docs/naacl18.pdf

10. 💢 SNIPER: Efficient Multi-Scale Training

Animation from SNIPER’s repository

SNIPER is an efficient multi-scale training approach for instance-level recognition tasks like object detection and instance-level segmentation. Instead of processing all pixels in an image pyramid, SNIPER selectively processes context regions around the ground-truth objects.

9. 📑 Papers with Code

This magnifique webpage shares links to research papers and their code implementations on GitHub. It even tells you the number of GitHub stars and which ML framework was used for the implementation. Don’t forget to subscribe to their weekly digest.

8. 📜 Learned in translation: contextualized word vectors

Plot by Salesforce research team

A very accurate and to-the-point explanation by the research team at Salesforce. They give a very good intro to word vectors and then explain how different techniques such as encoders, hidden vectors, attention, generation, etc. can help in NLP. Then, different NLP tasks are introduced: Machine Translation, Question Answering, Classification, Inference, Entailment, etc. They found that combining the information from GloVe, CoVe, and character vectors significantly boosts the performance of their baseline models on a variety of NLP tasks.

7. 👨‍⚖️ Alibaba’s AI Lawyer Assistant

This study clearly shows that AI can be applied successfully in the legal domain and that, when dealing with large volumes of legal documents, it can even be more accurate than professional lawyers. Eight lawyers compared 600 online legal agreements alongside the AI. The AI reached higher levels of accuracy in a millionth of the time.

6. 🍎🍏 Detecting image similarity at Pinterest

Image from Pinterest Engineering Medium blog

In this post, the Pinterest Engineering team explains how they used batched Locality-Sensitive Hashing on top of Spark and TensorFlow to detect near-duplicate images. A wide range of smart tricks is applied to avoid having to do 10*10¹⁶ possible comparisons.
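To illustrate why hashing avoids comparing every pair, here is a minimal sketch of Locality-Sensitive Hashing. It is not Pinterest's pipeline: it assumes the random-hyperplane variant of LSH, uses toy 3-dimensional vectors in place of image embeddings, and all function names are hypothetical. Vectors pointing in similar directions tend to get the same bit signature, so only items that land in the same bucket need a full comparison.

```python
import random
from collections import defaultdict
from itertools import combinations


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw one random hyperplane (normal vector) per signature bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_bits)]


def signature(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        int(sum(v * h for v, h in zip(vector, plane)) >= 0)
        for plane in hyperplanes
    )


def candidate_pairs(vectors, hyperplanes):
    """Group items by signature and return only the pairs that share a bucket."""
    buckets = defaultdict(list)
    for name, vector in vectors.items():
        buckets[signature(vector, hyperplanes)].append(name)
    pairs = set()
    for names in buckets.values():
        for a, b in combinations(sorted(names), 2):
            pairs.add((a, b))
    return pairs


planes = make_hyperplanes(num_bits=8, dim=3, seed=42)
embeddings = {
    "photo": [1.0, 2.0, 3.0],
    "photo_brighter": [2.0, 4.0, 6.0],   # same direction, different scale
    "inverted": [-1.0, -2.0, -3.0],      # opposite direction
}
print(candidate_pairs(embeddings, planes))
```

In practice several hash tables are combined to trade recall against the number of candidates; this sketch uses a single table to keep the idea visible.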

5. 🏎️ 100x faster NLP in Python

By mixing Cython and spaCy’s internal data structures, Thomas Wolf presents different techniques for building high-speed NLP algorithms with the help of profiling and some C goodness.

4. 💅 Understanding Latent Style

Stitch Fix visual tool

Stitch Fix is known, among other things, for its attention to detail, care for design and love for data. In this short post, Erin and Jana give an introduction to the tools they used to understand styles: PyTorch, Matrix Factorization, PCA and Orthogonal Procrustes Matrix Approximation. With those tools, they were able to build a latent style space where each item can be represented by a numerical vector. This reminded me of Zalando Research’s Fashion DNA paper.

3. 🦅🐊🐅🦍 Model Zoo

CycleGAN

This website allows its users to discover open source deep learning code and pre-trained models. Creating and training models is hard and expensive, especially because it depends on the oil of the 21st century: data. In an open source era it makes sense to first pay attention to what others have done and learned regarding the problem you are facing. Good initiative!

https://modelzoo.co/

2. Speech Recognition with TensorFlow

MathWorks Spectrogram plot

Implementation of a seq2seq model for Speech Recognition using TF. The architecture is similar to “Listen, Attend and Spell”. The model uses pyramidal bidirectional LSTMs in the encoder, which reduce the time resolution and improve performance on longer sequences.
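The time-resolution reduction can be pictured with a small sketch. This is not the repository's TensorFlow code: it only shows the reshaping idea behind a pyramidal layer as described in “Listen, Attend and Spell”, where each pair of consecutive frames is concatenated before being fed to the next layer. The LSTM itself is left out, and `pyramid_reduce` as well as the choice to drop a trailing odd frame are our assumptions.

```python
def pyramid_reduce(frames):
    """Concatenate each pair of consecutive frames, halving the sequence length."""
    if len(frames) % 2:
        # Assumption: drop the trailing frame when the length is odd.
        frames = frames[:-1]
    return [frames[i] + frames[i + 1] for i in range(0, len(frames), 2)]


# Eight toy frames with two features each (stand-ins for spectrogram slices).
sequence = [[float(t), float(t) + 0.5] for t in range(8)]

for layer in range(3):
    sequence = pyramid_reduce(sequence)
    print(f"after layer {layer + 1}: {len(sequence)} steps, "
          f"{len(sequence[0])} features per step")
```

Each layer halves the number of time steps and doubles the feature size, so after three layers the decoder attends over a sequence eight times shorter than the input.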

1. TensorFlow: The Confusing Parts (1)

Part 1 of a series of posts that will help readers get a better intuition for what TensorFlow is, how it works, and how to use it. Although the concepts presented in this tutorial are fundamental to all TensorFlow programs, it is pretty easy to end up disliking TF after having to learn it the hard way. Thus, I would recommend this read to any novice to intermediate user of TensorFlow.

And that is it for what we found interesting in June. At xplore.ai, we are always trying out the latest tools, experimenting with cutting-edge algorithms and reading about the latest trends in every industry where data is generating unprecedented value.

If you liked the article, please clap and subscribe. You can also check out the other articles in our xplore.ai blog publication and follow us on LinkedIn and Twitter. We hope you have a great month ahead!

