The Best of AI: New Articles Published This Month (April 2018)

10 data articles handpicked by the Sicara team, just for you

Published in Sicara's blog, May 15, 2018

Welcome to the April edition of our best and favorite articles in AI that were published this month. We are a Paris-based company that does Agile data development. This month, we spotted articles about ethics in AI, the benefits of machine learning in hard science and mastering the latest data science tools. We advise you to have a Python environment ready if you want to follow some tutorials :). Let’s kick off with the comic of the month:

The Python environmental protection agency wants to seal it in a cement chamber, with pictorial messages to future civilizations warning them about the danger of using sudo to install random Python packages.

1 — Attention is all you need

We begin this journey through our favorite articles of the month by reading one of the most influential papers in natural language processing (NLP).

Papers can be intimidating to read. So I liked the author’s idea of combining screenshots from the original paper, explanations in plain English and code snippets with an actual implementation of the paper in Python.

The article can be read at different levels: from the presentation of encoders, decoders and the attention function to real-world examples including the use of regularization and GPU training.
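
To give a taste of the attention function mentioned above, here is a minimal sketch of scaled dot-product attention, softmax(QKᵀ / √d_k)V. It uses NumPy rather than the framework used in the article, and the function name, array shapes and random inputs are assumptions made for illustration only.

```python
import numpy as np


def scaled_dot_product_attention(query, key, value):
    """Return (output, weights) for softmax(Q K^T / sqrt(d_k)) V."""
    d_k = query.shape[-1]
    # Similarity of every query with every key, scaled by sqrt(d_k).
    scores = query @ key.T / np.sqrt(d_k)
    # Subtract the row-wise maximum for numerical stability of the softmax.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted average of the value rows.
    return weights @ value, weights


# Hypothetical toy inputs: 3 queries, 5 key/value pairs.
rng = np.random.default_rng(0)
q = rng.normal(size=(3, 4))
k = rng.normal(size=(5, 4))
v = rng.normal(size=(5, 2))

output, weights = scaled_dot_product_attention(q, k, v)
print(output.shape)  # (3, 2)
```

Each row of `weights` sums to 1, so every output row is simply a weighted average of the rows of `v`.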

Read The Annotated Transformer — from Harvard NLP

2— Learn data science as if you were in Berkeley

Berkeley, one of California’s world-famous universities, is launching its new data science program. If you want to catch up on the foundations of data science, the three-part course sounds like a promising start.

The courses focus on the basics of programming and statistics before tackling the problem of making predictions using machine learning tools.

Read The campus news — from UC Berkeley

3— Pandas is the new Excel

Every proper Jupyter notebook begins with the import of Pandas and NumPy, as they are great tools to easily store in memory, manipulate and visualize data in Python.

But are you really as proficient with these tools as with WYSIWYG, Excel-like software when it comes to cleaning a new dataset? I was not. And this tutorial walked me through the main steps of cleaning data using Python.
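
To illustrate the kind of cleaning steps involved, here is a minimal sketch using Pandas. The tiny dataset and its column names are hypothetical and are not taken from the tutorial; the steps shown (dropping an unused column, removing rows without a title, stripping whitespace and extracting a four-digit year) are only one plausible way to proceed.

```python
import pandas as pd

# Hypothetical messy data, invented for illustration.
raw = pd.DataFrame({
    "title": ["  Hamlet ", "Emma", None],
    "year": ["1603", "[1815]", "1851"],
    "notes": ["", "", ""],
})

clean = (
    raw.drop(columns=["notes"])        # drop a column we do not need
       .dropna(subset=["title"])       # remove rows without a title
       .assign(
           # strip stray whitespace around titles
           title=lambda df: df["title"].str.strip(),
           # keep only the four-digit year and convert it to a number
           year=lambda df: pd.to_numeric(
               df["year"].str.extract(r"(\d{4})", expand=False)
           ),
       )
       .reset_index(drop=True)
)

print(clean)
```

Chaining the steps keeps the original `raw` frame untouched, which makes it easy to rerun the cleaning after adjusting one step.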

Read Pythonic Data Cleaning With NumPy and Pandas — from Malay Agarwal

4—Should you use p-values?

In this blog post, I liked the author’s critical insight into one of the most widely used statistical tools for rejecting a hypothesis: the p-value. In particular, he shows that, even when the p-value is below the well-known threshold of 5%, the actual false positive rate might still be much higher…

So it’s time for a short refresher on correct and incorrect interpretations of the p-value!
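
The gap between the 5% threshold and the actual false positive rate comes down to simple arithmetic. Here is a minimal sketch; the share of tested hypotheses that are really true (10%) and the statistical power (80%) are assumptions chosen for illustration, not figures taken from the blog post.

```python
def false_positive_share(prior_true, power, alpha=0.05):
    """Share of significant results that are actually false positives.

    prior_true: assumed fraction of tested hypotheses that are really true
    power:      assumed probability of detecting a true effect
    alpha:      significance threshold applied to the p-value
    """
    true_positives = prior_true * power
    false_positives = (1 - prior_true) * alpha
    return false_positives / (true_positives + false_positives)


# Hypothetical scenario: 10% of tested effects are real, power is 80%.
share = false_positive_share(prior_true=0.10, power=0.80)
print(f"{share:.0%}")  # 36%
```

Under these assumptions, more than a third of the "significant" results are false positives, even though every one of them passed the 5% threshold.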

Read Why I’ve lost faith in p values — from Steve Luck

5—The big short

At Sicara, I am used to end-to-end approaches as I solve my clients’ business problems by building production-ready applications from backend logic to frontend data visualization. This article provides an end-to-end tutorial on how to build a decision support tool for the stock market using R.

After presenting a simple analytical model, the author shows its implementation, the user interface and the deployment as a web application. Maybe, just like me, you’ll be inspired by this tutorial to build your next data proof of concept!

By the way, if you like web programming, we’re preparing a blog article on mixins in LoopBack, a Node.js framework. Don’t forget to follow us!

Read How to develop a stock market analytical tool using Shiny and R — from Sergey Malchevskiy

6—Not only Facebook can steal your data

Recent events put Facebook at the heart of the discussion on data protection. We are only beginning to realize the major role that web giants are playing in collecting our data.

However, third parties can also take advantage of the iconic Like button. The blog post explains two vulnerabilities that third parties can exploit to either retrieve extensive information from your Facebook profile or track you on the internet.

In the meantime, Richard Stallman, the bearded representative of the free software community, even proposes a more radical model to prevent companies from stealing our data…

Read No boundaries for Facebook data— from Freedom To Tinker

7— AI will save the world

When we think of artificial intelligence, we often imagine robots slowly taking over the world by stealing our jobs and mimicking our habits.

I’m sure you didn’t fall for this! However, it’s always good to remember how effective machine learning techniques can be in the hard sciences. Be it in the prevention of Alzheimer’s disease or in the discovery of new materials, AI seems to be of great help in driving science forward.

Read Artificial intelligence accelerates discovery of metallic glass — from Northwestern University

8— YOLO

Object detection has always been one of the main concerns of computer vision. It’s now a well-known topic. You Only Look Once (YOLO) is one of the canonical algorithms for achieving state-of-the-art object detection in photos.

This article covers implementing YOLO from scratch, going from the underlying theory to real-life use cases such as detecting objects in videos. It’s totally worth reading and the result is impressive!

Read How to implement a YOLO (v3) object detector from scratch in PyTorch — from Ayoosh Kathuria

9—Learning from a researcher

Maybe you wonder what an AI researcher’s typical day looks like? Maybe you are considering starting a PhD in the field? Or, as an engineer, you want to know which skills you should bet on in the area of AI?

Tom Silver from MIT shared his experience as an AI researcher, and all the struggles he describes strangely resonate with the endeavors of any data scientist and developer. When he speaks of the importance of data visualization or the pernicious effects of hype in computer science, I’m sure you will identify with this unusual self-portrait.

Read Lessons from My First Two Years of AI Research — from Tom Silver

10 — For a meaningful artificial intelligence in France and Europe

I live and work in Paris, which appears to be the epicenter of a thrilling and active AI scene. Even companies like DeepMind and Facebook chose Paris as the place to be for AI!

At the end of March in Paris, prominent researchers and actors in the field of AI gathered for the conference “AI for humanity”. This conference follows a six-month mission, led by the French mathematician and Fields Medal winner Cédric Villani.

Cédric Villani set out to build a strategy for the French government to develop AI in France and Europe. Notably, the report puts strong emphasis on the social and ethical issues raised by AI. Villani calls for a more ecological and energy-efficient AI. He points out the role of data scientists in making their black-box algorithms as fair and explainable as possible.

Read The report of “AI for humanity” — from Cédric Villani

We hope you’ve enjoyed our list of the best new articles in AI this month. Feel free to suggest additional articles or give us feedback in the comments; we’d love to hear from you! See you next month.

Read the March edition
Read the February edition
Read the January edition
Read the December edition

Read the original article on Sicara’s blog here.

Did you like this article? Feel free to comment, follow us, or contact me.

By the way, we published these articles on our blog in April

Build Your Own Cloud with Kubernetes and Some Raspberry Pi

Market Research Using Conjoint Analysis In R

This tutorial details what Conjoint Analysis is and provides an example in R to design your own market research survey.

How to Train your Own Model with NLTK and Stanford NER Tagger? (for English, French, German…)

This guide will show you how to implement NER tagging for non-English languages using NLTK and Stanford NER tagger.

How to Deploy a Serverless REST API in a Few minutes on AWS

Using the Serverless toolkit, Python 3.6 Lambda functions, S3 and Athena

Introduction to Deep Q-learning with SynapticJS & ConvNetJS

An application to Connect 4 game

Written by Pierre Marcenac