ICML 2018 — Weekend of a Data Scientist

Alexander Osipenko
Cindicator
Jul 20, 2018 · 3 min read

Weekend of a Data Scientist is a series of articles about cool stuff I care about. The idea is to spend a weekend learning something new, reading, and coding.

This week was pretty busy, so I read a bit less than my average. The plan is to catch up!

ICML 2018 has finished and the papers are now available. They are pure gold for Data Scientists out there, covering state-of-the-art approaches across many different topics.

I highly recommend going through it, at least to become familiar with the papers. The full list can be found here: https://icml.cc/Conferences/2018/Schedule?type=Poster

I just want to highlight several papers I’ve read already.

  1. Markov Modulated Gaussian Cox Processes for Semi-Stationary Intensity Modeling of Events Data http://proceedings.mlr.press/v80/kim18a.html
Figure: two event sequences from the synthetic Semi-Stn data. The X-axis is time. The top panel depicts two candidate intensity functions λ1(t) and λ2(t). The other two panels show two event sequences generated from the model: the curve indicates the realized intensity function, selected from the two candidates according to the underlying Markov process, while the generated events are marked as (red) crosses on the X-axis.

In short, the study proposes a Markov Modulated Poisson Process model that incorporates a Gaussian Cox Process, covering two important features: GP-based smooth intensity changes and major regime switches through a hidden Markov process. The paper shows two practical examples, applied to Kaggle’s football data and Italy’s earthquake data. The combined model showed superior performance on semi-stationary data, which is always challenging to model.
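To get a feel for the model, here is a minimal, hypothetical sketch (not the authors’ code): a two-state continuous-time Markov chain switches between two smooth intensity functions, and events are sampled by thinning. The sinusoidal intensities and the generator matrix `Q` are made-up stand-ins for the GP-based intensities in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-regime setup: each hidden Markov state selects its own
# intensity function lambda_k(t); simple sinusoids stand in for the
# smooth GP-style intensities of the paper.
def lam1(t):
    return 2.0 + 1.5 * np.sin(0.5 * t)   # regime 0: low, slowly varying

def lam2(t):
    return 8.0 + 2.0 * np.sin(2.0 * t)   # regime 1: high, fast varying

intensities = [lam1, lam2]
Q = np.array([[-0.1,  0.1],              # generator of the hidden CTMC
              [ 0.2, -0.2]])             # (regime switch rates)

def simulate(T=50.0, lam_max=10.0):
    """Sample events from a Markov-modulated, Cox-style process by thinning."""
    state, t = 0, 0.0
    events, switches = [], [(0.0, 0)]
    next_hold = rng.exponential(1.0 / -Q[state, state])  # time until next switch
    while t < T:
        t += rng.exponential(1.0 / lam_max)              # candidate event time
        while t > switches[-1][0] + next_hold:           # advance regime switches
            t_sw = switches[-1][0] + next_hold
            state = 1 - state
            switches.append((t_sw, state))
            next_hold = rng.exponential(1.0 / -Q[state, state])
        if t < T and rng.random() < intensities[state](t) / lam_max:
            events.append(t)             # accept with probability lambda(t)/lam_max
    return np.array(events), switches

events, switches = simulate()
print(f"{len(events)} events, {len(switches) - 1} regime switches")
```

Thinning only requires an upper bound `lam_max` on both candidate intensities, which is why it handles the regime switching so easily.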

2. Deep Reinforcement Learning in Continuous Action Spaces: a Case Study in the Game of Simulated Curling http://proceedings.mlr.press/v80/lee18b.html

Figure: the architecture of the policy-value network. As input, a feature map (Table 2 in the supplementary material) is produced from the state information. During the convolutions, the layers’ width and height are fixed at 32x32 (the discretized positions of the stones) without pooling. The details of the layers are described in Figure 2. The policy network and value network are trained as a unified network. The output of the policy network is the probability distribution over actions; the output of the value network is the probability distribution over the final scores [-8, 8].

I’m gathering experience with reinforcement learning, so I found this paper very interesting. The researchers combined supervised learning with reinforcement learning to learn game strategy, using kernel-based Monte Carlo tree search within a continuous action space. As a result, their model won an international digital curling competition (yes, this thing exists).
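As a rough illustration of the kernel idea (not the paper’s actual search implementation), here is a toy continuous-action bandit where the value of each tried action is shared with nearby actions through a Gaussian kernel, so the search never needs to discretize the action space. The reward function, bandwidth, and all parameters are invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up 1-D continuous "shot parameter" whose reward peaks at 0.3,
# standing in for a curling shot with noisy outcome.
def reward(a):
    return np.exp(-((a - 0.3) ** 2) / 0.01) + 0.1 * rng.normal()

def kernel(a, b, h=0.1):
    """Gaussian similarity between actions: nearby shots share information."""
    return np.exp(-((a - b) ** 2) / (2 * h * h))

def kernel_ucb(n_iters=300, c=0.5):
    """Kernel-regression UCB sketch over a continuous action space."""
    actions, rewards = [], []
    for i in range(n_iters):
        candidates = rng.uniform(0.0, 1.0, size=20)
        if not actions:
            a = candidates[0]
        else:
            A, R = np.array(actions), np.array(rewards)
            w = kernel(candidates[:, None], A[None, :])   # (20, n) weights
            n_eff = np.maximum(w.sum(axis=1), 1e-9)       # effective visit counts
            mean = (w * R).sum(axis=1) / n_eff            # kernel-smoothed value
            ucb = mean + c * np.sqrt(np.log(i + 1) / n_eff)
            a = candidates[np.argmax(ucb)]
        actions.append(a)
        rewards.append(reward(a))
    # return the action with the best kernel-smoothed value
    A, R = np.array(actions), np.array(rewards)
    w = kernel(A[:, None], A[None, :])
    smoothed = (w * R).sum(axis=1) / w.sum(axis=1)
    return A[np.argmax(smoothed)]

best = kernel_ucb()
print(f"best action: {best:.2f}")  # peaks near a = 0.3 by construction
```

The exploration bonus shrinks wherever many similar actions have already been tried, which is the kernel analogue of visit counts in discrete MCTS.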

3. Visualizing and Understanding Atari Agents http://proceedings.mlr.press/v80/greydanus18a.html

Figure: these agents do not attain human performance in the three Atari environments shown. We display the policy saliency in green here because it is easier to see against blue backgrounds. We omit the critic saliency. (a) In MsPacman, the agent should avoid the ghosts. Our agent is not tracking the red ghost, circled. (b) In Frostbite, the agent leaps between platforms. Our agent should attend to its destination platform, circled. Rather, it attends to the goal location at the top of the screen. (c) In Enduro, the agent should avoid other racers. Our agent should be tracking the blue racer, circled. Rather, it focuses on the distant mountains, presumably as a navigation anchor.

When I read the original paper on AI playing Atari games back in 2013, I was blown away! Q-learning has made huge steps since then: DeepMind built an AI that beat the world Go champion, and OpenAI built an AI that can play Dota (and will probably win against the current champions). Still, Q-learning was kind of a black box for me.
I found this paper very interesting because it breaks down the decision process used by the AI; the authors made very explicit visualizations. As a result, the paper brings solid explanations and visualizations that can make an Atari agent more understandable even for non-experts.
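The core of the paper’s perturbation-based saliency is easy to sketch: blur a small region of the input frame and measure how much the policy output changes. Below is a simplified NumPy version; the toy “policy” (which only looks at the top-left patch) and all sizes are illustrative assumptions, not the authors’ code.

```python
import numpy as np

def gaussian_blur(img, sigma=3.0):
    """Separable Gaussian blur with a small hand-rolled kernel (no SciPy)."""
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    out = np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 0, img)
    return np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 1, out)

def mask(center, size, sigma=5.0):
    """Soft circular mask around `center`, used to localize the perturbation."""
    y, x = np.mgrid[0:size[0], 0:size[1]]
    m = np.exp(-((y - center[0])**2 + (x - center[1])**2) / (2 * sigma**2))
    return m / m.max()

def saliency_map(policy, frame, stride=5):
    """S(i,j) = 0.5 * ||policy(frame) - policy(frame blurred at (i,j))||^2."""
    base = policy(frame)
    blurred = gaussian_blur(frame)
    H, W = frame.shape
    sal = np.zeros((H // stride, W // stride))
    for i in range(0, H, stride):
        for j in range(0, W, stride):
            m = mask((i, j), frame.shape)
            perturbed = frame * (1 - m) + blurred * m  # blend via the mask
            sal[i // stride, j // stride] = 0.5 * np.sum((policy(perturbed) - base) ** 2)
    return sal

# Toy stand-in for a trained agent: its "action scores" depend only on the
# mean of the top-left 10x10 patch, so saliency should concentrate there.
def toy_policy(f):
    z = f[:10, :10].mean()
    return np.array([z, 1 - z])

frame = np.random.default_rng(2).random((40, 40))
sal = saliency_map(toy_policy, frame)
print(sal.shape)
```

Because the perturbation only removes information locally, high saliency at a pixel means the policy was actually using that region, which is what makes the resulting maps readable for non-experts.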

Do you have your favourite papers from ICML 2018? Let me know and we can discuss them!

Previous articles:

  1. Weekend of a Data Scientist — July 13th 2018 — building MVP for Twitter sentiment analysis
  2. Weekend of a Data Scientist — July 6th 2018 — about Interpreting Model Predictions
  3. Weekend of a Data Scientist — May 25th 2018 — some interesting articles
  4. Podcasts for data scientist
