Simulated traffic patterns

[I] Introduction

Traffic jams are ubiquitous in modern cities, and these jams, unlike the orange-flavored ones you have for breakfast, often irritate the hell out of people who find themselves in one. There is huge interest in understanding why traffic jams happen and how to mitigate them, because they are a big problem both socially and economically. Workers stuck in endless traffic waste valuable time, which leads to a less productive workforce, and citizens trapped in their cars on their way around the city become frustrated and unhappy residents. Many factors contribute to traffic jams, but the general consensus is that the main culprits are the small, barely noticeable errors drivers make, which propagate and amplify over time. Imagine a smooth traffic flow on a highway when suddenly one of the drivers slows his car down just a little to turn on the radio. The sudden decrease in velocity causes the next driver to be cautious, as she should be, and slow her car down as well. These slowdowns propagate through traffic and are amplified by the irregular ways different drivers slow down and speed up, ultimately creating a traffic jam. …
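
The way one driver's small slowdown forces the cars behind to brake can be illustrated with a toy simulation. The sketch below uses rules in the style of the Nagel-Schreckenberg cellular automaton, which is an assumption made here for illustration and not necessarily the model used in this article. The function name `step` and the parameters `road_length`, `v_max` and `p_slow` are hypothetical.

```python
import random


def step(positions, speeds, road_length, v_max, p_slow, rng):
    """Advance every car by one tick on a circular single-lane road."""
    n = len(positions)
    # Sort car indices by position so each car can find the car ahead of it.
    order = sorted(range(n), key=lambda i: positions[i])
    new_speeds = list(speeds)
    for rank, i in enumerate(order):
        ahead = order[(rank + 1) % n]
        # Number of empty cells between this car and the car ahead.
        gap = (positions[ahead] - positions[i] - 1) % road_length
        v = min(speeds[i] + 1, v_max)  # try to accelerate
        v = min(v, gap)                # brake to avoid hitting the car ahead
        if v > 0 and rng.random() < p_slow:
            v -= 1                     # small random slowdown (the "radio" moment)
        new_speeds[i] = v
    new_positions = [(positions[i] + new_speeds[i]) % road_length for i in range(n)]
    return new_positions, new_speeds


rng = random.Random(0)
road_length, n_cars, v_max, p_slow = 100, 30, 5, 0.3  # illustrative values
positions = rng.sample(range(road_length), n_cars)
speeds = [0] * n_cars

for _ in range(50):
    positions, speeds = step(positions, speeds, road_length, v_max, p_slow, rng)

stopped = sum(1 for v in speeds if v == 0)
print(f"{stopped} of {n_cars} cars are standing still after 50 ticks")
```

Setting `p_slow` to 0 removes the random slowdowns, which makes it easy to compare how many cars end up standing still with and without them.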


Fashion MNIST

The code used to generate the results in this article can be found here.

Model Formulation

There are two main paradigms in probabilistic modeling: discriminative models and generative models. A discriminative model, like a logistic regressor, learns the conditional distribution of the label y given the observation x, or p(y|x), whereas a generative model usually attempts to recover the data generation process, or the reverse conditional p(x|y). Generative models can be used to “create” observations from some seed related to the observation, for example generating a sentence conditioned on a topic. Generative models come in many flavors, from probabilistic models that can be sampled to produce predictive data to MLE models that can be used to generate data from random noise. …
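
To make the two conditionals concrete, here is a minimal sketch on synthetic one-dimensional data. It fits a Gaussian p(x|y) per label, samples a new observation from it, and then applies Bayes' rule to recover p(y|x), the quantity a discriminative model would learn directly. The Gaussian assumption, the toy data, and all function names are hypothetical choices for illustration and are not taken from the article's code.

```python
import math
import random
import statistics


def fit_class_conditionals(xs, ys):
    """Estimate a Gaussian p(x|y) and a prior p(y) for each label."""
    params = {}
    for label in sorted(set(ys)):
        members = [x for x, y in zip(xs, ys) if y == label]
        params[label] = {
            "mean": statistics.fmean(members),
            "std": statistics.pstdev(members),
            "prior": len(members) / len(xs),
        }
    return params


def sample_observation(params, label, rng):
    """Generative use: draw a new x from the fitted p(x|y=label)."""
    p = params[label]
    return rng.gauss(p["mean"], p["std"])


def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))


def posterior(params, x):
    """Bayes' rule: p(y|x) is proportional to p(x|y) * p(y)."""
    joint = {
        label: gaussian_pdf(x, p["mean"], p["std"]) * p["prior"]
        for label, p in params.items()
    }
    total = sum(joint.values())
    return {label: value / total for label, value in joint.items()}


# Synthetic data: label 0 is centred at -2, label 1 at +2.
rng = random.Random(0)
xs = [rng.gauss(-2.0, 1.0) for _ in range(200)] + [rng.gauss(2.0, 1.0) for _ in range(200)]
ys = [0] * 200 + [1] * 200

params = fit_class_conditionals(xs, ys)
new_x = sample_observation(params, 1, rng)  # "create" an observation for label 1
probs = posterior(params, 2.0)              # p(y | x = 2.0)
print(new_x, probs)
```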


Credit: Marita Kavelashvili

This is a series on ensemble reinforcement learning, containing four articles:

Motivation

Ensemble learning is a method of combining multiple learning models, such as logistic regression and a naive Bayes classifier, into a single learner that performs inference on the data. Aggregating predictions from multiple models is most popular in classification, and entire classification schemes have been developed around this idea. The most notable examples of ensemble learning concern tree-based models (Hoeting, A. J., et al., 1999), such as random forests (i.e., growing classification trees and aggregating their predictions into a single inference); bagging (i.e., repeatedly drawing bootstrap samples from a set and averaging the predictions for all samples) or boosting (i.e. …


Popular Game Environments In Reinforcement Learning (source: OpenAI, ICLR 2019)

[I] RL: A Brief Introduction

Reinforcement learning is one of the main learning paradigms in machine learning, alongside supervised and unsupervised learning. Unlike supervised learning, where we have a fixed set of labels giving the true, or nearly true, values we want to approximate, reinforcement learning interacts with the environment to learn incrementally what the learner should or shouldn't do, through a signal commonly referred to as the reward. Unlike unsupervised learning, where we want to learn a representation of the input and extract underlying patterns, reinforcement learning aims to find a strategy for effectively navigating the environment it is trained in, known as the policy. These distinctions differentiate reinforcement learning from the other paradigms and also tell us where reinforcement learning is most useful. It is incremental, so when labels are not readily available for supervised learning, we can use reinforcement learning instead. …
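
The interaction loop described above (act, receive a reward, update incrementally, improve the policy) can be sketched on a two-armed bandit. The environment, the epsilon-greedy policy, and every name and constant below are assumptions chosen for illustration. They are not drawn from the article.

```python
import random

# Hypothetical environment: arm 1 pays out more often than arm 0.
REWARD_PROBABILITIES = [0.2, 0.8]


def pull(arm, rng):
    """Environment step: return a reward of 1.0 or 0.0 for the chosen arm."""
    return 1.0 if rng.random() < REWARD_PROBABILITIES[arm] else 0.0


def choose_arm(values, epsilon, rng):
    """Epsilon-greedy policy: mostly exploit the best estimate, sometimes explore."""
    if rng.random() < epsilon:
        return rng.randrange(len(values))
    return max(range(len(values)), key=lambda a: values[a])


rng = random.Random(0)
values = [0.0, 0.0]  # running estimate of each arm's average reward
counts = [0, 0]      # how many times each arm has been pulled

for _ in range(2000):
    arm = choose_arm(values, epsilon=0.1, rng=rng)
    reward = pull(arm, rng)
    counts[arm] += 1
    # Incremental mean update: no labels, only the reward signal.
    values[arm] += (reward - values[arm]) / counts[arm]

print("estimated values:", values)
print("pull counts:", counts)
```

No label ever tells the learner which arm is correct. The preference emerges only from the rewards it collects while interacting with the environment.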

Ngoc Minh Tu Nguyen
