It has been more than three years since I started the non-profit Houston Machine Learning meetup at the end of July 2016. It has definitely been an amazing experience organizing such a great local machine learning community. I often get asked, “Why do you organize this meetup?” Well, simply put, I want to make my weekends more meaningful: learning something new and meeting new friends. And actually, there was never a plan; things just gradually worked out by themselves. But still, there is a lot more to tell, and I would love to share this journey with you this Labor Day!

At the first meetup, on July 23, 2016, I gave a talk on “A tour of machine learning algorithms”, and everything kicked off from there. I was learning how to become a good presenter in ML.

Learning through presenting



Thanks to my employer PROS (NYSE: PRO, https://pros.com/), I got the opportunity to attend KDD again this year. My first KDD experience was in 2014, when I was a graduate student and finished 5th in the KDD Cup together with three other awesome Kagglers. At that time, I was more into the research tracks of clustering, data visualization and feature selection, which were closely related to my thesis. After working in industry, my focus has shifted to time series forecasting, recommender systems, explainable AI, etc. It is really interesting to see how KDD has evolved over these years. In 2014, deep learning was…


What makes a great Halloween candy? Let’s use some ML tricks to build our best Halloween candy and, most importantly, use TPOT (automated machine learning) and LIME (for explaining any complex ML model). The original code has been published on GitHub: https://github.com/YanXuHappygela/HalloweenCandy

Halloween Candies

The original data can be found at https://github.com/fivethirtyeight/data/tree/master/candy-power-ranking. It contains characteristics of 85 candies, such as whether the candy has chocolate or fruit flavoring, and whether it is a hard candy or a bar. The target we are trying to predict is ‘winpercent’, the overall win percentage across 269,000 matchups.
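As a rough idea of the setup, here is a minimal sketch of loading that CSV and letting TPOT search for a regression pipeline. The file name, column names and TPOT settings below are assumptions for illustration, not the exact code from the original notebook:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from tpot import TPOTRegressor

# Load the FiveThirtyEight candy data (85 candies, 'winpercent' is the target)
candy = pd.read_csv("candy-data.csv")
X = candy.drop(columns=["competitorname", "winpercent"])
y = candy["winpercent"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Let TPOT evolve a pipeline; a small search budget is used here for illustration
tpot = TPOTRegressor(generations=5, population_size=20,
                     random_state=42, verbosity=2)
tpot.fit(X_train, y_train)
print(tpot.score(X_test, y_test))       # negative mean squared error by default
tpot.export("best_candy_pipeline.py")   # export the winning pipeline as Python code
```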

We added two features to…
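And on the explanation side, a minimal sketch of how LIME can be called on top of the pipeline TPOT found; it reuses the hypothetical variables from the sketch above and is not the original notebook’s code:

```python
from lime.lime_tabular import LimeTabularExplainer

# Explain a single candy's predicted winpercent with LIME
explainer = LimeTabularExplainer(
    X_train.values,
    feature_names=list(X_train.columns),
    mode="regression",
)
exp = explainer.explain_instance(
    X_test.values[0],                 # one candy from the test set
    tpot.fitted_pipeline_.predict,    # the pipeline found by TPOT above
    num_features=5,
)
print(exp.as_list())                  # top local feature contributions
```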


Recurrent neural networks (RNNs) are gaining popularity in sequence modeling tasks such as natural language processing and time series forecasting. They are not as straightforward as convolutional neural networks, though, and can be difficult to fully understand. Through reading blogs and taking Coursera courses (the Deep Learning Specialization), I have come up with four figures that will hopefully help you understand RNNs better. The figures in this blog are built upon the figures from Colah’s blog: http://colah.github.io/posts/2015-08-Understanding-LSTMs/

The goal of this blog is to answer four main questions about RNNs (a small code sketch follows the list):

  1. What is an RNN unit/cell?
  2. What is a gate?
  3. How data dimensions are transformed…
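As a rough companion to these questions, here is a minimal NumPy sketch of a single LSTM step, following the equations in Colah’s blog. The layer sizes are made up for illustration; it shows the cell, its gates, and how the input and hidden dimensions combine:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy dimensions: input vector of size 3, hidden state of size 4
n_x, n_h = 3, 4
rng = np.random.default_rng(0)

# One weight matrix per gate, each acting on [h_prev; x] (size n_h + n_x)
W_f, W_i, W_o, W_c = (rng.normal(size=(n_h, n_h + n_x)) for _ in range(4))
b_f = b_i = b_o = b_c = np.zeros(n_h)

def lstm_step(x, h_prev, c_prev):
    concat = np.concatenate([h_prev, x])      # shape (n_h + n_x,)
    f = sigmoid(W_f @ concat + b_f)           # forget gate, shape (n_h,)
    i = sigmoid(W_i @ concat + b_i)           # input gate
    o = sigmoid(W_o @ concat + b_o)           # output gate
    c_tilde = np.tanh(W_c @ concat + b_c)     # candidate cell state
    c = f * c_prev + i * c_tilde              # new cell state, shape (n_h,)
    h = o * np.tanh(c)                        # new hidden state, shape (n_h,)
    return h, c

h, c = np.zeros(n_h), np.zeros(n_h)
for x in rng.normal(size=(5, n_x)):           # a sequence of 5 time steps
    h, c = lstm_step(x, h, c)                 # the same cell is reused at every step
print(h.shape, c.shape)                       # (4,) (4,)
```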

Yan Xu

PhD, Machine learning scientist, Data artist and Pianist
