Last week, I set up a short pipeline consisting of a Python script which received updates from the Telegram API and logged them, Filebeat to ingest the output into Elasticsearch, and a Kibana dashboard to visualise the data. Here’s what the dashboard looks like:
I’ve decided to call this project Telegrammetry — combining Telegram and telemetry — and it can be found on GitHub (in a very rough state right now):
Even before my chat-analytics project for WhatsApp more than 2 years ago, I had always thought there were a lot of interesting insights to be found in our conversation…
After my previous post on PBM and PNG images and ImageMagick, I was interested in doing some more image manipulation, and decided to explore ways to create PNG files by generating visualisations of elementary cellular automata. This post provides some background information about cellular automata before that.
A cellular automaton is a pattern of cells on a grid, in which the state of each cell changes over multiple time steps based on certain rules.
For example, Conway’s Game of Life is a cellular automaton which consists of cells on a two-dimensional square grid. The state of each cell…
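To make the idea of cells changing state by rule concrete, here is a minimal Python sketch of a one-dimensional elementary cellular automaton, which is simpler than the Game of Life. The function names `step` and `run`, the wrap-around edges, and the single live starting cell are all assumptions made for illustration.

```python
def step(cells, rule):
    """Return the next generation of an elementary cellular automaton.

    `cells` is a list of 0/1 states and `rule` is a rule number (0-255).
    Edges wrap around, which is an assumption of this sketch.
    """
    n = len(cells)
    new_cells = []
    for i in range(n):
        left = cells[(i - 1) % n]
        centre = cells[i]
        right = cells[(i + 1) % n]
        # The three neighbouring states form a 3-bit index into the rule number.
        index = (left << 2) | (centre << 1) | right
        new_cells.append((rule >> index) & 1)
    return new_cells


def run(rule, width, steps):
    """Start from a single live cell in the middle and collect every generation."""
    cells = [0] * width
    cells[width // 2] = 1
    rows = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        rows.append(cells)
    return rows


if __name__ == "__main__":
    # Print each generation as a row of text, one row per time step.
    for row in run(rule=90, width=31, steps=15):
        print("".join("#" if cell else "." for cell in row))
```

Each printed row is one time step, so reading from top to bottom shows how the pattern evolves.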
This post was originally published on my blog at https://blog.jiayu.co/2018/08/a-simple-https-reverse-proxy/. I don’t post to Medium as often nowadays, so follow me there if you like what I write! You can find an RSS feed at https://blog.jiayu.co/index.xml, or just check in every now and then! 🙂
Over the past decade, there has been a concerted push towards HTTPS across the internet. On many platforms, setting up HTTPS is usually a single click away or even automatic, in no small part thanks to the rise of Let’s Encrypt making certificates much more accessible. …
I’ve always been a fan of geometric patterns. When I was younger I used to draw Sierpinski triangles in class after a classmate (hi Marken) introduced them to me. After playing and designing maps for The Battle for Wesnoth, an open-source, turn-based strategy game which plays out on a hex grid, I was fascinated by how hexagons tessellate. I still want to buy this Ikea rug.
As a Singapore University of Technology and Design graduate, the university’s logo also intrigued me greatly:
According to the SUTD identity guidelines,
The simple, clean lines with no enclosures symbolize an open culture. The…
GitHub Gists are extremely handy for sharing self-contained chunks of information like code snippets, short scripts and even complete articles.
However, I recently found myself working behind a corporate proxy which blocked access to gists, probably because of data loss prevention concerns. As a result, when searching online for answers I would often come across promising-looking gists in my search results, but would be frustratingly unable to view their contents.
Recently, I downloaded some data from the FPL website and ingested it into a SQLite database so it could be queried and explored with SQL and potentially exported to other formats such as CSV afterwards.
Here are some things you can do with it:
An example query selecting the top point scorer from each team:
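A query of this kind might look like the following sketch. It does not use the real database: the `players` table, its `name`, `team` and `total_points` columns, and the placeholder rows are all hypothetical, so the actual schema will differ.

```python
import sqlite3

# Hypothetical schema and placeholder rows, for illustration only.
SCHEMA = "CREATE TABLE players (name TEXT, team TEXT, total_points INTEGER)"
SAMPLE_ROWS = [
    ("Player A", "Team X", 120),
    ("Player B", "Team X", 95),
    ("Player C", "Team Y", 80),
    ("Player D", "Team Y", 140),
]

# For each team, keep the player(s) whose points equal that team's maximum.
TOP_SCORER_PER_TEAM = """
SELECT p.team, p.name, p.total_points
FROM players AS p
WHERE p.total_points = (
    SELECT MAX(q.total_points) FROM players AS q WHERE q.team = p.team
)
ORDER BY p.team
"""


def make_sample_db():
    """Create an in-memory database holding the placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO players VALUES (?, ?, ?)", SAMPLE_ROWS)
    return conn


def top_scorers(conn):
    """Return (team, name, total_points) for the top scorer of each team."""
    return conn.execute(TOP_SCORER_PER_TEAM).fetchall()


if __name__ == "__main__":
    for team, name, points in top_scorers(make_sample_db()):
        print(team, name, points)
```

The correlated subquery returns every player tied for a team's highest score, so a team can appear more than once.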
Check out some more sample queries and results.
You can download the SQLite database file here or from GitHub together with the code used to create it:
Benford’s law is the observation that in many numerical datasets, the distribution of leading digits is not uniform — the first digit of any number in the dataset is much more likely to be a 1 than a 9 (30.1% vs 4.6% for numbers in base-10).
Here’s a plot for the expected frequencies for digits from 1 to 9 for base-10 numbers:
These expected frequencies can be calculated for any base with the formula log_b(1 + 1/d), where b represents the base and d is a digit in
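The formula log_b(1 + 1/d) is easy to check numerically. Here is a minimal Python sketch; the function names are my own for illustration, and it assumes d ranges over the non-zero digits 1 to b - 1.

```python
import math


def benford_frequency(d, base=10):
    """Expected frequency of leading digit d in the given base: log_b(1 + 1/d)."""
    # Assumption: valid leading digits are 1 .. base - 1 (a leading digit is never 0).
    if not 1 <= d < base:
        raise ValueError("d must satisfy 1 <= d < base")
    return math.log(1 + 1 / d, base)


def benford_distribution(base=10):
    """Map every possible leading digit in the base to its expected frequency."""
    return {d: benford_frequency(d, base) for d in range(1, base)}


if __name__ == "__main__":
    for digit, frequency in benford_distribution(10).items():
        print(f"{digit}: {frequency:.1%}")
```

For base 10 this gives about 30.1% for a leading 1 and about 4.6% for a leading 9, and the frequencies across all digits sum to 1.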
On Reddit or other forums, you’ll often come across spoiler tags. They’re used to discuss spoilers while protecting other readers who do not wish to be spoiled by requiring them to actively interact with the spoiler tag to view its contents.
Spoiler tags are conspicuously missing from most instant messaging applications, even though it seems just as easy to get accidentally spoiled by a single careless message in a group conversation:
Fortunately, in WhatsApp at least, we can emulate a spoiler tag by taking advantage of the way long messages are hidden behind a “Read more” fold:
In June last year, I started a blog on Medium and wrote my first post on the highly public Lee family feud over their family home which was taking place around that time. It was something I decided to write on a whim that day, after a friend showed me the news that morning.
My first blog post
Despite the provocative topic, it was really just a post about how to download comments from the Facebook Graph API and perform some sentiment analysis using the Google Cloud Natural Language API. Nevertheless, I was surprised by the reception my post received…
Previously, we generated some new sentences from a small pool of existing sentences. Now, we’ll generate some new comments based on all the existing comments in a Reddit thread!
This post is a continuation of “A Primer on Markov Chains”, which introduces Markov chains and how they can be used to generate text:
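As a quick refresher, generating text with a Markov chain can be sketched in a few lines of Python. This is a minimal word-level, first-order chain written for illustration, not necessarily the implementation from the primer. The function names and the sample sentences are assumptions.

```python
import random
from collections import defaultdict


def build_chain(sentences):
    """Map each word to the list of words that follow it in the input sentences."""
    chain = defaultdict(list)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def generate(chain, start, max_words=20, rng=random):
    """Walk the chain from `start`, picking a random successor at each step."""
    words = [start]
    # Stop at the length limit or when the last word has no known successor.
    while len(words) < max_words and words[-1] in chain:
        words.append(rng.choice(chain[words[-1]]))
    return " ".join(words)


if __name__ == "__main__":
    # Placeholder sentences, not real comments.
    sample = [
        "the cat sat on the mat",
        "the dog sat on the sofa",
        "the cat chased the dog",
    ]
    print(generate(build_chain(sample), start="the"))
```

Repeated successors are kept in each list, so more common word pairs are proportionally more likely to be chosen.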
After reading the daily r/singapore random discussion and small questions thread last week, I wondered if it would be possible to generate comments which looked like they came from there. Now, we’ll be doing exactly that with today’s thread:
We’ll use the Python Reddit API Wrapper…