Get insights into your Telegram activity

Last week, I set up a short pipeline comprising a Python script which received updates from the Telegram API and logged them, Filebeat to ingest the output into Elasticsearch, and a Kibana dashboard to visualise the data. Here’s what the dashboard looks like:

I’ve decided to call this project Telegrammetry — combining Telegram and telemetry — and it can be found on GitHub (in a very rough state right now):

Background

Even before my chat-analytics project for WhatsApp more than two years ago, I had always thought there were a lot of interesting insights that could be found in our conversation…


A primer on elementary cellular automata

After my previous post on PBM and PNG images and ImageMagick, I was interested in doing some more image manipulation, and decided to explore ways to create PNG files by generating visualisations of elementary cellular automata. This post provides some background information about cellular automata before that.

A cellular automaton is a pattern of cells on a grid, in which the state of each cell changes over multiple time steps based on certain rules.
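To make this concrete, here is a minimal Python sketch of an elementary cellular automaton: a single row of cells, each either 0 or 1, where a cell's next state depends only on itself and its two neighbours. It assumes the standard Wolfram rule numbering and treats cells beyond either edge as 0. The function names `step` and `run` are made up for this illustration.

```python
def step(cells, rule):
    """Apply one time step of an elementary cellular automaton.

    `rule` is a Wolfram rule number (0-255). Assumption: cells beyond
    the left and right edges of the row are always 0.
    """
    padded = [0] + list(cells) + [0]
    new_cells = []
    for i in range(1, len(padded) - 1):
        left, centre, right = padded[i - 1], padded[i], padded[i + 1]
        # Read the neighbourhood as a 3-bit number and look up that bit
        # of the rule number to get the cell's next state.
        index = (left << 2) | (centre << 1) | right
        new_cells.append((rule >> index) & 1)
    return new_cells


def run(cells, rule, steps):
    """Return the starting row followed by `steps` successive rows."""
    history = [list(cells)]
    for _ in range(steps):
        history.append(step(history[-1], rule))
    return history


if __name__ == "__main__":
    start = [0, 0, 0, 1, 0, 0, 0]
    for row in run(start, rule=90, steps=3):
        print("".join("#" if cell else "." for cell in row))
```

Each printed row is one time step, so stacking the rows gives the kind of picture that could later be written out as an image.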

For example, Conway’s Game of Life is a cellular automaton which consists of cells on a two-dimensional square grid. The state of each cell…


For quick and dirty HTTPS deployments

This post was originally published on my blog at https://blog.jiayu.co/2018/08/a-simple-https-reverse-proxy/. I don’t post to Medium as often nowadays, so follow me there if you like what I write! You can find an RSS feed at https://blog.jiayu.co/index.xml, or just check in every now and then! 🙂

Over the past decade, there has been a concerted push towards HTTPS across the internet. On many platforms, setting up HTTPS is usually a single click away or even automatic, in no small part thanks to the rise of Let’s Encrypt making certificates much more accessible. …


Playing with PBM files and ImageMagick

I’ve always been a fan of geometric patterns. When I was younger I used to draw Sierpinski triangles in class after a classmate (hi Marken) introduced them to me. I was fascinated by how hexagons tessellated after playing and designing maps for The Battle for Wesnoth, an open-source, turn-based strategy game which plays out on a hex grid. I still want to buy this Ikea rug.

The SUTD logo

As a Singapore University of Technology and Design graduate, the university’s logo also intrigued me greatly:

According to the SUTD identity guidelines,

The simple, clean lines with no enclosures symbolize an open culture. The…


In case your corporate proxy blocks gists

GitHub Gists are extremely handy for sharing self-contained chunks of information like code snippets, short scripts and even complete articles.

However, I recently found myself working behind a corporate proxy which blocked access to gists, probably because of data loss prevention concerns. As a result, when searching online for answers, I would often come across promising-looking gists in my search results, only to find myself frustratingly unable to view their contents.

The solution

After some investigation, I noticed that while gist.github.com was blocked, other GitHub domains such as gist.githubusercontent.com or api.github.com were not. …
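As a rough sketch of how that observation could be used (not necessarily the approach this post goes on to take), the Python below maps a gist page URL to the corresponding GitHub REST API URL and pulls the file contents out of the JSON response. The helper names are hypothetical, and it assumes the gist ID is the last path segment of the page URL.

```python
import json
import urllib.request
from urllib.parse import urlparse


def gist_api_url(gist_url):
    """Map a gist.github.com page URL to its api.github.com URL.

    Assumption: the gist ID is the last non-empty path segment.
    """
    parts = [p for p in urlparse(gist_url).path.split("/") if p]
    if not parts:
        raise ValueError(f"no gist ID found in {gist_url!r}")
    return f"https://api.github.com/gists/{parts[-1]}"


def extract_files(payload):
    """Return {filename: content} from a decoded gist API response."""
    files = payload.get("files", {})
    return {name: info.get("content", "") for name, info in files.items()}


def fetch_gist_files(gist_url):
    """Fetch a gist through the API domain instead of gist.github.com."""
    request = urllib.request.Request(
        gist_api_url(gist_url),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request) as response:
        return extract_files(json.load(response))
```

Calling `fetch_gist_files` with the URL of a blocked gist should return its files as a dictionary, provided api.github.com is reachable from your network.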


Analyse FPL data with SQL

Recently, I downloaded some data from the FPL website and ingested it into a SQLite database so it could be queried and explored with SQL and potentially exported to other formats such as CSV afterwards.

Here are some things you can do with it:

https://asciinema.org/a/183831

An example query selecting the top point scorer from each team:
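As a rough sketch of what such a query could look like, the snippet below uses Python's built-in sqlite3 module with an in-memory database. The `players` table, its columns, and the rows are all made up for illustration; they are not the schema or contents of the actual database.

```python
import sqlite3

# Hypothetical schema and placeholder rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, team TEXT, total_points INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [
        ("Player 1", "Team A", 120),
        ("Player 2", "Team A", 150),
        ("Player 3", "Team B", 90),
        ("Player 4", "Team B", 60),
    ],
)

# For each team, keep the player whose points equal that team's maximum.
TOP_SCORERS = """
    SELECT team, name, total_points
    FROM players AS p
    WHERE total_points = (
        SELECT MAX(total_points) FROM players WHERE team = p.team
    )
    ORDER BY team
"""

top_scorers = conn.execute(TOP_SCORERS).fetchall()

if __name__ == "__main__":
    for team, name, points in top_scorers:
        print(team, name, points)
```

A correlated subquery keeps the SQL portable; note that if two players on a team are tied for the maximum, both rows are returned.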

Check out some more sample queries and results.

You can download the SQLite database file here or from GitHub together with the code used to create it:

Motivation

I’ve been working on figuring out the optimal set-and-forget team (Some examples of set-and-forget teams: here, here


Applying Benford’s law to submission scores and number of comments

Benford’s law is the observation that in many numerical datasets, the distribution of leading digits is not uniform — the first digit of any number in the dataset is much more likely to be a 1 than a 9 (30.1% vs 4.6% for numbers in base-10).

Here’s a plot for the expected frequencies for digits from 1 to 9 for base-10 numbers:

These expected frequencies can be calculated for any base with the formula log_b(1 + 1/d) where b represents the base and d is a digit in [1, b).
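The formula translates almost directly into Python. This is a minimal sketch; the function name `benford_expected` is just one chosen for this example.

```python
import math


def benford_expected(base=10):
    """Expected leading-digit frequencies under Benford's law.

    Returns {d: log_b(1 + 1/d)} for every digit d in [1, base).
    """
    return {d: math.log(1 + 1 / d, base) for d in range(1, base)}


if __name__ == "__main__":
    for digit, frequency in benford_expected(10).items():
        print(f"{digit}: {frequency:.1%}")
```

The frequencies always sum to 1, because the product of (1 + 1/d) for d from 1 to b - 1 telescopes to b, and log_b(b) = 1.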

Many real world datasets can be empirically shown to follow…


Fun with zero-width spaces in WhatsApp

On Reddit and other forums, you’ll often come across spoiler tags. They let people discuss spoilers while protecting readers who do not wish to be spoiled: the hidden text is only revealed when a reader actively interacts with the spoiler tag.

Spoiler tags are conspicuously missing from most instant messaging applications, even though it seems just as easy to be accidentally spoiled by a single careless message in a group conversation:

Oops

Fortunately, in WhatsApp at least, we can emulate a spoiler tag by taking advantage of the way long messages are hidden behind a “Read more” fold:
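One way to build such a message is sketched below in Python, on the assumption that the trick works by padding the message with invisible zero-width spaces (U+200B) until the spoiler is pushed behind the fold. The function name and the default `padding_length` are guesses for illustration; the number of characters WhatsApp shows before folding is not something this sketch knows.

```python
ZERO_WIDTH_SPACE = "\u200b"


def spoiler_message(warning, spoiler, padding_length=4000):
    """Build a message with invisible padding between warning and spoiler.

    Assumption: `padding_length` zero-width spaces are enough to push
    `spoiler` behind the "Read more" fold. The default is a guess.
    """
    return warning + ZERO_WIDTH_SPACE * padding_length + spoiler


if __name__ == "__main__":
    message = spoiler_message("Spoiler ahead! ", "The butler did it.")
    print(len(message))
```

The resulting string can then be copied into a chat; only the warning should be visible until the recipient expands the message.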


Moving away from Medium (but not too far)

In June last year, I started a blog on Medium and wrote my first post on the highly public Lee family feud over their family home which was taking place around that time. It was something I decided to write on a whim that day, after a friend showed me the news that morning.

My first blog post

Despite the provocative topic, it was really just a post about how to download comments from the Facebook Graph API and perform some sentiment analysis using the Google Cloud Natural Language API. Nevertheless, I was surprised by the reception my post received…


Generating Reddit comments with Markov chains

Previously, we generated some new sentences from a small pool of existing sentences. Now, we’ll generate some new comments based on all the existing comments in a Reddit thread!
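As a quick refresher, here is a minimal Python sketch of a word-level Markov chain text generator. It is an illustration written for this summary rather than the code from either post: the function names and the sample comments are made up, and each next word depends only on the current word.

```python
import random
from collections import defaultdict

# Sentinels marking the start and end of a comment.
START, END = object(), object()


def build_chain(comments):
    """Map each word to the list of words observed right after it."""
    chain = defaultdict(list)
    for comment in comments:
        words = comment.split()
        if not words:
            continue
        chain[START].append(words[0])
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
        chain[words[-1]].append(END)
    return chain


def generate(chain, rng=random, max_words=50):
    """Walk the chain from START until END or `max_words` is reached."""
    word = rng.choice(chain[START])
    output = []
    while word is not END and len(output) < max_words:
        output.append(word)
        word = rng.choice(chain[word])
    return " ".join(output)


if __name__ == "__main__":
    # Made-up sample comments, for illustration only.
    comments = [
        "the weather is hot today",
        "the food is good today",
        "the weather is good",
    ]
    print(generate(build_chain(comments)))
```

Because repeated followers are stored as duplicates in each list, `rng.choice` naturally picks more common transitions more often.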

This post is a continuation of “A Primer on Markov Chains”, which introduces Markov chains and how they can be used to generate text:

r/singapore viewed from space

After reading the daily r/singapore random discussion and small questions thread last week, I wondered if it would be possible to generate comments which looked like they came from there. Now, we’ll be doing exactly that with today’s thread:

We’ll use the Python Reddit API Wrapper…

Jiayu Yi

Follow me at https://blog.jiayu.co!
