I wrote the first algorithm on March 25. I wrote the last one today, on July 2. 100 days, 100 algorithms, 100 articles.

Many thanks to everyone of you for your support. You have helped me with this series more than you might think.

I also have two honourable mentions, Cristian Randieri, Phd and Pavel Rihosek. Of all the people, I appreciate your support most, thank you.

I tried to pick somewhat interesting topics and to show that even more complex algorithms can be written with ease. There definitely is plenty of room for improvements. …


For the 100th algorithm I chose a segmented Eratosthenes sieve for primes up to 10⁹ implemented in Cython. And this time the goal is to provide an algorithm that is as fast as possible.

There are several reasons why I diverged from Python and the usual type of implementation. Among others, I like breaking rules and I like optimizations. But what’s most important…

In most of the articles I have tried to provide as short and nice implementation as possible. I was aware that the algorithm could have been improved, but I didn’t do it since I would have to…


Linear programming is an area of mathematics that deals with the simplest form of constrained optimization problem — linear program. And simplex should definitely be in your toolbox if you are serious about algorithms.

Here is an example of a linear program [in standard form].

maximize: -x + 3y + 2z

subject to:
x + y + z ≤ 6
x + z ≤ 4
y + z ≤ 3
x + y ≤ 2

x, y, z 0

The goal is to maximize a linear function given a set of linear constraints.

The simplex algorithm is rather straightforward…


Romberg’s method to find a definite integral combines two formulas, extended trapezoidal rule and Richardson extrapolation, to get a good approximation in relatively low number of steps.

Let’s explain how it all works without causing a serious headache.

The simplest way to find a definite integral of function f on interval (a, b) is to use a trapezoidal rule.

trapezoidal rule

It’s the left-most chart and you can see that the formula is merely an area of trapezoid.


Locally weighted regression is a very powerful non-parametric model used in statistical learning.

To explain how it works, we can begin with a linear regression model and ordinary least squares.


Floyd-Steinberg dithering is a truly magical technique. It is supposed to fool your eye and brain to make you think that you see more than there really is to be seen.

In general, dither is method to reduce color space of an image by adding an artificial noise. The key idea is that the amount of light in an area should remain about the same.


We say that two nodes U and V in a directed graph belong to the same strongly connected component [SCC], if there exists path from U to V and path from V to U.

The algorithm that finds all SCCs is surprisingly simple and quite clever.

  1. run a depth-first search [DFS] on graph G — and remember time when DFS finished searching from the node
  2. create a graph G’ by reversing edge directions in G
  3. run DFS on graph G’ in the order given by descending times
  4. all the nodes reachable from node U in step #3 form a single…


Yesterday I have implemented an algorithm that directly relates to an implementation of parser for formal grammars. Today I have implemented the parser itself.

However, I have no use for yesterday’s work since the parser is not SLR [as you might have expected]. Instead, I have chosen Earley parser due to its interesting properties.

The one that is most important, Earley parser is able to handle both, left-recursive and right-recursive grammars, and it can even handle ambiguous grammars.

In fact, whatever context-free grammar you have, the parser will work just fine. Should I say, that’s not the case of my…


If you plan to implement own parser for a context-free grammar, construction of FIRST and FOLLOW sets will be the first algorithm you will have to spend your time on.

And since definition of formal languages and grammars is much more complicated than we need at the moment, I’ll stick to the intuitive explanation.

Context-free grammar is a set of production rules that consists of terminals and non-terminals. Terminal is a “physical” character that may occur in the parsed text. Non-terminal is a “meta” character that must occur on the left side of production rule.

In the grammar above, S


Principal Component Analysis [PCA] is incredibly useful when you need [among others] to visualise high-dimensional data. It’s also very simple to implement, but requires plenty of work beforehand.

A friend of mine, Pavel, has complained that I often don’t dive deep into the topic. And I hope that PCA will satisfy even such an eager mind.

Let’s say we have a bunch of multi-dimensional data stored in a matrix X. Each row represents a sample, and each column represents a variable.

We say that two variables are correlated if there is a linear relationship in between them. …

Tomáš Bouda

PyDev

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store