Randomization has proven to be a pretty fascinating and useful resource to use when designing algorithms. Randomization has been used to create fast algorithms for things like sorting, finding the median of a list, and more interesting problems like matrix multiplication verification, global min-cuts, finding 2-dimensional Delaunay triangulations of a point set, and tons more. The list of randomized algorithm applications is endless, making it clear how useful randomization can be in algorithm design (and for me, helps make it so much fun).
More recently, as a tiny component to my own research, I put together a simple randomized algorithm to help with testing some more complicated algorithms my advisor and I have been developing. To help explain this simple randomized algorithm, let us get some definitions out of the way. Consider a classical tetrahedron τ as shown in Figure 1 and recall that it is comprised of |V(τ)| = 4 vertices, |F(τ)| = 4 triangular faces, and |E(τ)| = 6 edges.
A marked tetrahedron τ will maintain a set of 4 marked edges, exactly one marked edge per triangular face. A consistently marked tetrahedron is one such that there exists some edge that is marked twice, i.e. marked by two triangular faces. An inconsistently marked tetrahedron is produced when all triangular facets mark a distinct edge. Figure 2 illustrates examples of the two types of marked tetrahedron, where the black triangles on each face point to the corresponding marked edge. With these definitions, the randomized algorithm of interest is:
- Independently for each triangular face of τ, randomly choose an edge to mark
- Return marked τ
This randomized algorithm to mark a tetrahedron τ is clearly quite simple. The first question we might ask ourselves after seeing this algorithm is, what is the probability p that the resulting marked tetrahedron τ ends up being consistent? A follow up question is then, what is the expected number of consistent marked tetrahedra, given we have a mesh T of tetrahedra that we mark using the RandomMarking algorithm? This latter question is easy to solve given a value for p: the expected number of consistently marked tetrahedron will be np. This implies that the big unknown, then, is to find p.
Surprisingly, the first question of finding p is one I had not bothered to answer until just recently. As for the latter question, I have had an empirical idea of the solution for a while. Thanks to a C++ implementation of mine, I have been seeing for a mesh T with n tetrahedra that about 2n/3 tetrahedra end up as consistently marked tetrahedra, implying that p is close to 2/3. The question then is, how well do these observations line up to theory?
Before finding the exact answer, recall that the event a tetrahedron τ ends up consistently marked is equivalent to the event that there exists some edge that is marked twice. By union bound, and noticing that an edge being marked twice occurs with probability 1/3², we have the probability p that a tetrahedron τ becomes consistently marked is at most |E(τ)|/9 = 6/9 = 2/3. Looks like, to a ballpark approximation, the software may be churning out results that output what is expected! So what is the exact value for p then?
Randomized Tetrahedron Marking Lemma
A tetrahedron τ marked by the RandomMarking algorithm ends up consistently marked with probability p = 51/81
There are a handful of ways one can approach this problem, but I feel one of the more simple ways is combinatorial proof by a counting argument. First, define N as the total number of ways to mark a tetrahedron and M as the number of ways to mark a tetrahedron such that it ends up consistent. Since each of the triangular faces of τ are marked independently, the number of ways we can mark a tetrahedron is equal to N = 3⁴ = 81. This follows from the fact that each of the 4 triangular faces has 3 edges it can independently choose to mark.
To find M, let us first consider the set of consistent marked tetrahedron that have a fixed edge e marked twice. The edge e being marked twice implies two of the four triangular faces have already marked an edge. This implies that there are 3² = 9 consistently marked tetrahedron that have e as being marked twice, as shown in Figure 3, since the remaining two faces can each independently choose 3 edges to mark. If we sum this result over all edges, we actually over count the number of consistent marked tetrahedron. This over counting is because for a fixed edge e that is marked twice, there exists one marked tetrahedron with e marked twice that also has another edge e’ that is marked twice. An example of this latter case is the circled tetrahedron in Figure 3 at the top left corner. With this observation, we find the number of consistently marked tetrahedra are M = 9|E(τ)| - |E(τ)|/2 = 54 - 3 = 51.
Thus, by the above counting argument, the probability that an unmarked tetrahedron τ becomes a consistent marked tetrahedron after being passed into the RandomMarking algorithm is p = 51/81. QED.
To see how the lemma’s result compares to the upper bound probability of 2/3, notice that p = 51/81 ≈ 0.629 while 2/3 ≈ 0.667, implying that p is pretty dang close to the 2/3 upper bound! So if we are given a size n tetrahedra mesh T and mark all the tetrahedra using RandomMarking, we theoretically should expect to end up with 51n/81 ≈ 0.629n consistently marked tetrahedra. This result is clearly close to the eyeball estimate of 2n/3 that my software was producing, implying the software is indeed doing what one would expect!
As a further comment, we have by Chernoff’s Bound found as Corollary 5 here that the probability that the number of consistently marked tetrahedra end up deviating by more than a δ multiple of the expected value 51n/81 is equal to at most 2 exp(-pnδ²/3) = 2 exp(-51nδ²/243). Thus, if we have a mesh of n = 1,000,000 tetrahedra, the probability that the number of consistently marked tetrahedra deviate by more than 0.5% of the expected value is at most 2 exp(-51nδ²/243) = 2 exp(-51*25/243) ≤ 0.011 since in this case δ = 0.005. This result implies that there is at most about a 1.1% chance of this event happening, which means the number of consistently marked tetrahedra will, with high probability, be very close to the expected value! With these details mentioned, the analysis is now complete for the RandomMarking algorithm!
With my analysis on the above simple randomized algorithm complete, stay tuned for the next time I throw down some fun math and memes!