Removing label overlapping from pretty charts

Adarsh Lilha
8 min readMar 6, 2018

Very often I used to encounter pretty charts which would make me wonder how can this be built using code? I started off with native SVG, Canvas API but then I came across D3 which exhibits phenomenal concept of joining data with DOM elements. D3 describes itself as —

“A JavaScript library for producing dynamic, interactive data visualizations in web browsers with HTML,SVG and CSS”

Since I was already into Javascript, this seemed fantastic. Really? Javascript can do this? I thought. Well JS is everywhere, so hands down it can!

I picked up the basics of D3 - scales, transitions, selections, voronoi and built few small charts to understand it better.

But whenever I had gone ahead to draw a chart in browser, one issue inherently made me go crazy was handling labels overlapping. Such a common problem, but no directed solution. I tried to devise algorithm to handle this but either I was not going anywhere or it was never reaching to a solution (approximation problem, as you might have guessed rightly).

This article is mostly about me taking you on my learning journey. So if you feel I am wrong, or there are better ways to address this, please feel free to comment to the thread.

Before we start

In this article I start off with building a simple bubble plot from a real dataset and address the label overlapping elephant. You can find the complete code here.

Lets sit and draw

I always wanted to create this myself as I can connect with the paradigm so well. So here I will try to replicate the same chart to the best of my ability.

I grabbed the csv dataset from the website and started off with the most atomic components of any chart.

  • Scales
  • Marks

Here is a glance of the code which creates scale. If you already know this, feel free to jump to the next section.

After we’re up with scales and axes, we append them to the SVG.

Appending yAxis to svg

Since we are done with scales, its time to draw the marks.

We add circles, define the x position (cx), y position (cy), radius (r) and tweak few other encoding channel (fill) and cosmetic (stroke)

We now see bubbles corresponding to movies positioned by scales.

Won’t it be great if we are able to understand which circle represents which movie?

Time to add the labels besides each circle! Let’s start off by adding text elements besides each circle.

This is what I was talking about. The labels are getting overlapped and we end up with a gibberish looking chart. I wish d3 had given a way to fix this. But if you think logically it’s not d3’s job to fix this. Since we are on our own, let’s use an overlap reducing algorithm!

Let’s tame the dragon

We use the concept of simulated annealing to remove overlaps.

To understand what annealing is and how it works for us here, go through this paper.

The idea is something like this. You have to have a quantifiable value which is representative of the label position’s quality. This quality is function of overlapping, distance, orientation etc. We will try to minimise the value to find the best position for our label. There are different method for minimisation. Simulated annealing is one of the many useful one.

Take your time to read and understand the concept of how annealing works, Not the code yet but the mechanism of energy, weights and cooling and how can they be used together. It’s not simple, but not very difficult also. If you have understood 10% of it then you probably have mastered it.

This is an implementation of the same annealing technique we read above. We will now have a deeper look at the algorithm.

Understanding the flow

d3.labeler()
.label(labelArray)
.anchor(anchorArray)
.width(width)
.height(height)
.start(2000);
  • label takes an array of objects with x, y, name, width, height (labelArray)
{ x: 115, y: 580, name: “Walk Hard”, width: 30, height: 8 }
  • anchor takes in the array of objects with x, y, radius of circle (anchorArray)
{ x: 105, y: 600, r: 7 }
  • width and height of the canvas. If a label crosses the boundary, it’s position remains unchanged.
  • function start initiates the annealing process. The parameter it gets (2000) is the iteration it does find out the global minima of the annealing process.

Label.js is the one which does the magic.

Annealing Algorithm

Algorithm for removing overlaps with annealing

All its saying is, tweak the position of a label keeping every other labels fixed. This tweaking is what we are calling configuration in the algorithm. For the new configuration decide whether the new one is better than the previous one.

Lets explore the algorithm block by block

Steps

  • First it all starts off by either rotating or moving(translating) labels to a new position.
  • Energy is calculated before and after the change using the current position(x, y) and weightage of the overlap.
    Energy is a function which gives the value of the quantitive measure using which we can qualify the quality of the placement.
  • If change in energy < 0, keep the new position of the label. It’s better than what we had in the last configuration.
  • If change in energy > 0, then for
    - higher value of T introduce a higher probability of accepting it
    - lower value of T introduce a lower probability of accepting it
    The exponential term ensures the dynamic probability assignment to the system and random number is used to have a randomness in acceptance of the probability.
  • Decrease the temperature slightly for each iteration. We have initially set it to 1. More the temperature, higher the changes of accepting the new position. Initially annealing accepts all changes and slowly optimises again by lowering temperature.

Understanding the code

Open the label.js file and have a look. A couple of small functions - width, height, label and anchor are self explanatory. They just initialise the global variables with the values passed by the user.

Next is the start function, this is where everything starts. We assume temperature to be 1.0 and loop a finite amount of times, inside which we iterate the labelArray and then either perform a move or rotate. Also in the end we decrease the temperature by a small amount.

mcmove function

  • Take any random index of the array
var i = Math.floor(Math.random() * lab.length);
  • Store it’s (x, y) co-ordinates in two variables
var x_old = lab[i].x;
var y_old = lab[i].y;
  • Calculate energy
var oldEnergy = energy(i);
  • Move label to a new position
lab[i].x += (Math.random() — 0.5) * max_move;
lab[i].y += (Math.random() — 0.5) * max_move;
  • If new position exceeds the boundary, go back to old position we stored in step 2
if (lab[i].x > w) { lab[i].x = x_old; }
  • Calculate energy again
var newEnergy = energy(i);
  • Check if
Math.random < e^((new energy - old energy) / temperature)
  • If yes, do nothing as we have already moved to new position.
    - if not, move back to old position using the (x, y) we stored in step 2.

In the above steps we calculated energy twice. Let’s unravel it now.

energy function

Here we increment total energy value for whatever overlapping cases we want to handle and return the total.

For label-label overlaps

  • We store the values of four corners of the rect bounding the label in x21, y21, x22 and y22 variables
var x21 = lab[index].x,
y21 = lab[index].y — lab[index].height + 2.0,
x22 = lab[index].x + lab[index].width,
y22 = lab[index].y + 2.0;
  • Iterate over all the labels and calculate the rect corners of all other labels and also find out the area of overlapping.
for (var i = 0; i < m; i++) {
if (i != index) {
//label-label overlap
//positions of 4 corners of rect bounding the text
x11 = lab[i].x,
y11 = lab[i].y — lab[i].height + 2.0,
x12 = lab[i].x + lab[i].width,
y12 = lab[i].y + 2.0;
x_overlap = Math.max(0, Math.min(x12, x22) — Math.max(x11, x21));
y_overlap = Math.max(0, Math.min(y12, y22) — Math.max(y11, y21));
overlap_area = x_overlap * y_overlap;
ener += (overlap_area * weight_label);
}
  • Multiply the overlapped area with weight_label and add it to energy

Weights represent the intensity of the label overlap in the chart. More the weight, higher the intensity, so that label will go to a new position.

var weight_label = 30.0,
weight_label_anc = 30.0,
weight_len = 0.2;

We have given more weightage of two labels overlapping and a label overlapping with an anchor(circle), than label overlapping with it’s link line.

After this, repeat the same process for other overlaps, ex - label-anchor(circles) overlaps, adding to energy.

If we have an overlap to be fixed, we are adding a value to overlap which evaluates the condition

(Math.random() < Math.exp(-delta_energy / currTemp))

to be true, accepting the new position of the moved label.

Cooling Temperature function

Lastly, we also decrease the temperature by a small amount on each iteration according to the algorithm.

Wrapping Up

Finally, we move back to our html file

  • create an array with x, y, radius values of circles(AnchorArray) and
  • create second array with x, y, name, width, height, values of labels(labelArray)

and pass them to our annealing code (label.js). This is pretty straightforward so have a look.

As a touchup, we add some styling to our chart, lines connecting labels to the circle it represents and also render the labels and lines with new position with some transition

labels
.transition()
.duration(1500)
.attr(“x”, (d) => d.x)
.attr(“y”, (d) => d.y);

Once done, this is what we get!

Removing overlaps with transition

nsweeps why?

At this point one question pops up in my head-

“Why do we pass 1000 to start function and iterate 1000 times only. This is not what the algorithm says.”

Yes, at first this may seem weird but it is required because if the dataset grows big, it may not be possible to remove overlaps. At that point the program may get stuck in an infinite loop. This is done to get the optimum output and prevent that from happening simultaneously.

References and Resources

Contribute

Would like to improve the technique or code? Feel free to reach out.

Thanks for reading, If you liked the article, click the 💚 below so other people will see it on Medium.

--

--