Normalizing absolute SVG paths with cubic bézier conversions

Josh Frank
Published in Nerd For Tech · 8 min read · Jul 25, 2021
(Header image credit: 20th Century Fox)

This blog post is part 3 in what’s quickly become a series about one of my favorite subjects: vector graphics—a shamefully underused way to bring your app from “meh” to “wow” with relatively little effort. Vectors are light, fast, and versatile, and manipulating/animating them is easy thanks to the main subject of this article series, <path>s (sequences of lines and curves).

On today’s episode I’m revisiting a bit of code from a month ago that I called PathParser. In that post I explained that paths are defined with strings in a standard syntax; I wrote a class that parses those syntax strings into arrays of coordinates, and a rollicking good time was had by all. We’ll be adding to that code today, so fork and clone the repository if you’d like to review it or follow along yourself in a new branch.

If today’s blog post is less coherent or more typo-ridden than usual, I blame a horrible sinus infection currently being treated with industrial-grade diphenhydramine, promethazine and Chum-churum. But thankfully, at least it’s NOT COVID! Thanks, Moderna!

Background: A new normal

In my previous discussions of SVG paths, I’ve explained the <path> tag can replace every other shape tag, and within <path> tags, the bézier C and c commands can replace every other command except the initial M/m. The two shapes below have the same points and appear identical:

const sameShape = [
  "M 25 25 L 75 25 L 75 75 L 25 75 Z",
  "M 25 25 C 25 25 75 25 75 25 C 75 25 75 75 75 75 C 75 75 25 75 25 75 Z"
];
PathParser.parse( sameShape[ 0 ] )
-> [ [ 25, 25 ], [ 75, 25 ], [ 75, 75 ], [ 25, 75 ], [] ]
PathParser.parse( sameShape[ 1 ] )
-> [ [ 25, 25 ], [ 75, 25 ], [ 75, 75 ], [ 25, 75 ], [] ]

In a normalized SVG path, all commands, except the first m/M command, have been converted to equivalent C/c commands. Path normalization is common and routine because it makes all kinds of path operations much easier. That’s because, when a path is normalized, each command’s concavity (how far a curve stretches beyond its points) may be calculated without branching for different command types.
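To make that line-to-curve equivalence concrete, here’s a quick sketch of my own (lineToCubic is just an illustrative helper, not part of PathParser): a straight line can always be rewritten as a C command, because any cubic whose control points lie on the segment traces the same straight line. The sameShape example above parks the control points on the endpoints; the version below spreads them a third and two-thirds of the way along.

// A straight line from [ x0, y0 ] to [ x1, y1 ] expressed as a cubic: both control
// points sit on the segment itself, so the "curve" has no curvature at all
const lineToCubic = ( [ x0, y0 ], [ x1, y1 ] ) => [
  "C",
  x0 + ( x1 - x0 ) / 3, y0 + ( y1 - y0 ) / 3,
  x0 + 2 * ( x1 - x0 ) / 3, y0 + 2 * ( y1 - y0 ) / 3,
  x1, y1
];
lineToCubic( [ 0, 0 ], [ 30, 60 ] )
-> [ "C", 10, 20, 20, 40, 30, 60 ]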

Reduce, reuse…

Let’s call our class NormalizedPath() and fill it with a constructor and a method called parse() that gets called upon initialization:

class NormalizedPath {
  constructor( descriptor ) {
    this.parse( descriptor );
  }

  parse( descriptor ) {
    ...
  }
}

Different commands are normalized differently, just as different commands are parsed differently in PathParser. In PathParser we used a grammar to perform the correct parsing operations for each command; this time, instead of writing a normalizerGrammar to do the correct math for each command, we’ll handle it all with reduce() and one large switch() { case: }. (More on why later.)

If you look closely at sameShape above, you’ll see that each point in the normalized version of a path depends on the point before it. That should be a huge clue to the approach we’ll need to use! Each time we analyze a command, we need to keep track of it, so that when we reach the next command in line we know what came before it:

parse( descriptor ) {
  let previous = [ 0, 0 ];
  this.parsedCommands = PathParser.parseRaw( descriptor ).reduce( ( result, command, index ) => {
    switch( command[ 0 ].toLowerCase() ) {
      ...
    }
  }, [] );
}

reduce() is one of my favorite JavaScript aces-in-the-hole — which is why it’s such a crime that it’s so rarely emphasized in programming tutorials or coding bootcamps. When called on an array, it takes two arguments, a callback and an initial value, like so: reduce( ( result, element ) => {}, [] ). The callback receives the accumulated result so far and the particular element being iterated over (in this case, an individual command), performs whatever logic we specify, and whatever it returns becomes the result handed to the next iteration. When the array runs out, reduce() returns the final result.
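Here’s a tiny, self-contained illustration of the pattern (my own toy example, nothing to do with paths yet): the accumulator starts as the empty array we pass as the second argument, and each callback return value becomes the result for the next element.

// Collect a running total at each step
[ 1, 2, 3, 4 ].reduce( ( result, element ) => {
  const runningTotal = ( result[ result.length - 1 ] || 0 ) + element;
  return [ ...result, runningTotal ];
}, [] );
-> [ 1, 3, 6, 10 ]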

Take a look at the code below to see that big ol’ switch() { case: } for yourself. Please note that, for the sake of my sanity (and my exploding sinuses), this class parses only absolute paths for now:

parse( descriptor ) {
  let quadX, quadY, bezierX, bezierY, previousCommand = "", previousPoint = [ 0, 0 ], subpathStart = [ 0, 0 ];
  const isRelative = command => command[ 0 ] === command[ 0 ].toLowerCase();
  const updatePrevious = command => {
    previousCommand = command[ 0 ];
    // Closepath returns the current point to the start of the subpath
    if ( command[ 0 ].toLowerCase() === "z" ) previousPoint = [ ...subpathStart ];
    else if ( command[ 0 ].toLowerCase() === "h" ) previousPoint[ 0 ] = isRelative( command ) ? previousPoint[ 0 ] + command[ 1 ] : command[ 1 ];
    else if ( command[ 0 ].toLowerCase() === "v" ) previousPoint[ 1 ] = isRelative( command ) ? previousPoint[ 1 ] + command[ 1 ] : command[ 1 ];
    else {
      previousPoint = isRelative( command ) ?
        [ previousPoint[ 0 ] + command[ command.length - 2 ], previousPoint[ 1 ] + command[ command.length - 1 ] ] :
        command.slice( command.length - 2 );
      if ( command[ 0 ].toLowerCase() === "m" ) subpathStart = [ ...previousPoint ];
    }
  };
  this.parsedCommands = PathParser.parseRaw( descriptor ).reduce( ( result, command, index ) => {
    let normalizedCommand;
    switch ( command[ 0 ] ) {
      case "M":
        // The first M stays as-is; any later M is rewritten as a curve to its target point
        if ( !index ) normalizedCommand = command;
        else normalizedCommand = [ "C", ...previousPoint, ...command.slice( 1 ), ...command.slice( 1 ) ];
        break;
      case "H":
        normalizedCommand = [ "C", ...previousPoint, command[ 1 ], previousPoint[ 1 ], command[ 1 ], previousPoint[ 1 ] ];
        break;
      case "V":
        normalizedCommand = [ "C", ...previousPoint, previousPoint[ 0 ], command[ 1 ], previousPoint[ 0 ], command[ 1 ] ];
        break;
      case "L":
        normalizedCommand = [ "C", ...previousPoint, ...command.slice( 1 ), ...command.slice( 1 ) ];
        break;
      case "S": {
        // Reflect the previous cubic control point across the current point,
        // but only if the previous command was also a cubic
        let [ cx, cy ] = previousPoint;
        if ( [ "c", "s" ].includes( previousCommand.toLowerCase() ) ) {
          cx += cx - bezierX;
          cy += cy - bezierY;
        }
        normalizedCommand = [ "C", cx, cy, command[ 1 ], command[ 2 ], command[ 3 ], command[ 4 ] ];
        break;
      }
      case "Q":
        // Degree elevation: a quadratic with control point Q equals the cubic with
        // control points P0 + (2/3)(Q - P0) and P + (2/3)(Q - P)
        quadX = command[ 1 ];
        quadY = command[ 2 ];
        normalizedCommand = [ "C",
          previousPoint[ 0 ] / 3 + ( 2 / 3 ) * command[ 1 ],
          previousPoint[ 1 ] / 3 + ( 2 / 3 ) * command[ 2 ],
          command[ 3 ] / 3 + ( 2 / 3 ) * command[ 1 ],
          command[ 4 ] / 3 + ( 2 / 3 ) * command[ 2 ],
          command[ 3 ],
          command[ 4 ]
        ];
        break;
      case "T":
        // Reflect the previous quadratic control point only if the previous command
        // was Q or T; otherwise the control point coincides with the current point
        if ( [ "q", "t" ].includes( previousCommand.toLowerCase() ) ) {
          quadX = previousPoint[ 0 ] * 2 - quadX;
          quadY = previousPoint[ 1 ] * 2 - quadY;
        } else {
          quadX = previousPoint[ 0 ];
          quadY = previousPoint[ 1 ];
        }
        normalizedCommand = [ "C",
          previousPoint[ 0 ] / 3 + ( 2 / 3 ) * quadX,
          previousPoint[ 1 ] / 3 + ( 2 / 3 ) * quadY,
          command[ 1 ] / 3 + ( 2 / 3 ) * quadX,
          command[ 2 ] / 3 + ( 2 / 3 ) * quadY,
          command[ 1 ],
          command[ 2 ]
        ];
        break;
      case "A":
        normalizedCommand = [ "C", ...arcToCubicBeziers( previousPoint, command.slice( 1 ) ) ];
        break;
      case "C":
        normalizedCommand = command;
        break;
      case "Z":
        normalizedCommand = [ "Z" ];
        break;
      default: break;
    }
    // Remember the last cubic control point so the next S command can reflect it
    [ bezierX, bezierY ] = command.length > 4 ?
      [ command[ command.length - 4 ], command[ command.length - 3 ] ] :
      [ previousPoint[ 0 ], previousPoint[ 1 ] ];
    updatePrevious( command );
    return [ ...result, normalizedCommand ];
  }, [] );
}

A few points breaking down this monster, in no particular order of importance:

  • You can see right away that we start off by defining a whole bunch of trackers: quadX, quadY, bezierX, bezierY, the previousCommand letter and the previousPoint. These keep track of which point we’re on and its curve control points (if/where applicable).
  • parse() contains two utility functions within its closure — mostly to avoid repeating code. Near the top, we define a quick function isRelative that tells us whether a command is relative or absolute — important for properly updating the previousPoint. Right after it, in the same scope, we define updatePrevious(), which updates the previousPoint (and the previousCommand letter) differently depending on whether the command is absolute or relative.
  • Inside the switch() in our reduce, we write the logic to normalize a move or line command into a curve, depending on what the previousPoint was. Remember above I mentioned that in a normalized path, all commands except the first move command have been converted to curves — that’s why we leave the M command alone if ( !index ) — i.e., if we’re on the first command.
  • Curves like Q, T, S and A are special cases — we need a lot more information than just the previousPoint coordinates to normalize these. That’s why we have these quadX/quadY and bezierX/bezierY trackers, and it’s also why I’ve decided against using a grammar — the whole point of a grammar is to simplify things, but passing every single one of these arguments to our grammar for every single command is a little cumbersome.
  • Finally, we updatePrevious and return an array with our ...result so far and our newly-normalizedCommand.
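To see the whole thing in action, here’s what normalizing the square from sameShape should produce, assuming (as the switch above does) that PathParser.parseRaw() hands back each command as an array like [ "M", 25, 25 ]:

new NormalizedPath( "M 25 25 L 75 25 L 75 75 L 25 75 Z" ).parsedCommands
-> [ [ "M", 25, 25 ],
     [ "C", 25, 25, 75, 25, 75, 25 ],
     [ "C", 75, 25, 75, 75, 75, 75 ],
     [ "C", 75, 75, 25, 75, 25, 75 ],
     [ "Z" ] ]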

Arc of the covenant

If you looked closely at the above code, you need psychiatric help (or a job at the NSA)… and you probably noticed the case for the A command is suspiciously short:

case "A":
normalizedCommand = [ "C", ...arcToCubicBeziers( previousPoint, command.slice( 1 ) ) ];
break;

The truth is, there is no easy way to convert an arc to a cubic bézier. It’s mathematically impossible to create a bézier that looks exactly the same as an arc, and an explanation why is beyond the scope of this article or my competence (even without Benadryl).

However, it is possible to replicate an arc with multiple bézier curves. That’s exactly what libraries like Snap and Paper do… but importing a whole library like that just to toss curves around is, in the words of Monty Python, silly, very silly indeed. So, I’m using code I adapted/streamlined from the work of Colin Meinke; I previously used it in another project I’ve discussed on this blog. This approach calculates the center of an arc and then splits it into vectors with some very unpleasant math. Examine it for yourself in the repository; leave a comment and I might be inclined to do a deeper dive into arc-to-cubic conversions when I’m not so sick.
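If you’d like a taste of the idea without the unpleasant parts, here’s a stripped-down sketch of my own (not the arcToCubicBeziers used in the repository): it handles only a plain circular arc, ignores the SVG rotation and flag parameters entirely, and leans on the standard trick of splitting the arc into pieces of at most 90° and placing each piece’s control points a distance of k = 4/3 · tan(Δ/4) along its tangents.

// Approximate a circular arc (center cx, cy; radius r) from startAngle to endAngle
// with one cubic bézier per sub-arc of at most a quarter turn
const circularArcToCubics = ( cx, cy, r, startAngle, endAngle ) => {
  const segments = Math.ceil( Math.abs( endAngle - startAngle ) / ( Math.PI / 2 ) ) || 1;
  const delta = ( endAngle - startAngle ) / segments;
  const k = ( 4 / 3 ) * Math.tan( delta / 4 );
  const curves = [];
  for ( let i = 0; i < segments; i++ ) {
    const theta1 = startAngle + i * delta, theta2 = theta1 + delta;
    const [ x1, y1 ] = [ cx + r * Math.cos( theta1 ), cy + r * Math.sin( theta1 ) ];
    const [ x2, y2 ] = [ cx + r * Math.cos( theta2 ), cy + r * Math.sin( theta2 ) ];
    // Control points lie along the tangent direction at each endpoint
    curves.push( [ "C",
      x1 - k * r * Math.sin( theta1 ), y1 + k * r * Math.cos( theta1 ),
      x2 + k * r * Math.sin( theta2 ), y2 - k * r * Math.cos( theta2 ),
      x2, y2
    ] );
  }
  return curves;
};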

Conclusion

You might find yourself wondering here what the point of all this work is:

  • In vector animation, normalization makes calculating transitions (or “tweens”) between keyframes much easier and smoother
  • In video gaming, normalization makes it easier to calculate intersections
  • In computer graphics and machine learning applications, normalization makes it easier to calculate a shape’s bounding box (the smallest axis-aligned rectangle that completely encloses the shape), like this:
Bounding boxes are fascinating — leave a comment and I might be inclined to cover them in a future blog post. (Photo credit: Solo on StackOverflow)
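As a taste of that last bullet, here’s a rough sketch of my own (the helper names and the sampling approach are purely illustrative): once a path is normalized, a single loop over its C commands can approximate the bounding box by sampling points along each cubic, with no per-command-type branching. An exact answer would solve for each curve’s extrema instead of sampling, but the structure is the same.

// Evaluate one coordinate of a cubic bézier at parameter t (Bernstein form)
const cubicAt = ( p0, p1, p2, p3, t ) =>
  p0 * ( 1 - t ) ** 3 + 3 * p1 * t * ( 1 - t ) ** 2 + 3 * p2 * t ** 2 * ( 1 - t ) + p3 * t ** 3;

// Walk a normalized command list ( [ "M", ... ], [ "C", ... ], [ "Z" ] ) and collect
// sampled x/y values from every curve
const approximateBoundingBox = ( normalizedCommands, samples = 20 ) => {
  let previous = [ 0, 0 ];
  const xs = [], ys = [];
  for ( const command of normalizedCommands ) {
    if ( command[ 0 ] === "M" ) previous = [ command[ 1 ], command[ 2 ] ];
    if ( command[ 0 ] !== "C" ) continue;
    for ( let i = 0; i <= samples; i++ ) {
      const t = i / samples;
      xs.push( cubicAt( previous[ 0 ], command[ 1 ], command[ 3 ], command[ 5 ], t ) );
      ys.push( cubicAt( previous[ 1 ], command[ 2 ], command[ 4 ], command[ 6 ], t ) );
    }
    previous = [ command[ 5 ], command[ 6 ] ];
  }
  return { xMin: Math.min( ...xs ), yMin: Math.min( ...ys ), xMax: Math.max( ...xs ), yMax: Math.max( ...ys ) };
};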

On a final, slightly different note, I feel compelled to mention how grateful I am to be vaccinated against SARS-CoV-2, the virus that causes COVID-19! I would be much, much sicker right now if I weren’t immune! There are so many people in this world desperate for a vaccine, and if you’re fortunate enough to live in a developed country like I do, it’s very likely free of charge. There are no excuses anymore! If you haven’t already, please: get the shot if you’re able to, and save a life — maybe your own!
