Making a clutter-free git diff on Gettext translation files

Gettext offers a nice feature to extract the translation keys from your code into a POT template file, but it’s hard to see which keys were added or removed with a simple git diff.

Here’s a diff on a 20-line POT file:

Guess what changed? A single key!

But the other keys have been reordered and comments updated. Also, an actual POT file for a real application such as Privateaser’s web app is typically 3000+ lines long 😱

Prettifying the POT file

Let’s make our POT file more diff-friendly!

Using grep and sort, only keep the translation keys (we don’t need the comments, empty lines and empty translation messages) and print them in alphabetical order:

$ grep msgid messages.pot | sort
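To see what this pipeline keeps, here is a minimal sketch that runs it on a tiny, made-up POT file. The file content and the variable names (pot_file, keys) are assumptions for illustration only. LC_ALL=C is an addition that is not part of the command above; it makes the sort order byte-wise so the output is the same on every machine.

```shell
#!/bin/sh
# Build a small sample POT file (hypothetical content, not from a real project).
pot_file=$(mktemp)
cat > "$pot_file" <<'EOF'
# Sample translation template
msgid ""
msgstr ""

#: app/views.py:12
msgid "Welcome"
msgstr ""

#: app/views.py:4
msgid "Book a venue"
msgstr ""
EOF

# Keep only the lines containing "msgid", then sort them.
# Comments, blank lines and msgstr lines are dropped by grep.
keys=$(grep msgid "$pot_file" | LC_ALL=C sort)
printf '%s\n' "$keys"

rm -f "$pot_file"
```

Note that the POT header entry (msgid "") is kept as well, since it also matches msgid; it is identical on both sides of a diff, so it does not add noise.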

Plugging it into git diff

Git has an option called textconv that lets you pre-process files with a custom driver before it computes the actual diff.

Wrap our grep/sort hack into a script:

--- /home/raphael/pocat.sh ---
#!/bin/sh
grep msgid "$1" | sort

Make the script executable (chmod +x /home/raphael/pocat.sh), then register it with Git (globally):

--- /home/raphael/.gitconfig ---
[diff "pocat"]
textconv=/home/raphael/pocat.sh

and tell your repository to use this new diff engine on your POT files:

--- /home/raphael/my-repo/.gitattributes ---
*.pot   diff=pocat

Boom! Your diff looks much cleaner and you can clearly see which keys were added or removed:
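To try the whole setup end to end without touching your real repositories, here is a sketch that rebuilds it in a throwaway directory. It assumes git is installed; the directory layout, the sample keys and the variable names (work, driver, added) are made up for illustration. Unlike the global registration above, it registers the driver for the temporary repository only, and it maps *.pot files to the driver.

```shell
#!/bin/sh
set -e
# Work in a temporary directory so nothing outside it is modified.
work=$(mktemp -d)
cd "$work"

# The textconv driver: print only the msgid lines, sorted.
cat > pocat.sh <<'EOF'
#!/bin/sh
grep msgid "$1" | sort
EOF
chmod +x pocat.sh

git init -q repo
cd repo
# Register the driver for this repository only (no --global here).
git config diff.pocat.textconv "$work/pocat.sh"
echo '*.pot diff=pocat' > .gitattributes

# Ask Git which diff driver applies to a POT file.
driver=$(git check-attr diff -- messages.pot)
echo "$driver"

# Stage a first version, then add one key plus some comment noise.
printf 'msgid "Welcome"\nmsgstr ""\n' > messages.pot
git add .gitattributes messages.pot
printf '#: app.py:3\nmsgid "Book a venue"\nmsgstr ""\n\n#: app.py:9\nmsgid "Welcome"\nmsgstr ""\n' > messages.pot

# The diff goes through the driver, so only the new key shows up as added.
added=$(git diff -- messages.pot | grep '^+msgid')
echo "$added"
```

If git check-attr reports "unspecified" instead of the driver name, the pattern in .gitattributes does not match your file name, which is worth checking first.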

