Making Sense of Roman Numerals

Kelvin Tan
The Startup
Published in
5 min readSep 12, 2020
Dietmar Rabich / Wikimedia Commons / “London (UK), Elizabeth Tower, -Big Ben- — 2010–1979” / CC BY-SA 4.0

In today’s world, we take for granted the numerical system that we use on a daily basis. The average human being is born with ten fingers, hence a counting system on base 10 is very natural to us.

We have exactly 10 elementary numerical symbols, ranging from 0 to 9. Every other natural number can be expressed in a permutation of these 10 symbols. I was teaching my eldest daughter addition with change, and she was able to keep up with the lesson. This serves as proof that a five-year old would readily grasp numbers in base 10, and would be able to add any two natural numbers to form a third natural number using no more than the 10 elementary numerical symbols.

What if human beings were born with other than 10 fingers? What if an alien race were born with 8 digits in total on both hands instead? Would they be computing in base 8?

Sometimes, even when we have 10 fingers on 2 hands, the fact that it is possible to consider just the 5 fingers on 1 hand, civilisation could reach a different conclusion and decide to count in base 5. The Mayans used both fingers and toes to develop a base 20 system.

Today’s post is inspired by this challenge on Edabit. So if you want to attempt to solve the puzzle by yourself beforehand without any spoilers, do not read any further.

The Romans had a very interesting counting system that is still in use today. The system combines elements of 5s and 10s, as well as 5 less 1s and 10 less 1s. This logic is extended by each power of 10. So, think of 50s and 100s, 50 less 10s and 100 less 10s. And so on and so forth.

When we break the counting method down to individual elements, we can group them into two distinct sets of data. In this example we arrange them into two Python dictionaries.

In the first dictionary, we define what each latin numeral represents. Unlike Arabic numerals, which have 10 symbols, the Romans made do with 3 at each 10s. So at its simplest, you could constitute any number from 1 to 10 using just, I, V, and / or X. So II would represent 2, and VII would represent 7.

The counting system also had a minus-one feature, where IV would represent 4, or V less I. This necessitates the second Python dictionary, which handles these minus-one examples, for both 5s, 10s, or its 10-multiples. So XL would be 50 less 10 to make 40, while CM would be 1,000 less 100 to make 900.

With this basic understanding, we set off to write a program that would be able to convert a modern-day Arabic numeral into its Roman equivalent, and vice versa.

So — I know we have not written these yet — let us imagine that we already have one program called convert_numeral_to_roman and another one called convert_roman_to_numeral. We can deduce the nature of the conversion by looking at the input’s data type. If a Roman numeral is fed into the program we want to write, it would be a string of Latin letters. If it were modern-day numerals, it would be an integer. So we could write the function-in-question like this:

While we have not written both sub-programs, it is good to jot these two lines down first so we know there are two to dos in order to make the program work.

So for the first part, how do we parse a latin numeral into its modern-day equivalent? One way which I found works is to translate the numbers from left to right, adding on as we go along. It is important to start from the minus-one dictionary first. As two elements are analysed at the same time, this resolves the disambiguation problem of the machine misinterpreting the Roman letters for its individual parts. For instance, had we used the first dictionary, IV would be read as 1 plus 5 equals 6, instead of 4.

It is only when the program is not able to find the pattern in the first two elements of the Roman numeral that we would resort to analysing just the first element. In Python, this could be done by purposely causing the program to throw an error by asking it to search for a key that does not exist in the minus-one dictionary, redirecting the code using the try-except arrangement. Once we are in the exception handing block, we change our analysis to just the first element of the latin string.

As the elements get analysed, we add the sum of the parts and successively shorten the remainder of the Roman numeral. The answer is found when there are no more latin letters left to analyse.

What if a number is fed to the program? This time, we would have to translate the number into its constituent Roman letters — starting from the largest possible value.

The dictionary in use here would be a merger of the two original dictionaries since we only need one here.

This time the access would go in reverse — the modern-day integers would be used to determine the Roman letters. So we use dictionary comprehension to swap the keys and values, utilising the items method native to the dictionary object in the process. Basically, we could think of dictionary.items() as a list of key-value tuples. The technique is a must-know in the art of dictionary manipulation.

Once the combined dictionary with integer keys is in place, we can start encoding the Roman letters — exhausting the largest-value letters first before descending down the chain. This is why we loop by accessing the keys, sorted by largest to smallest. We keep encoding until the remainder is 0 and we subsequently return the answer.

If you have followed this far, I hope that you would agree that while MMXX has not been easy — let us continue to do the best we can and emerge stronger once MMXXI comes along.

--

--