Validating Roman Numerals — Hackerrank

Shefali Bisht
Published in Geek Culture
3 min read · Jun 14, 2022

Problem Statement: You are given a string, and you have to validate whether it’s a valid Roman numeral. If it is valid, print True. Otherwise, print False. Try to create a regular expression for a valid Roman numeral.

Input Format: A single line of input containing a string of Roman characters.

Output Format: Output a single line containing True or False according to the instructions above.

Constraints: The number will be between 1 and 3999 (both included).

Sample Input

CDXXI

Sample Output

True

Solution

Let's first understand the rules for writing Roman numerals:

Rules to Write Roman Numerals

  • The value of a symbol is added to itself as many times as the symbol is repeated (e.g. III = 3, XXX = 30 and CCC = 300).
  • A symbol can be repeated at most three times, for example XXX = 30, MMM = 3000, etc.
  • The symbols V, L, and D are never repeated.
  • When a symbol of smaller value appears after a symbol of greater value, its value is added. For example, VI = V + I = 5 + 1 = 6.
  • When a symbol of smaller value appears before a symbol of greater value, its value is subtracted. For example, IX = X - I = 10 - 1 = 9.
  • The symbols V, L, and D are never subtracted, as they are not written before a greater value symbol.
  • The symbol I can be subtracted from V and X only, the symbol X from L and C only, and the symbol C from D and M only.
# Roman Numerals
# 1 I
# 5 V
# 10 X
# 50 L
# 100 C
# 500 D
# 1000 M
# Digits - I II III IV V VI VII VIII IX X
# Tens - X XX XXX XL L LX LXX LXXX XC C
# Hundreds - C CC CCC CD D DC DCC DCCC CM M
# Thousands - M MM MMM
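
The addition and subtraction rules above can be illustrated with a small converter. This is only a sketch: `roman_to_int` and `VALUES` are hypothetical names that are not part of the original solution, and the function assumes its input is already a well-formed numeral, so it does not validate anything.

```python
# Symbol values, as in the table above.
VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(numeral: str) -> int:
    """Convert a well-formed Roman numeral to an integer (hypothetical helper)."""
    total = 0
    for i, symbol in enumerate(numeral):
        value = VALUES[symbol]
        # A smaller symbol written before a larger one is subtracted;
        # in every other position the symbol's value is added.
        if i + 1 < len(numeral) and value < VALUES[numeral[i + 1]]:
            total -= value
        else:
            total += value
    return total


print(roman_to_int("VI"))     # 6
print(roman_to_int("IX"))     # 9
print(roman_to_int("CDXXI"))  # 421
```

Because the converter trusts its input, it would also return a number for an ill-formed string such as IIII, which is why a separate validation step is needed.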

To solve this problem, we break the numeral down into components, one for each decimal place.

  1. Digits: This place uses the letters I, V and X. In the Digits row above, a single I can be followed by V or X (IV and IX); otherwise, a run of up to three I's is preceded by at most one V.
    I[VX]: exactly one I followed by V or X, which matches IV and IX
    V?[I]{0,3}: an optional V followed by zero to three I's, which matches I to III and V to VIII, and also nothing at all when this place is zero
    We combine these two patterns with an 'OR' (|), since only one of them can apply at a time.
    Hence, digits = '(V?[I]{0,3}|I[VX])'
  2. Tens: This place uses the letters X, L and C. In the Tens row above, a single X can be followed by L or C (XL and XC); otherwise, a run of up to three X's is preceded by at most one L.
    X[LC]: exactly one X followed by L or C, which matches XL and XC
    L?[X]{0,3}: an optional L followed by zero to three X's, which matches X to XXX and L to LXXX, and also nothing at all when this place is zero
    We combine these two patterns with an 'OR' (|), since only one of them can apply at a time.
    Hence, tens = '(L?[X]{0,3}|X[LC])'
  3. Hundreds: Similarly, with the letters C, D and M, hundreds = '(D?[C]{0,3}|C[DM])'
  4. Thousands: We are given the range of 1 to 3999 (both included), so the thousands place can only be absent or hold 1000, 2000 or 3000 (M, MM or MMM).
    Which means thousands = 'M{0,3}'
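
As a quick sanity check, each component pattern can be tried on its own against the values it is supposed to accept and reject. The sketch below uses `re.fullmatch` so that the whole test string must match; the helper name `matches` and the sample lists are assumptions made for illustration.

```python
import re

digits = '(V?[I]{0,3}|I[VX])'
tens = '(L?[X]{0,3}|X[LC])'
hundreds = '(D?[C]{0,3}|C[DM])'
thousands = 'M{0,3}'


def matches(pattern: str, text: str) -> bool:
    """True if the whole of text matches pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None


# Every valid value for the digits place, including "" for zero.
valid_digits = ["", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"]
print(all(matches(digits, d) for d in valid_digits))  # True

# Ill-formed digit groups are rejected.
print([matches(digits, d) for d in ["IIII", "VV", "IIV", "VX"]])
# [False, False, False, False]

# The other places follow the same shape.
print(matches(tens, "XC"), matches(hundreds, "CM"), matches(thousands, "MMMM"))
# True True False
```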

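Now, combine all the patterns in order from thousands down to digits, and append '$' to match the end of the string so that nothing may follow the numeral. The start needs no anchor because re.match only matches at the beginning of the string.

regex_pattern = thousands + hundreds + tens + digits + '$'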

import re

input_string = input()

# One sub-pattern per decimal place
digits = '(V?[I]{0,3}|I[VX])'
tens = '(L?[X]{0,3}|X[LC])'
hundreds = '(D?[C]{0,3}|C[DM])'
thousands = 'M{0,3}'

# re.match anchors at the start of the string; '$' anchors at the end
regex_pattern = thousands + hundreds + tens + digits + '$'
print(bool(re.match(regex_pattern, input_string)))
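
To try the pattern on several strings without typing each one, it can be wrapped in a function. This is a minimal sketch rather than part of the submitted solution: `is_valid_roman` and `ROMAN_PATTERN` are hypothetical names, and the test strings are chosen only for illustration.

```python
import re

# The same pattern as above, written out as a single string.
ROMAN_PATTERN = 'M{0,3}(D?[C]{0,3}|C[DM])(L?[X]{0,3}|X[LC])(V?[I]{0,3}|I[VX])$'


def is_valid_roman(text: str) -> bool:
    """Hypothetical wrapper around the pattern built above."""
    return bool(re.match(ROMAN_PATTERN, text))


for candidate in ["CDXXI", "MMMCMXCIX", "MMMM", "IIII", "VX", "IC"]:
    print(candidate, is_valid_roman(candidate))
# CDXXI True
# MMMCMXCIX True
# MMMM False
# IIII False
# VX False
# IC False

# Every component may match nothing, so the empty string is accepted too.
print(is_valid_roman(""))  # True
```

The last line shows one edge case worth knowing: the pattern accepts an empty string. The constraints say the number lies between 1 and 3999, so that input presumably never appears in the tests, but a stricter variant could also require the string to be non-empty.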

If you found this article helpful, share it with your friends and colleagues. If you have any other questions, you can find me on LinkedIn.

Shefali Bisht
Geek Culture

Data Engineer who loves experimenting with different datasets and technologies to make your life easy and mine complex. https://www.shefalibisht.com/