Coding the Impossible: Palindrome Detector with a Regular Expressions

Tony Tonev
Feb 6, 2020 · 6 min read
Image for post
Image for post
Photo by Andre Mouton on Unsplash

I came across a Stack Overflow question which asks how to check if a string is a palindrome using regular expressions. The top answer with 147 upvotes points out that it is impossible, so there’s no point to even try. Well, technically he’s right for arbitrary-length palindromes, but that doesn’t mean we can’t make one for palindromes up to a maximum length. As a side note, there are much easier ways to check for palindromes, such as to reverse the string and compare it to the original. So why bother? For the mental challenge and to work out my regex skills. I wanted it to work on my favorite palindrome: “A man, a plan, a canal, Panama!” so I wrote a regular expression which detects palindromes up to 22 characters ignoring tabs, spaces, commas, and quotes. I use JavaScript style regex.

Here’s the solution

\b(\w)[ \t,'"]*(?:(\w)[ \t,'"]*(?:(\w)[ \t,'"]*(?:(\w)[ \t,'"]*(?:(\w)[ \t,'"]*(?:(\w)[ \t,'"]*(?:(\w)[ \t,'"]*(?:(\w)[ \t,'"]*(?:(\w)[ \t,'"]*(?:(\w)[ \t,'"]*(?:(\w)[ \t,'"]*\11?[ \t,'"]*\10|\10?)[ \t,'"]*\9|\9?)[ \t,'"]*\8|\8?)[ \t,'"]*\7|\7?)[ \t,'"]*\6|\6?)[ \t,'"]*\5|\5?)[ \t,'"]*\4|\4?)[ \t,'"]*\3|\3?)[ \t,'"]*\2|\2?))?[ \t,'"]*\1\b

You’re probably thinking that either a cat walked on my keyboard or this is the secret incantation to summon Cthulhu in some long forgotten tongue. Nope, it’s actually a valid regex. Feel free to check it using regexr.com, which is an awesome tool to instantly see the results of your regex’s and explains what each part does. I’ll walk you through how I got to it so it makes sense.

First things first, we need to detect if a character is repeated.

(.)\1
Image for post
Image for post
The text matching the regex is highlighted.

Easy enough. Two characters repeated? The new part is in bold.

Old: (.)     \1
New: (.)(.)\2\1
Image for post
Image for post

OK, this only works for palindromes that are exactly 4 characters long. We can make it work with 3 characters palindromes if we make the second look-back optional.

Old: (.)(.)\2 \1
New: (.)(.)\2?\1

Cool, now we can make it work for 2 character strings by making the inner part optional. The (?: … ) creates a non-capturing group.

Old: (.)   (.)\2?  \1
New: (.)(?:(.)\2?)?\1
Image for post
Image for post

We’re on a roll, let’s make it longer.

Old: (.)(?:(.)           \2?)?\1
New: (.)(?:(.)(?:(.)\3?)?\2)?\1
Image for post
Image for post

Uh oh… We managed to detect 5 and 6 character strings, but we lost 3 character strings. The problem is \2 is not optional, and so for example, with aba when it gets to the 3rd char, it’s looking for a second b char, and not finding it. This is still a valid palindrome since the b is exactly in the middle of the string. All odd length palindromes will have a single, non-repeated character in the very middle. We can fix it. Let’s make the \2 optional!

Old: (.)(?:(.)(?:(.)\3?)?\2 )?\1
New: (.)(?:(.)(?:(.)\3?)?\2?)?\1
Image for post
Image for post

Yay! Wait… now abca is a false positive 🤦. Now here is the trickiest bit of all. If you’re up for a challenge, take a moment to try to figure out how you would fix it before scrolling down to see the answer.

Image for post
Image for post
Photo by Juan Rumimpunu on Unsplash

We really wish we could use an if statement to replace \2 with \2? only if there is no \3, but regex has no if statements… BUT, it does have an OR operator which we can use to get the same result.

Old: (.)(?:(.)(?:(.)\3?)? \2? )?\1
New: (.)(?:(.)(?:(.)\3?\2|\2?))?\1
Image for post
Image for post

Jackpot! It’s a subtle change that makes all the difference. On one side of the OR I have (.)\3?\2 which means “if there’s a new inner char (or two of the same), we must repeat group two” and on the other side we have \2? which means “if there is no new inner char, optionally repeat the second group.” The second case indicates that we are currently in the very middle of the palindrome, so it’s OK to have only one of the middle char, or there could be two. That’s why we need the ?. Otherwise, group 2 must be present for it to be a valid palindrome. Can we keep going? Yup!

Old: (.)(?:(.)(?:(.)            \3? \2|\2?))?\1
New: (.)(?:(.)(?:(.)(?:(.)\4?\3|\3?)\2|\2?))?\1
Image for post
Image for post

Nice! I simply replaced \3? with (?:(.)\4?\3|\3?), which is the same logic as before. Now we can keep replacing the inner group to make a regex palindrome detector of any length! The only catch is the length of regex will also keep growing, since we can’t write a loop inside a regex. Let’s make one big enough to detect my target sentence.

(.)(?:(.)(?:(.)(?:(.)(?:(.)(?:(.)(?:(.)(?:(.)(?:(.)(?:(.)(?:(.)\11?\10|\10?)\9|\9?)\8|\8?)\7|\7?)\6|\6?)\5|\5?)\4|\4?)\3|\3?)\2|\2?))?\1

Even though whitespace is not ignored in regex, I’ve added new lines and indentation to make this mess a little easier to read, but remember the top expression is the one that works.

(.)(?:
(.)(?:
(.)(?:
(.)(?:
(.)(?:
(.)(?:
(.)(?:
(.)(?:
(.)(?:
(.)(?:
(.)\11?\10|\10?)
\9|\9?)
\8|\8?)
\7|\7?)
\6|\6?)
\5|\5?)
\4|\4?)
\3|\3?)
\2|\2?))?
\1
Image for post
Image for post

It works! Now the hard part is done. For a victory lap, let’s make the palindrome character only alphanumeric and independent words. I’m using the case-insensitive flag.

\b(\w)(?:(\w)(?:(\w)(?:(\w)(?:(\w)(?:(\w)(?:(\w)(?:(\w)(?:(\w)(?:(\w)(?:(\w)\11?\10|\10?)\9|\9?)\8|\8?)\7|\7?)\6|\6?)\5|\5?)\4|\4?)\3|\3?)\2|\2?))?\1\b
Image for post
Image for post

I replaced every . with \w and I wrapped the expression in \b so inner parts of words will be excluded, like in Panama. To support phrases let’s make it ignore spaces, tabs, commas, and quotes.

\b(\w)[ \t,'"]*(?:(\w)[ \t,'"]*(?:(\w)[ \t,'"]*(?:(\w)[ \t,'"]*(?:(\w)[ \t,'"]*(?:(\w)[ \t,'"]*(?:(\w)[ \t,'"]*(?:(\w)[ \t,'"]*(?:(\w)[ \t,'"]*(?:(\w)[ \t,'"]*(?:(\w)[ \t,'"]*\11?[ \t,'"]*\10|\10?)[ \t,'"]*\9|\9?)[ \t,'"]*\8|\8?)[ \t,'"]*\7|\7?)[ \t,'"]*\6|\6?)[ \t,'"]*\5|\5?)[ \t,'"]*\4|\4?)[ \t,'"]*\3|\3?)[ \t,'"]*\2|\2?))?[ \t,'"]*\1\b
Image for post
Image for post

I simply inserted a bunch of [ \t,’”]* expressions between every char outside of the capture groups, so those characters can exist in the string without needing to be mirrored on the other side.

And there we have it! Mission accomplished. They said it was impossible! … and they were right, it is impossible for a finite regular expression to detect palindromes of arbitrary length, but we can still make a limited one for fun!

Your homework assignment is to write a function that takes a length, and outputs a regex that can detect palindromes up to that length.

Analytics Vidhya

Analytics Vidhya is a community of Analytics and Data…

Sign up for Analytics Vidhya News Bytes

By Analytics Vidhya

Latest news from Analytics Vidhya on our Hackathons and some of our best articles! Take a look.

By signing up, you will create a Medium account if you don’t already have one. Review our Privacy Policy for more information about our privacy practices.

Check your inbox
Medium sent you an email at to complete your subscription.

Tony Tonev

Written by

http://tonythinks.com

Analytics Vidhya

Analytics Vidhya is a community of Analytics and Data Science professionals. We are building the next-gen data science ecosystem https://www.analyticsvidhya.com

Tony Tonev

Written by

http://tonythinks.com

Analytics Vidhya

Analytics Vidhya is a community of Analytics and Data Science professionals. We are building the next-gen data science ecosystem https://www.analyticsvidhya.com

Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Here, expert and undiscovered voices alike dive into the heart of any topic and bring new ideas to the surface. Learn more

Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox. Explore

If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. It’s easy and free to post your thinking on any topic. Write on Medium

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store