Regex UCR: Generating different Regex dialects

How to keep Regex UCR’s UI simple, while supporting pretty much all the features Regex itself supports?

You might want to read Part 1 and Part 2 before this to understand what this is about.

Let’s get started with the simplest possible editing view.

Regex UCR has a default dialect called Simplified Regex, which is the lowest common denominator of all regex dialects.

When there is only stuff that works in all languages, we do not remind the user of the existence of dialects at all, allowing the user to concentrate on the task at hand.

Editing existing regex

Ok, let’s say you already have a regex you want to edit. Say, you are identifying a decimal number in Python. Something like -3.53e4, a pretty small number.
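To make the example concrete, here is a minimal sketch of what such a Python regex could look like. The pattern and its group names (sign, integer, fraction, exponent) are made up for illustration; they are not necessarily the regex used in the mockups.

```python
import re

# Hypothetical pattern for a decimal number, written with Python-style
# named groups. The group names are assumptions made for this sketch.
DECIMAL = re.compile(
    r"(?P<sign>[-+]?)"                   # optional sign
    r"(?P<integer>\d+)"                  # digits before the decimal point
    r"(?:\.(?P<fraction>\d+))?"          # optional fractional part
    r"(?:[eE](?P<exponent>[-+]?\d+))?"   # optional exponent
)

match = DECIMAL.fullmatch("-3.53e4")
parts = match.groupdict()
print(parts)
```

The named groups are what make this Python-flavoured: each part of the number can be read back by name instead of by position.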

Warning when selected dialect doesn’t support all features used

You paste it into the regex (text) area of Regex UCR. Your expression gets evaluated and translated automatically into the visual regex syntax.

We can’t handle named groups in the generic Simplified Regex, so the editor

  • highlights unrecognized parts, and
  • asks you which dialect you would like to use now.
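The highlighting step can be sketched roughly like this. It is only an illustration under simple assumptions, not how Regex UCR is actually implemented: the function name find_unsupported_spans is hypothetical, and it only looks for Python-style named groups and named backreferences.

```python
import re

# Python-style named-group syntax: (?P<name>...) and the backreference (?P=name).
# This is a plain text scan, so it can be fooled by an escaped parenthesis
# such as \(?P<x>. A real editor would work on a parsed regex instead.
PYTHON_NAMED_GROUP = re.compile(r"\(\?P<[A-Za-z_]\w*>|\(\?P=[A-Za-z_]\w*\)")


def find_unsupported_spans(pattern):
    """Return (start, end) offsets of constructs the simple dialect lacks."""
    return [m.span() for m in PYTHON_NAMED_GROUP.finditer(pattern)]


pasted = r"(?P<sign>[-+]?)\d+"
spans = find_unsupported_spans(pasted)
needs_dialect_choice = bool(spans)  # if True, show the warning and the dropdown
```

The returned offsets are what the editor would highlight; an empty list means the regex stays in the default dialect and no warning is shown.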

So you select Python in the dropdown.

Incompatibility situation resolved

Ok, now Regex UCR is in Python regex mode. The warning is removed from the UI.

You can now edit either the text or the visual representation, and the change gets reflected in the other.

Debugging your regex

Now you want to verify that your regex actually matches everything you think it does, and nothing that it shouldn’t. How would you go about doing that?

There is a button on a toolbar that says “Test”. You click on it.

There is a field to enter some sample text. You can also see a button that says “generate random text that matches”. So you click it.

The app generates matches for the regex. In many cases you can see already from this that this isn’t quite what you needed, without having to actually think of different samples yourself. Because producing samples is boring.
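Here is a rough sketch of how random matching text could be produced. It assumes the regex has already been parsed into a small tree; the PARTS structure below is hand-built for the pattern [-+]?\d+(\.\d+)? and is purely hypothetical. Unbounded repeats are capped at a small number so the samples stay readable.

```python
import random
import re
import string


def generate(parts, rng):
    """Build one random string from a list of (kind, body, min, max) parts."""
    out = []
    for kind, body, lo, hi in parts:
        for _ in range(rng.randint(lo, hi)):
            if kind == "chars":
                out.append(rng.choice(body))    # pick one allowed character
            else:                               # "group": nested list of parts
                out.append(generate(body, rng))
    return "".join(out)


PATTERN = r"[-+]?\d+(\.\d+)?"

# Hand-built equivalent of PATTERN (an assumption for this sketch).
PARTS = [
    ("chars", "-+", 0, 1),                       # [-+]?
    ("chars", string.digits, 1, 4),              # \d+  capped at 4 digits
    ("group", [("chars", ".", 1, 1),
               ("chars", string.digits, 1, 3)], 0, 1),  # (\.\d+)?
]

rng = random.Random(0)
samples = [generate(PARTS, rng) for _ in range(5)]

# Every generated sample should match the original regex.
assert all(re.fullmatch(PATTERN, s) for s in samples)
```

Checking each sample against the real regex, as the last line does, is a cheap safeguard against the generator and the regex drifting apart.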

Thanks to Byron Houwens and Sam Sullivan for the discussion that led to the design for multiple dialects.

If making more detailed mockups of how a great regex debugger would work, or even implementing that or the above, sounds like your thing, please join us!

Contact me to join our Slack channel if you want to work together on this. Our GitHub repository is also now open. So far we have a React-based skeleton UI built. We warmly welcome any help bridging the gap from design to code.

See also: Part 4: Plans forward — licensing and architecture first steps

Main stories about Regex UCR: Part 1, Part 2
