Perl small stuff #8: An error in Range? More about the “AAA”..“ABS” strangeness, and a proposed solution

My last small stuff article was about a peculiar difference between how Perl 5 and Perl 6 generate ranges, more specifically how the range “AAAAAAA”..“ABRAXAS” generates 19 million+ elements in Perl 5, whereas Perl 6 generates a modest 16,416. With the help of my readers I figured out that Perl 5 treats the range as a base 26 number (i.e. a number system where A = 0 and Z = 25). Perl 6, however, creates a range by “counting” from right to left, iterating each character in “AAAAAAA” from A to whatever character in “ABRAXAS” it matches, i.e. “A[A-B][A-R]A[A-X]A[A-S]”, starting at the rightmost A and working its way leftwards.
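To double-check the two element counts independently of either Perl, here is a minimal sketch in Python (so not Perl 5 or Perl 6 code) that models the two counting rules as I have described them. The function names are my own, and the models assume uppercase A to Z strings of equal length; they say nothing about how either implementation works internally.

```python
from math import prod

def base26(s):
    """Read an uppercase string as a base 26 number (A = 0 ... Z = 25)."""
    n = 0
    for ch in s:
        n = n * 26 + (ord(ch) - ord("A"))
    return n

def count_base26(lo, hi):
    """Model of the Perl 5 behaviour: every value from lo to hi, inclusive."""
    return base26(hi) - base26(lo) + 1

def count_per_position(lo, hi):
    """Model of the Perl 6 behaviour: each position runs independently
    from its character in lo to its character in hi."""
    return prod(ord(h) - ord(l) + 1 for l, h in zip(lo, hi))

print(count_base26("AAAAAAA", "ABRAXAS"))        # 19665535
print(count_per_position("AAAAAAA", "ABRAXAS"))  # 16416
```

The second number is simply 1 × 2 × 18 × 1 × 24 × 1 × 19, one factor per character position.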

I think what follows first and foremost exposes that I haven’t understood how the smartmatch operator actually works. But the spirit of this blog is to also showcase my misunderstandings, so here we go.

In the aftermath of that, I had a discussion with Ali Elshishini, where I created a small snippet of code using the smartmatch operator in Perl 6. We were trying to use smartmatch to check whether AAAAOYM belonged to the range. It doesn’t, but Perl 6 said it did. What gives?

This is easier to understand when you see it:

$ perl6 -e 'say ("AAAAAAA".."ABRAXAS").grep("AAAAOYM"); say "AAAAOYM" ~~ "AAAAAAA".."ABRAXAS";'
()    # result of the grep: AAAAOYM is not in the range
True # ...but the smartmatch operator says it is

There is definitely an inconsistency here. For all I know it’s on purpose, but if it is, it’d sure be interesting to know why.

But strangely… if you convert the range to an array, the inconsistency goes away.

$ perl6 -e 'my @a = "AAAAAAA".."ABRAXAS"; say @a.grep("AAAAOYM");'
()     # result of the grep: AAAAOYM is not in the range

My gut feeling is that the smartmatch operator used on a range should work approximately the same way that .grep does (@loltimo pointed out to me on Twitter that, used on arrays and lists, the smartmatch operator doesn’t look for membership but equivalence).

If so, the problem lies in the Range class itself. I’m not an expert in the inner workings of the Rakudo Perl 6 code. But could it be that the problem lies in lines 378–381 of the Range class source code (version from August 25, 2018)? Here is what it says:

multi method ACCEPTS(Range:D: Mu \topic) {
    (topic cmp $!min) > -(!$!excludes-min)
      and (topic cmp $!max) < +(!$!excludes-max)
}

Seeing this, I think the whole debacle has to do with the cmp operator. Not the operator itself, but how it’s used. The cmp operator does, I believe, an alphabetic comparison. In the above method it’s used to compare topic with the minimum and maximum values of the range.

To simplify the whole thing, let’s say we were comparing a string to the range “AAAA”..“AXAS”.

AOYM is not a part of that range. But if we, like the method above, use cmp to compare it against the minimum and maximum values of the range, we’d actually believe it is:

$ perl6 -e 'say "AOYM" cmp "AAAA"; say "AOYM" cmp "AXAS";'
More   # AOYM sorts after AAAA
Less   # ...and before AXAS

Viewed in isolation, the answer is that AOYM is less than AXAS and more than AAAA, i.e. that AOYM is within the range. But as we know, in the way that Perl 6 generates a range, AOYM is not a part of it (in a way, ~~ treats the range as if it were Perl 5’s range). For the smartmatch operator to work as expected, the method would have to check “AOYM” for equality against every single element in the range.
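The mismatch can be reproduced outside Perl 6 as well. The sketch below is a Python model, not Rakudo code, and all helper names are made up for the illustration. It assumes the range really is generated column by column as described earlier, and that Python's string comparison is a fair stand-in for cmp on these uppercase strings. The last helper is one possible way to test membership without comparing against every element, under those same assumptions.

```python
from itertools import product

def per_position_range(lo, hi):
    """Enumerate the range column by column, rightmost column varying fastest."""
    columns = [[chr(c) for c in range(ord(l), ord(h) + 1)]
               for l, h in zip(lo, hi)]
    return ["".join(chars) for chars in product(*columns)]

def bounds_check(topic, lo, hi):
    """Model of the ACCEPTS logic for an inclusive range: lo <= topic <= hi."""
    return lo <= topic <= hi

def per_position_check(topic, lo, hi):
    """Hypothetical alternative: every character must lie within its own column."""
    return (len(topic) == len(lo) == len(hi)
            and all(l <= t <= h for t, l, h in zip(topic, lo, hi)))

elements = per_position_range("AAAA", "AXAS")
print(len(elements))                               # 456
print(bounds_check("AOYM", "AAAA", "AXAS"))        # True
print("AOYM" in elements)                          # False
print(per_position_check("AOYM", "AAAA", "AXAS"))  # False
```

In this model the bounds check and the generated elements disagree about AOYM, while the per-position check agrees with the generated elements: the third character, Y, falls outside its column, which only contains A.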

So what to do? Convert to array/list and use grep.

Conclusion? As obscure as this problem is, I guess it won’t pop up on anyone’s radar soon. But hopefully this is interesting for one or two people other than me out there :-)

Like what you read? Give Jo Christian Oterhals a round of applause.
