Perl 6 small stuff #17: a weekly challenge of Big Pi’s, Bags and modules

I didn’t have the time to solve last week’s challenge, but Easter has started with slower days, so I decided to try my hand at the Perl Weekly Challenge number 4.

In my opinion, exercises 1 and 2 were not beginner vs advanced this time; both were peculiar. The first exercise was this:

Write a script to output the same number of PI digits as the size of your script. Say, if your script size is 10, it should print 3.141592653.

Thinking about this I thought that Perl 5 seemed to be easiest this time, as the Math::BigFloat package has the method bpi that returns PI to the given precision (i.e. bpi(10) returns 3.141592654). All I’d have to do was to figure out the file size of the script itself and return PI to that precision. I.e. a Perl 5 answer would look something like this:

#!/usr/bin/env perl
use v5.18;
use Math::BigFloat 'bpi';
say bpi(-s $0); # $0 holds the path of the running script

I’m uncertain as to whether the size of the script means the number of characters in the script file, or whether it is the size of the script in bytes (in a Unicode world those two aren’t necessarily identical). I choose to believe it’s the script size in bytes we’re talking about.

Perl 6 does not have, as far as I know, a built-in equivalent to bpi, so a Perl 6 answer must implement that functionality itself. But if I put that code into the script itself, the script would probably become so long that its size exceeded the number of digits of PI that an implementation like Perl 5’s bpi could return.

So this gave me an excellent opportunity to introduce modules as a part of the solution.

Script: PWC004-01.p6
Usage: perl6 -I. PWC004-01.p6
Output: 3.141592653589793238462643383279502884197169399375105820974945
#!/usr/bin/env perl6
use BigPI;
say BigPI::pi $?FILE.IO.s;

A couple of fun things about this: $?FILE refers to the script file itself. If you prefer a more self-explanatory variant, $*PROGRAM-NAME can be used instead of $?FILE.

.IO.s returns the size of the file in bytes and, as mentioned above, seems to be what the exercise calls for. However, see note [1] below if you’d rather have the size be the number of characters instead. And if you’ve mixed Unicode into your Perl 5 script, see [2] for some Perl 5 specific notes.

Anyway, the module referenced here is stored in a separate file, and looks like this.

Module: BigPI.pm6
# Place in script directory and use perl6 with -I flag
# i.e. perl6 -I. <calling script>
unit module BigPI;
# This definition of PI is borrowed from Perl 5's Math::BigFloat
constant PI = join '', qw:to/END/;
314159265358979323846264338327950288419716939937510582097494459230781640628
620899862803482534211706798214808651328230664709384460955058223172535940812
848111745028410270193852110555964462294895493038196442881097566593344612847
564823378678316527120190914564856692346034861045432664821339360726024914127
372458700660631558817488152092096282925409171536436789259036001133053054882
046652138414695194151160943305727036575959195309218611738193261179310511854
807446237996274956735188575272489122793818301194912983367336244065664308602
139494639522473719070217986094370277053921717629317675238467481846766940513
200056812714526356082778577134275778960917363717872146844090122495343014654
958537105079227968925892354201995611212902196086403441815981362977477130996
051870721134999999837297804995105973173281609631859502445945534690830264252
230825334468503526193118817101000313783875288658753320838142061717766914730
359825349042875546873115956286388235378759375195778185778053217122680661300
192787661119590921642019893809525720106548586327886593615338182796823030195
END
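our sub pi(Int $precision where * <= PI.chars = 10) {
    # Keep $precision - 2 digits verbatim, then round the last digit from a two-digit window.
    # Caveat: a window of 95 or more rounds to 10 and no carry is done (e.g. precision 13).
    return "3." ~ PI.substr(1, $precision - 2) ~ round(PI.substr($precision - 1, 2) / 10);
}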

The first line, unit module BigPI;, tells the interpreter that everything that follows is a part of this single module. If I had wanted to put several modules into one file, I’d define them with module NAME { …content… }.

To avoid collision with the built-in pi, I used the our sub statement. This means that you have to refer to the sub using its full name, i.e. BigPI::pi. I could have chosen to export pi instead (is export), which would have let you refer to pi directly. But I prefer the other version to avoid namespace collisions.
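
To contrast the two styles, here is a minimal sketch. The inline module Demo and both of its subs are hypothetical names made up for this illustration; they are not part of BigPI.

```raku
# Hypothetical inline module, used only to contrast "our sub" with "is export".
module Demo {
    # Package-scoped: reachable from outside only via the full name.
    our sub qualified() { "called as Demo::qualified" }
    # Exported: becomes available under its short name after import.
    sub exported() is export { "called as exported" }
}
import Demo;

say Demo::qualified();   # called as Demo::qualified
say exported();          # called as exported
```

With a module stored in its own file, use does the importing, so the explicit import line is only needed for the inline variant shown here.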

This tiny module also showcases a couple of other Perl 6 specific things. One, I use Perl 6’s built-in mechanics for defining constraints within the subroutine signature. In this case I tell the Perl 6 compiler not to allow a precision higher than the number of digits in the PI constant (where * <= PI.chars). If you call the sub with a higher precision, an error will be thrown. In addition I define a default precision of 10, should you want to call BigPI::pi without an argument (= 10 at the end of the signature).
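
The constraint, the default and the substr arithmetic can be tried out in isolation. This is only a sketch: DIGITS and pi-sketch are hypothetical stand-ins for the module’s constant and sub, and DIGITS holds just the first 20 digits.

```raku
# Hypothetical stand-in for the module's PI constant: only the first 20 digits.
constant DIGITS = '31415926535897932384';

sub pi-sketch(Int $precision where * <= DIGITS.chars = 10) {
    # Digits kept verbatim after the leading 3.
    my $kept   = DIGITS.substr(1, $precision - 2);
    # Two-digit window: the last digit to print plus the digit after it.
    my $window = DIGITS.substr($precision - 1, 2);
    return "3." ~ $kept ~ round($window / 10);
}

say pi-sketch();     # 3.141592654  (default precision of 10)
say pi-sketch(5);    # 3.1416       (window "59" rounds to 6)

# A precision above DIGITS.chars fails the where clause at call time.
my $too-many = 50;
try {
    pi-sketch($too-many);
    CATCH { default { say "rejected: ", .^name } }
}
```

For precision 5 the sub keeps "141" and rounds the window "59" (5.9) to 6, which is how the last digit ends up rounded rather than truncated.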


The second exercise was this:

You are given a file containing a list of words (case insensitive 1 word per line) and a list of letters. Print each word from the file that can be made using only letters from the list. You can use each letter only once (though there can be duplicates and you can use each of them once), you don’t have to use all the letters.

I have to admit I had to read this several times before understanding it, and I have a feeling that I may have misunderstood it even now. But I read it like this: Every word in the file is to be checked against a list of letters. That list can contain duplicates. If the word “abba” is checked against a list <a a a a b b b c d d e>, two a’s and two b’s will be removed from the list, so that what’s left is <a a b c d d e> before checking that against the next word. If the next word is “be”, the list is reduced to <a a c d d>. Had the word been “bee”, however, there would be no match and the list would still be <a a b c d d e> when checking the next word.

#!/usr/bin/env perl6
my $letters = ( gather for 1..500 { take ('a'..'z').pick } ).BagHash;
say ([+] $letters.values) ~ " letters matches...";
for "random-2000.dict".IO.lines.sort({ rand }) -> $word {
    my $wbh = $word.lc.comb.BagHash;
    if ($wbh (<=) $letters) {
        $letters = $letters (-) $wbh;
        say "\t" ~ $word;
    }
}
say ([+] $letters.values) ~ " letters remain.";

This script assumes there is a file called “random-2000.dict” in the working directory, containing a list of words (I have a list of 2000 random words in that file). It also assumes, perhaps naively, that the words are made of the letters a to z, i.e. standard English.

I start the script by generating a list of 500 characters, a random selection of a to z’s. You’ve seen me use gather and take before, so I won’t explain that here again. Rather, I’d like to point out that I convert the list to a BagHash immediately. A BagHash is a short and simple version of stuff you’ve programmed manually in Perl 5 before. The letters a to z are keys of this hash, and the value for each key is the number of occurrences of that letter (if the value is zero, the key is removed).
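
As an aside, the same kind of random bag can probably be built without gather and take. A minimal sketch, assuming that picking with replacement via roll is an acceptable substitute for calling pick 500 times:

```raku
# roll(500) draws 500 letters with replacement, so duplicates are expected.
my $letters = ('a'..'z').roll(500).BagHash;

say $letters.total;        # 500: the sum of all the counts
say $letters.keys.elems;   # number of distinct letters, at most 26
say $letters<e>;           # how many e's were drawn (varies from run to run)
```

Looking up a letter that was never drawn returns 0 rather than failing, which matches the remark about zero-valued keys being removed.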

I loop through a randomly sorted list of words and convert each word into a BagHash of its own. Now I can start to use some really useful infixes that work on BagHashes. In the statement if ($wbh (<=) $letters) the infix (<=) compares the word’s BagHash with the big bag of letters. If the word is contained by or equal to the bag of letters, we have a matching word and can print it to the screen. At the same time I remove the letters of the word from the bag of letters by using the infix (-), so that a word <a b b a> removed from the list of letters <a a a a b b b c d d e> reduces that list to <a a b c d d e> before testing the next word. The infixes save us from writing a lot of code here!

I’m glad the exercise stated that I didn’t have to use all of the letters, because I never seem to be able to. Every time I run this script a couple of hundred letters remain.
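
The “abba” example can be replayed on its own. This is only a sketch; it uses immutable Bags instead of BagHashes, which should behave the same way for these two infixes.

```raku
# The bag of available letters and the word from the example above.
my $letters = <a a a a b b b c d d e>.Bag;
my $word    = "abba".comb.Bag;

say $word (<=) $letters;             # True: the word fits inside the bag
$letters = $letters (-) $word;       # remove the letters that were used
say $letters.kxxv.sort.join(' ');    # a a b c d d e

say "bee".comb.Bag (<=) $letters;    # False: only one e is left
say "be".comb.Bag (<=) $letters;     # True
```

kxxv expands the bag back into a flat list with every key repeated by its count, which makes the remaining letters easy to inspect.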

You should also note the use of the reduction metaoperator [+] in this code. Used as a prefix on a list or an array, it sums all of the elements of that list. It’s not strictly necessary to use it here; I use it only to output some information about the list of letters before and after the run.
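
A small sketch of what [+] does, next to the total method, which as far as I can tell gives the same number for a bag:

```raku
my $letters = <a a b c d d e>.BagHash;

say [+] 1, 2, 3, 4;         # 10
say [+] $letters.values;    # 7: the counts 2 + 1 + 1 + 2 + 1
say $letters.total;         # 7
```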

Here is the output of the script when I run it:

500 letters matches
ingrainedly
regarding
trichinopoly
solenoidal
prophloem
beggary
paravertebral
tapinophobia
awarder
isocratic
mucic
tox
handwheel
suasible
skiptail
hippoboscid
colopexy
undeviating
onetime
stiped
housekeep
cocci
adjudgment
genys
Boche
fletch
nunky
l
huck
whuff
smug
hut
Lulu
259 letters remain.

These were great challenges. As simple as they may seem at first glance, they got me to use several Perl 6 specific techniques. It was a great way to “hone my skills” so to speak.


NOTES

[1] If size is meant as “the size of the file in bytes”, “filename”.IO.s returns exactly what you want. But as mentioned above, files with Unicode characters in them can potentially have several times more bytes than characters.

A file containing a single character — a — is one byte long. So here the number of bytes and characters are equal. But another file containing this single character — 🕹 — is four bytes long.
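
The difference is easy to check on strings instead of files. A minimal sketch, assuming the text is encoded as UTF-8:

```raku
for 'a', '🕹' -> $char {
    # chars counts characters; encoding to UTF-8 first reveals the byte count.
    say $char ~ ' has ' ~ $char.chars ~ ' character(s) and '
        ~ $char.encode('UTF-8').bytes ~ ' byte(s)';
}
# a has 1 character(s) and 1 byte(s)
# 🕹 has 1 character(s) and 4 byte(s)
```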

So which is it? What represents “size” most correctly? I’m not sure. I guess bytes is the safe bet. But should you want to count characters and use that number for PI precision, do this instead:

say BigPI::pi chars slurp $?FILE;

There’s more than one way to do it. And interpret it I guess.

[2] If you’re programming Perl 5, you’ve got similar issues. -s "filename" returns the size of a file in bytes while length($variable) returns the number of characters. Or so you would think. But consider this script:

#!/usr/bin/perl

use v5.18;

$a = "🕹";
$b = "j";

say length($a);
say length($b);

One would think that both statements print 1. But in this case the first say prints 4 and the second prints 1. It’s as if length returns bytes and not characters. Why?

Perl 5 has had some level of Unicode support since v5.6 from March 2000 (though, I’m sorry to say, it still feels like an afterthought even in the latest development versions 19 years later). You can turn much of the Unicode support on by, for instance, calling the perl interpreter with the -C flag. But you’ll have to tell the interpreter even more explicitly if your script file itself includes UTF-8 text (Perl 5 assumes no UTF-8 in the source by default).

The above script does include UTF-8, so add a use utf8; at the start of the script to make perl interpret UTF-8 characters in the source as characters, not bytes. Now both say statements will print 1 as expected. Rather unexpectedly, perhaps, you can even use UTF-8 letter characters in subroutine and scalar names once you’ve added the use statement at the beginning.

#!/usr/bin/perl

use v5.18;
use utf8;

my $a = "🕹";
my $b = "j";
my $珠 = "perl utf8"; # Throws an error without 'use utf8'

say length($a);
say length($b);
say $珠;

Not quite Perl 6 level yet, but it’s getting there.