Learning Perl via Roman Numerals

September 9, 2013

I’ve been noodling with some Perl foundations recently. I programme in Perl all of the time, but sometimes feel like I’ve gotten too used to doing things my way. As such, I’ve been spending some time going back over some Perl basics. One exercise that you keep finding on the web is the Roman Numeral conversion problem. It’s rather simple: Convert a roman numeral into decimal.

I really love this one for a couple of reasons:

  1. Most people understand roman numerals to an extent
  2. Most people don’t fully understand roman numerals, so there’s a bit of fun learning it
  3. It’s a simple input/output processing task, perfect for Perl
  4. It’s easy to test.

Making it with modules

There’s no reason not to do this with modules. Also, it makes unit testing far easier. Speaking of which, let’s create the module first, then write some tests, then start programming to satisfy the tests.

Create the directory structure:

$ mkdir -p t lib/Roman

And edit the file lib/Roman/Numeral.pm to look like this:

package Roman::Numeral;

use strict;
use Carp;

sub numeral {
    return -1; #always fail
}

sub import {
    no strict 'refs';
    my ( $package, $file, $line ) = caller;
    shift; # get rid of the package name
    my @args = @_;

    for my $method ( qw ( numeral ) ){
       *{$package . "::$method"} = \&$method
           if grep { /$method/ } @args;
    }
}

1;

This will create a package named Roman::Numeral, and (optionally) export a function called numeral into the main namespace.

Now, create the file t/001.t:

#!/usr/bin/perl
use strict;
use Test::More;
use lib 'lib';
use Roman::Numeral 'numeral';

ok( numeral( 'I'    ) == 1,  'I'    );
ok( numeral( 'II'   ) == 2,  'II'   );
ok( numeral( 'III'  ) == 3,  'III'  );
ok( numeral( 'IV'   ) == 4,  'IV'   );
ok( numeral( 'V'    ) == 5,  'V'    );
ok( numeral( 'VI'   ) == 6,  'VI'   );
ok( numeral( 'VII'  ) == 7,  'VII'  );
ok( numeral( 'VIII' ) == 8,  'VIII' );
ok( numeral( 'IX'   ) == 9,  'IX'   );
ok( numeral( 'X'    ) == 10, 'X'    );

done_testing();

and let’s make a Makefile:

test:
    PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t

Alright, if we run ‘make’ now, it’ll fail. Which is good. We’ll programme until it succeeds.

Numeral Mapping

Alright, let’s create a set/hash in our module that maps single numerals to decimals. Place this in lib/Roman/Numeral.pm above the numeral function and let’s flesh out the numeral function a bit.

my $MAP = {
    M  => 1000,
    D  => 500,
    C  => 100,
    L  => 50,
    X  => 10,
    V  => 5,
    I  => 1,
};

sub numeral {
    my ( $numeral ) = @_;

    my @numerals = split // => $numeral;

    @numerals = map { $MAP->{$_} } @numerals;

    use Data::Dumper;
    print STDERR Dumper( \@numerals ),"\n";

    return -1;
}

Running make will still fail, but it’s a start. You’ll see that it’s printing out the mapped array of decimal values provided. Now, since roman numerals are additive, we’ll sum them up. Replace the broken -1 return line with:

# Now sum them
    my $sum = 0;
    $sum += $_ for @numerals;
    return $sum;

Ok, we failed two tests. So, we’re getting there. There’s a lot of noise, so let’s quieten it down a bit with a VERBOSE check.

At the top, add:

package Roman::Numeral;

use strict;
use Carp;

my $VERBOSE = $ENV{VERBOSE} or 0;

and change the print statement to check the $VERBOSE variable:

use Data::Dumper;
print STDERR Dumper( \@numerals ),"\n"
    if $VERBOSE;

Now, if we want to see the verbose output, we type:

 $ VERBOSE=1 make

Alright, we see that we’re failing:

#   Failed test 'IV'
#   at t/001.t line 10.

#   Failed test 'IX'
#   at t/001.t line 15.

which makes sense, as they’re our special cases. Roman Numerals are annoying this way. To avoid having four consecutive numerals (IIII or VIIII) they’ve replaced them with VI and IX respectively. Let’s write a normalize routine to handle these.

sub normalize {
    my ( $numeral ) = @_;

    # the numeral I can be placed before V and X to make 4 units (IV) and 9 units (IX) respectively
    # X can be placed before L and C to make 40 (XL) and 90 (XC) respectively
    # C can be placed before D and M to make 400 (CD) and 900 (CM) according to the same pattern[5]

    while( $numeral =~ s/CM/CCCCCCCCC/g ){}
    while( $numeral =~ s/CD/CCCC/g ){}
    while( $numeral =~ s/XL/XXXX/g ){}
    while( $numeral =~ s/XC/XXXXXXXXX/g ){}
    while( $numeral =~ s/IV/IIII/g ){}
    while( $numeral =~ s/IX/VIIII/g ){}

    return $numeral;
}

What is this doing? I’ve pasted some text from the wikipedia article showing how to reduce certain numbers. We use a regular expressions with the “/g” flag within a while loop to replace each occurance with a more machine readable version. You’ll see that IV becomes IIII, IX becomes VIIII, etc. I’ve done the same for CM, CD, XL and XC.

Now, reference the normalize function within the numeral function:

my @numerals = split // => normalize( $numeral );
$ make
PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/001.t .. ok    
All tests successful.
Files=1, Tests=10,  0 wallclock secs ( 0.02 usr  0.01 sys +  0.03 cusr  0.00 csys =  0.06 CPU)
Result: PASS

Hurray! But, we can do more. We should also be validating the input. Let’s create a validation call within the normalize call to ensure that the values being passed in are valid.

sub validate {
    my ( $numeral ) = @_;

    print STDERR "Validating [$numeral]\n"
        if $VERBOSE;

    croak 'Numeral contains invalid characters'
        unless $numeral =~ /^[MCDLXVI]+$/;

    croak 'Not a valid numeral'
        unless( $numeral =~ /(M)*(CM)?(D)*(CD)?(C)*(XC)?(L)*(X)*(IX)?(V)*((IV)?I{,3})?/ );

}

sub normalize {
    my ( $numeral ) = @_;

    validate( $numeral );

Presto! We’re done.

Continued Testing

We can now start fleshing out the testing to also test the normalization and validation routines, but I’ll leave this here.

Get The Code

A copy of the code is on github, with slightly modified files. But, the meat of it is exactly the same.

Discussion, links, and tweets

Follow me on Twitter