RogerBW's Blog

Perl Weekly Challenge 21

15 August 2019

I've been doing the Perl Weekly Challenges. This one dealt with e, and URLs again.

The first challenge was to calculate e. There's a well-known expansion, so I used it.
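For reference, the expansion is presumably the Taylor series of the exponential function evaluated at 1, which matches what the loop below accumulates term by term. The names t_n are mine, not from the code: t_n corresponds to $b after n passes, and the running sum to $e.

```latex
e = \sum_{n=0}^{\infty} \frac{1}{n!},
\qquad t_0 = 1, \quad t_n = \frac{t_{n-1}}{n} \ (n \ge 1)
```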

use Math::BigFloat try => 'GMP';

my $a=0;
my $b=Math::BigFloat->new(1);
my $e=Math::BigFloat->new(1);

while (1) {
  $a++;
  $b/=$a;
  $e+=$b;
  print "$e\n";
}
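The loop above runs until interrupted. A bounded variant could stop once the terms drop below the working precision. This is only a sketch: the helper name e_to_digits and the ten guard digits are my own assumptions, not part of the original script.

```perl
use strict;
use warnings;
use Math::BigFloat;

# Hypothetical helper: sum 1/n! until the terms drop below the working
# precision, then round the total to $digits significant digits.
sub e_to_digits {
  my ($digits)=@_;
  my $scale=$digits+10;                      # guard digits: an arbitrary margin
  my $eps=Math::BigFloat->new("1e-$scale");  # stop once a term is this small
  my $term=Math::BigFloat->new(1);
  my $e=Math::BigFloat->new(1);
  my $n=0;
  while ($term > $eps) {
    $n++;
    $term->bdiv($n,$scale);                  # term is now 1/n! to $scale digits
    $e->badd($term);
  }
  return $e->bround($digits);
}

print e_to_digits(30),"\n";
```

Passing the scale to bdiv explicitly keeps the precision local to this function rather than changing Math::BigFloat's global defaults.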

Perl6 has arbitrary precision built in: the FatRat rational type keeps an exact numerator and denominator however large they grow, where a plain Rat falls back to floating point once its denominator no longer fits in 64 bits:

my $a=0;
my $b=FatRat.new(1,1);
my $e=FatRat.new(1,1);

while (1) {
  $a++;
  $b/=$a;
  $e+=$b;
  print "$e\n";
}

I note that it runs distinctly more slowly than the Perl5 version, not surprising since it's not using the ferociously-optimised GMP library.

The other challenge was to apply URI canonicalisation, which meant revisiting challenge 17 (and slightly improving the parsing code there). I didn't have time to Perl-6-ify this one. I still haven't found a formal definition of what makes a valid URI and what doesn't; hey ho.

my $u=urlparse($url);

I wasn't going to write my own un-escaper, so:

use URI::Escape;
foreach my $mode (keys %{$u}) {
  if (exists $u->{$mode}) {
    $u->{$mode}=uri_unescape($u->{$mode});
  }
}
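Note that uri_unescape decodes every escape, including reserved characters such as %2F. If that matters, a more conservative pass is possible: decode only the unreserved characters (letters, digits, "-", ".", "_", "~") and upper-case the hex digits of whatever stays escaped. A minimal sketch, with normalise_escapes as a hypothetical name that isn't in the code above:

```perl
use strict;
use warnings;

# Hypothetical helper: decode escapes of unreserved characters only,
# and upper-case the hex digits of every escape that is kept.
sub normalise_escapes {
  my ($s)=@_;
  $s =~ s{%([0-9A-Fa-f]{2})}{
    my $hex=$1;                 # save before the inner match touches $1
    my $c=chr(hex($hex));
    $c =~ /\A[A-Za-z0-9._~-]\z/ ? $c : '%'.uc($hex);
  }ge;
  return $s;
}

print normalise_escapes('/%7Euser/a%2fb'),"\n";   # prints /~user/a%2Fb
```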

Make scheme and host lower case.

foreach my $mode (qw(scheme host)) {
  if (exists $u->{$mode}) {
    $u->{$mode}=lc($u->{$mode});
  }
}

Remove default ports (I'm sure this isn't the full set, but the first two cover the vast majority of ports in URIs anyway, and don't ask me how ftp on non-standard ports works because I don't know and don't care).

if (exists $u->{port} && exists $u->{scheme}) {
  if (my $dp={http => 80,
              https => 443,
              ftp => 21,
              smtp => 25,
              telnet => 23,
              ldap => 389,
              ldaps => 636,
            }->{$u->{scheme}}) {
    if ($dp==$u->{port}) {
      delete $u->{port};
    }
  }
}
print urlassemble($u),"\n";
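The same check can be pulled out into a small function for testing on its own. This is a sketch under assumptions: strip_default_port and %default_port are hypothetical names, and the table is a deliberately short subset.

```perl
use strict;
use warnings;

# Default ports for a few schemes (an illustrative subset, not a full registry).
my %default_port=(http => 80, https => 443, ftp => 21);

# Hypothetical helper: delete the port from a parsed-URL hash when it is
# the default for that scheme; anything else is left untouched.
sub strip_default_port {
  my ($u)=@_;
  return $u unless exists $u->{scheme} && exists $u->{port};
  my $dp=$default_port{lc $u->{scheme}};
  delete $u->{port} if defined $dp && $dp == $u->{port};
  return $u;
}

my $u=strip_default_port({scheme => 'https', host => 'example.com', port => 443});
print exists $u->{port} ? "port kept\n" : "port removed\n";   # prints "port removed"
```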

The parser has had some minor upgrades.

sub urlparse {
  my ($url)=@_;
  my %match;
  if ($url =~ m!//!) {
    $url =~ m!^(?<scheme>.*?)://(?:(?:(?<userinfo>.*)@)?(?<host>[-._a-z0-9]+)(?::(?<port>[0-9]+))?)?(?<pqf>.*)!i;
    map {$match{$_}=$+{$_}} keys %+;
  } else { # if no userinfo-host-port component, split on the last colon
    $url =~ m!^(?<scheme>.*):(?<pqf>[^:]*)!;
    map {$match{$_}=$+{$_}} keys %+;
  }
  $match{pqf} =~ m!(?<path>[^?#]*)(?:\?(?<query>[^#]*))?(?:\#(?<fragment>.*))?$!;
  map {$match{$_}=$+{$_}} keys %+;
  delete $match{pqf};
  return \%match;
}
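To see what the last regex in urlparse does in isolation, here's a small sketch of the path/query/fragment split. The wrapper name split_pqf is hypothetical; the pattern is the one used above.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the path/query/fragment pattern from urlparse.
# Only captures that actually matched end up as keys in the result.
sub split_pqf {
  my ($pqf)=@_;
  my %part;
  if ($pqf =~ m!(?<path>[^?#]*)(?:\?(?<query>[^#]*))?(?:\#(?<fragment>.*))?$!) {
    $part{$_}=$+{$_} for keys %+;
  }
  return \%part;
}

my $p=split_pqf('/a/b?x=1#top');
print join(', ', map {"$_=$p->{$_}"} sort keys %{$p}),"\n";
# prints: fragment=top, path=/a/b, query=x=1
```

Because missing pieces never become hash keys, urlassemble's exists checks can tell "no query" apart from "empty query".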

And the assembler simply puts the relevant punctuation back in.

sub urlassemble {
  my $u=shift;
  my $out=$u->{scheme}.':';
  if (exists $u->{host}) {
    $out.='//';
    if (exists $u->{userinfo}) {
      $out.=$u->{userinfo}.'@';
    }
    $out.=$u->{host};
    if (exists $u->{port}) {
      $out.=':'.$u->{port};
    }
  }
  $out.=$u->{path};
  if (exists $u->{query}) {
    $out.='?'.$u->{query};
  }
  if (exists $u->{fragment}) {
    $out.='#'.$u->{fragment};
  }
  return $out;
}
