RogerBW's Blog: Perl Weekly Challenge 132: Hash on the Mirror

I’ve been doing the Weekly Challenges. The latest involved date calculations and inner joins. (Note that this is open until 3 October 2021.)

Task 1: Mirror Dates

You are given a date (yyyy/mm/dd).

Assuming, the given date is your date of birth. Write a script to find the mirror dates of the given date.

In other words, given "then" and an implicit "now", take the time span between them, and calculate the dates for that span before "then" and after "now".

In order to get test coverage, I built my functions to accept a second date parameter to define "today"; if the year is 0 it uses the present date instead. (Also, I didn't do the breakdown of a "yyyy/mm/dd" string into (y,m,d) values.)

Clearly the fiddly bit of this is using the language's date calculation functions; I need to generate a date object from (year, month, day), calculate the span between two objects, add or subtract that span to/from an object, and get the result out as (year, month, day) again. (For whatever form "date object" may take in the particular language.)

So we start with Perl. Time::Local is a core module (since perl 5), and localtime is built in.

sub mirdat {
  my $then=shift;

Get a "date object" (in this case an integer Unixtime representation of 00:00Z on that date, which may be negative if before 1970; some OSes can't deal with that).

  my $thent=ymd2ut($then);
  my $now=shift;

If necessary, set the input to present time; then build a date object of that. Yes, double conversion, because I want to make it midnight-UT not the time of day when it's being run.

  if ($now->[0]==0) {
    $now=ut2ymd(time);
  }
  my $nowt=ymd2ut($now);

The delta will be a number of seconds. (Ignoring leap seconds because Unixtime.)

  my $delta=$nowt-$thent;

For each target time, push the formatted version onto the output list.

  my @o;
  foreach my $targett ($thent-$delta,$nowt+$delta) {
    push @o,ut2ymd($targett);
  }
  return \@o;
}

Helper functions convert [y,m,d] to unixtime (noting the month offset; and if this is given a two-digit year it'll assume windowing round the present year, with no way to turn it off)…

sub ymd2ut {
  my $ta=shift;
  my ($y,$m,$d)=@{$ta};
  return timegm(0,0,0,$d,$m-1,$y);
}

…and unixtime back to [y,m,d] (nothing both the month offset and the damnsilly year offset which I think has been the cause of more bugs I've personally encountered than any other Perl language feature).

sub ut2ymd {
  my $ut=shift;
  my @t=gmtime($ut);
  return [$t[5]+1900,$t[4]+1,$t[3]];
}

So that's what you do if you don't really have date objects. Other languages manage better. Raku doesn't seem to have durations per se, but has a native date representation, and lets me convert it into a daycount which looks quite like a Julian (Scaliger) day number.

So we build a date with

  my $thent=Date.new(@then[0],@then[1],@then[2]);

get the difference with

  my $delta=$nowt.daycount-$thent.daycount;

and build an offset with e.g.

  $thent.earlier(day => $delta)

and get out [y,m,d] as

    [$targett.year,$targett.month,$targett.day];

Python and Ruby need an import but I think this is the equivalent of a core module feature; at least it was already installed. The Ruby version is basically identical to this Python: plain and simple calculations (note "*" to unwrap the array then into a parameter list).

  thent=date(*then)

[…]

  delta=nowt-thent

[…]

  thent-delta

[…]

  [targett.year,targett.month,targett.day])

For Rust we have the chrono library.

    let thnt=Utc.ymd(thn.0,thn.1,thn.2);

[…]

    let delta=nowt-thnt;

[…]

    thnt-delta

[…]

    (targett.year(),targett.month(),targett.day())

But what of PostScript, you ask? Surely you can't do it in…

/dmy2jd {
    /d exch def
    /m exch def
    /y exch def
    /mn m 14 sub 12 idiv def
    y 4800 add mn add 1461 mul 4 idiv
    mn 12 mul neg 2 sub m add 367 mul 12 idiv add
    y 4900 add mn add 100 idiv 3 mul 4 idiv sub
    d add
    32075 sub
} def

/jd2dmy {
    /y 4716 def
    /v 3 def
    /j 1401 def
    /u 5 def
    /m 2 def
    /s 153 def
    /n 12 def
    /w 2 def
    /r 4 def
    /B 274277 def
    /p 1461 def
    /C -38 def
    dup
    4 mul B add 146097 idiv 3 mul 4 idiv C add j add add /f exch def
    r f mul v add /e exch def
    e p mod r idiv u mul w add /h exch def
    /day h s mod u idiv 1 add def
    /month h s idiv m add n mod 1 add def
    /year e p idiv y sub n m add month sub n idiv add def
    year month day
} def

so there. Formulae are off the Wikipedia page I linked above; technically they give the Julian day at GMT-noon on the relevant date, and there's probably room for optimisation since they just assume a proleptic Gregorian calendar rather than mucking about with which country changed when.

The actual function is pretty obvious, though it turns out that at least under GhostScript there's a way to get the current date (it's not in the main PostScript language spec, because after all most printers don't need to know the current date)…

(%Calendar%) currentdevparams

will give you a handy dictionary with /Year, /Month and /Day defined. (Also /Hour, /Minute, /Second and /Weekday. If this is documented anywhere I haven't found it.)

Task 2: Hash Join

Write a script to implement Hash Join algorithm as suggested by wikipedia.

This turns out to be an inner join on a single key. Which is messy; I gave up on it in Rust when I bounced off the type system (I have to use a tuple because the types may not be identical, but I can only index a tuple with literal numbers), and didn't even try it in PostScript.

OK, so h[0] and h[1] are the original sets of tuples, i[0] and i[1] the index showing which element is the key.

sub hj {
  my @h;
  my @i;
  ($h[0],$i[0],$h[1],$i[1])=@_;

We're going to stuff everything into an actual hash, with the key values as its keys; values are lists of the values. So if a given key only turns up once in each set, that's easy. Say we'll be merging [k,va1] with [k,vb1]. The hash entry for k will look like

[
  [va1],
  [vb1]
]

If I also have [k,va2] in the first set, then it'll end up as

[
  [va1,va2],
  [vb1]
]

So here's how that happens…

  my %m;
  foreach my $x (0,1) {
    my $j=1-$i[$x];

But this is where Perl becomes horrifying, because building that hash looks like this:

    foreach my $y (0..$#{$h[$x]}) {
      push @{$m{$h[$x][$y][$i[$x]]}[$x]},$h[$x][$y][$j];
    }
  }

with deep autovivification to make the structure spring into life. And that's really Not Good. I mean, I don't like it, and I wrote it.

Anyway, for each key, to do the inner join I just iterate over each of those lists. In most cases there'll be just one entry in each.

  my @o;
  foreach my $k (sort keys %m) {
    foreach my $a (@{$m{$k}[0]}) {
      foreach my $b (@{$m{$k}[1]}) {
        push @o,[$a,$k,$b];
      }
    }
  }
  return \@o;
}

For the other languages I have less autovivification, so e.g. in Raku I have something like

      unless (%m{@h[$x][$y][@i[$x]]}:exists) {
        %m{@h[$x][$y][@i[$x]]} = [Array.new,Array.new];
      }
      push @(%m{@h[$x][$y][@i[$x]]}[$x]),@h[$x][$y][$j];

And Python looks like

      if not h[x][y][i[x]] in m:
        m[h[x][y][i[x]]]=[[],[]]
      m[h[x][y][i[x]]][x].append(h[x][y][j])

with Ruby about the same… but I have to say that's not really a whole lot better.

Full code on github.

Comments on this post are now closed. If you have particular grounds for adding a late comment, comment on a more recent post quoting the URL of this one.