RogerBW's Blog

Perl Weekly Challenge 111: Search Letters 05 May 2021

I’ve been doing the Perl Weekly Challenges. The latest involved search optimisations and word sorting. (Note that this is open until 9 May 2021.)

TASK #1 › Search Matrix

You are given 5x5 matrix filled with integers such that each row is sorted from left to right and the first integer of each row is greater than the last integer of the previous row.

Write a script to find a given integer in the matrix using an efficient search algorithm.

The optimisations are fairly clear: the target can only be in a row with first value <= target and last value >= target. If we get to a row that doesn't qualify on the first count, no future row will qualify either. (In the Raku version I did try simply flattening the input matrix and checking whether the search term was in it, to see if this wasn't premature optimisation, but that slowed things down by about 2.5×.)

sub sm {
  my ($matrix,$search)=@_;
  my $f=0;
  foreach my $row (@{$matrix}) {

Check for row validity:

    if ($row->[0] <= $search) {
      if ($row->[-1] >= $search) {

Then if the row is a potential candidate, shift it into a hash and check for key existence. I could have done a chop search on the individual row but at this size it doesn't seem worth it.

        if (exists {map {$_ => 1} @{$row}}->{$search}) {
          $f=1;
        }
        last;
      }
    } else {
      last;
    }
  }
  return $f;
}

Raku is much the same except that I don't need the hash; I can just use

        if ($search ∈ @row) {

and other languages are similar (Python if search in row, Ruby if row.include?(search), Rust if row.contains(&search)).

Python gets awkward because of the fall-through logic: if the current search row was valid based on first and last numbers, it's the only valid row in the list, so after we've tested for the presence of the target we want to drop out of the loop. But because I have no end-block indication apart from indenting, I find it quite unclear just what's going on.

    if row[0] <= search:
      if row[len(row)-1] >= search:
        if search in row:
          f=1
        break
    else:
      break

TASK #2 › Ordered Letters

Given a word, you can sort its letters alphabetically (case insensitive). For example, “beekeeper” becomes “beeeeekpr” and “dictionary” becomes “acdiinorty”.

Write a script to find the longest English words that don’t change when their letters are sorted.

The obvious approach is to take each word in a dictionary file, sort its letters, and compare with the original word. Some optimisation is possible by only examining words at least as long as the longest we've already got. Raku:

my $ml=0;
my @r;
for lines() {
  my $l=chars($_);
  if ($l >= $ml) {
    if ($_.comb.sort.join('') eq $_) {
      if ($l > $ml) {
        @r=();
        $ml=$l;
      }
      push @r,$_;
    }
  }
}

say $_ for @r;

and Perl and Ruby are similar. Note the latching system: if max is higher than current, clear the list and set max to current. Then, whether or not it was higher, push the new entry onto the list. This ends up with a list containing only the entries with the highest value for max, and is a pattern I find myself using quite a bit.

In Python and Rust I used a temporary array instead of re-joining the string, in Rust because there's no single operation to produce a new sorted vector from a vector, in Python because of the whole prefix-suffix ugliness that forces operations out of order:

            let ll: Vec<char>=line.chars().collect::<Vec<_>>();
            let mut ls=ll.clone();
            ls.sort();
            if ll == ls {

With OSPD2+ I get:

  • beefily (in a beefy manner)
  • billowy (swelling or swollen into large waves)

and SOWPODS adds:

  • addeems (adjudges, tries, tests)
  • chikors (partridges native to central Asia, Alectoris chukar)
  • dikkops (birds of the family Burhinidae)
  • gimmors (pieces of mechanism; mechanical devices or contrivances)

Full code on github.


  1. Posted by RogerBW at 01:34pm on 10 May 2021

    Part 1: some people did the inner search explicitly, but I suspect that using a language built-in (i.e. for everything except Perl) is probably faster. Haven't tested it, though. Given the size of the problem space, it's very hard to get meaningful benchmarks and I didn't bother.

    Possibly grep would have made more sense in Perl than the hash inversion, but I like hash inversions.

    Part 2: one could do this in parallel, but I think one would lose some of the efficiency gained by not sorting words which won't meet the length threshold. (But some answers didn't make this optimisation anyway.) Might give it a try some time when I fancy learning parallel coding in these various languages.

    Another optimisation is to sort the dictionary words into descending length before processing them. (Avoids processing any shorter words than the longest valid ones, but does require that the whole dictionary be held in memory.)

    An efficient approach: don't sort the word and compare with the original, but instead simply test the word for sortedness. I like that. Though again, with a mere 268,000 words even in SOWPODS it's unlikely to be a useful optimisation.

Comments on this post are now closed. If you have particular grounds for adding a late comment, comment on a more recent post quoting the URL of this one.

Search
Archive
Tags 1920s 1930s 1940s 1950s 1960s 1970s 1980s 1990s 2000s 2010s 2300ad 3d printing action advent of code aeronautics aikakirja anecdote animation anime army astronomy audio audio tech base commerce battletech bayern beer boardgaming book of the week bookmonth chain of command children chris chronicle church of no redeeming virtues cold war comedy computing contemporary cornish smuggler cosmic encounter coup covid-19 crime crystal cthulhu eternal cycling dead of winter doctor who documentary drama driving drone ecchi economics en garde espionage essen 2015 essen 2016 essen 2017 essen 2018 essen 2019 essen 2022 essen 2023 essen 2024 existential risk falklands war fandom fanfic fantasy feminism film firefly first world war flash point flight simulation food garmin drive gazebo genesys geocaching geodata gin gkp gurps gurps 101 gus harpoon historical history horror hugo 2014 hugo 2015 hugo 2016 hugo 2017 hugo 2018 hugo 2019 hugo 2020 hugo 2021 hugo 2022 hugo 2023 hugo 2024 hugo-nebula reread in brief avoid instrumented life javascript julian simpson julie enfield kickstarter kotlin learn to play leaving earth linux liquor lovecraftiana lua mecha men with beards mpd museum music mystery naval noir non-fiction one for the brow opera parody paul temple perl perl weekly challenge photography podcast politics postscript powers prediction privacy project woolsack pyracantha python quantum rail raku ranting raspberry pi reading reading boardgames social real life restaurant reviews romance rpg a day rpgs ruby rust scala science fiction scythe second world war security shipwreck simutrans smartphone south atlantic war squaddies stationery steampunk stuarts suburbia superheroes suspense television the resistance the weekly challenge thirsty meeples thriller tin soldier torg toys trailers travel type 26 type 31 type 45 vietnam war war wargaming weather wives and sweethearts writing about writing x-wing young adult
Special All book reviews, All film reviews
Produced by aikakirja v0.1