RogerBW's Blog

The Weekly Challenge 166: Hexing the Directories
26 May 2022

I’ve been doing the Weekly Challenges. The latest involved word searching and directory mangling. (Note that this is open until 29 May 2022.)

Task 1: Hexadecimal Words

Write a program that will read from a dictionary and find 2- to 8-letter words that can be "spelled" in hexadecimal, with the addition of letter substitutions (O = 0, I or L = 1, S = 5, T = 7)

Optional extras:

Limit the number of "special" letter substitutions in any one result to keep that result at least somewhat comprehensible. (0x51105010 is an actual example from my sample solution you may wish to avoid!)

Find phrases of words that total 8 characters in length (e.g., 0xFee1Face), rather than just individual words.

I decided to roll the maximum-specials into the main function. To check my test cases, I also did a shell version:

"words of 2-8 letters, up to 8 specials":

$ egrep -i "^[abcdefoilst]{2,8}$" dictionary.txt|wc -l
1463

"words of 8 letters, up to 8 specials":

$ egrep -i "^[abcdefoilst]{8}$" dictionary.txt|wc -l
164

"words of 2-8 letters, no specials":

$ egrep -i "^[abcdef]{2,8}$" dictionary.txt|wc -l
45

"words of 2-8 letters, at most 1 special":

$ egrep -i "^[abcdefoilst]{2,8}$" dictionary.txt|grep -v "[oilst].*[oilst]"|wc -l
244

So that can be done with a three-parameter function: minimum length, maximum length, maximum specials. In Rust:

use std::fs::File;
use std::io::{BufRead, BufReader};

fn hexwords(lo: usize, hi: usize, sb: usize) -> Vec<String> {
    let mut out: Vec<String> = Vec::new();
    let file = File::open("dictionary.txt").unwrap();
    let reader = BufReader::new(file);
    for lx in reader.lines() {
        let line = lx.unwrap();

Filter lines to an appropriate length.

        if line.len() >= lo && line.len() <= hi {
            let mut valid = true;
            let mut sbc = 0;

Check each character: specials increment the count and may cause an early exit, disallowed characters always cause an early exit, and if nothing caused an exit the word is added to the output list.

            for c in line.chars() {
                if c == 'o' || c == 'i' || c == 'l' || c == 's' || c == 't' {
                    sbc += 1;
                    if sbc > sb {
                        valid = false;
                    }
                } else if c < 'a' || c > 'f' {
                    valid = false;
                }
                if !valid {
                    break;
                }
            }
            if valid {
                out.push(line);
            }
        }
    }
    out
}
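
The function above returns the matching dictionary words as they stand. To display them in their hexadecimal spelling, the substitutions from the task statement can be applied afterwards. A minimal sketch: `to_hex` is a hypothetical helper, not part of the original solution.

```rust
/// Spell a word as a hexadecimal literal using the task's substitutions
/// (o = 0, i or l = 1, s = 5, t = 7). Hypothetical helper: returns None
/// if the word contains a letter that cannot be represented.
fn to_hex(word: &str) -> Option<String> {
    let mut out = String::from("0x");
    for c in word.chars() {
        let h = match c.to_ascii_lowercase() {
            'o' => '0',
            'i' | 'l' => '1',
            's' => '5',
            't' => '7',
            // Genuine hex letters are kept, upper-cased for readability.
            d @ 'a'..='f' => d.to_ascii_uppercase(),
            _ => return None,
        };
        out.push(h);
    }
    Some(out)
}
```

For instance, `to_hex("silo")` gives `Some("0x5110")`, while a word such as "hex" gives `None`.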

For "phrases of words", I ended up using a cartesian product (cross product). This takes the output from hexwords and sorts it into lists by length:

use itertools::Itertools;
use std::collections::HashMap;

fn combiwords(wl: Vec<String>, l: usize) -> Vec<String> {
    let mut wh: HashMap<usize, Vec<String>> = HashMap::new();
    for w in wl {
        let en = wh.entry(w.len()).or_default();
        en.push(w);
    }

Then we build a list of possible length decompositions: for example, if we have words of length 3, 4 and 5, we can build an 8-letter phrase out of (3,5), (4,4) or (5,3).

    let mut tmap: Vec<Vec<usize>> = vec![Vec::new()];
    let mut omap: Vec<Vec<usize>> = Vec::new();
    while let Some(mut c) = tmap.pop() {
        let s: usize = c.iter().sum();
        let ls = l - s;
        for j in 1..ls {
            if wh.contains_key(&j) {
                let mut cc = c.clone();
                cc.push(j);
                tmap.push(cc);
            }
        }
        if wh.contains_key(&ls) {
            c.push(ls);
            omap.push(c);
        }
    }
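
To see the decomposition step in isolation, here is a sketch that takes a plain slice of available lengths instead of the `wh` map. The name `decompositions` and its parameters are illustrative, not from the original code; the final sort is added only to make the output order predictable.

```rust
/// List every ordered sequence of available word lengths that sums to `total`.
/// `lengths` is an illustrative stand-in for the keys of the `wh` map.
fn decompositions(lengths: &[usize], total: usize) -> Vec<Vec<usize>> {
    let mut pending: Vec<Vec<usize>> = vec![Vec::new()];
    let mut done: Vec<Vec<usize>> = Vec::new();
    while let Some(mut c) = pending.pop() {
        let remaining = total - c.iter().sum::<usize>();
        // Extend the partial sequence with every length that still leaves room.
        for j in 1..remaining {
            if lengths.contains(&j) {
                let mut cc = c.clone();
                cc.push(j);
                pending.push(cc);
            }
        }
        // A length that exactly fills the remaining space completes the sequence.
        if lengths.contains(&remaining) {
            c.push(remaining);
            done.push(c);
        }
    }
    done.sort();
    done
}
```

With lengths 3, 4 and 5 and a target of 8, this yields (3,5), (4,4) and (5,3), matching the example above.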

Then, for each length combination, do a cartesian product of each of the lists that make it up, to produce each possible combination. In Rust that's in the Itertools crate; in Raku I can use the X cross-product operator (repeatedly, because it only takes two parameters); in Python and Ruby it's product; and in the other five languages I wrote my own (the PostScript version of which is now available in my PostScript libraries).

    let mut out: Vec<String> = Vec::new();
    for pat in omap {
        for ss in pat.iter().map(|i| &wh[i]).multi_cartesian_product() {
            out.push(ss.iter().join(""));
        }
    }
    out
}
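
Where no library cartesian product is available, the same result can be built up one list at a time. The following is a sketch of that idea in plain Rust, not the hand-written version from any of the other languages; `product_join` is a hypothetical name.

```rust
/// Build every concatenation that takes one word from each list, in order.
/// A hand-rolled stand-in for a multi-way cartesian product.
fn product_join(lists: &[Vec<String>]) -> Vec<String> {
    // Start with a single empty phrase and extend it one list at a time.
    let mut acc: Vec<String> = vec![String::new()];
    for list in lists {
        let mut next = Vec::with_capacity(acc.len() * list.len());
        for prefix in &acc {
            for word in list {
                next.push(format!("{}{}", prefix, word));
            }
        }
        acc = next;
    }
    acc
}
```

Each pass multiplies the number of partial phrases by the size of the next list, so an empty list anywhere in the input leaves no phrases at all.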

Task 2: K-Directory Diff

Given a few (three or more) directories (non-recursively), display a side-by-side difference of files that are missing from at least one of the directories. Do not display files that exist in every directory.

Since the task is non-recursive, if you encounter a subdirectory, append a /, but otherwise treat it the same as a regular file.

The actual processing is the relatively easy bit; the hard part for me was reading directories across the various languages I'm using. Lua needs an external library to do this, so I left it out this time.
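
For comparison, Rust can read a directory with the standard library alone. This is a minimal sketch rather than the code from the repository: `list_dir` is a hypothetical name, and it follows the same conventions as the Perl version below (dotfiles skipped, `/` appended to subdirectories).

```rust
use std::fs;
use std::io;
use std::path::Path;

/// List the entries of one directory (non-recursively), skipping dotfiles
/// and appending '/' to subdirectories. Names are returned sorted.
fn list_dir(dir: &Path) -> io::Result<Vec<String>> {
    let mut names = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        // Non-UTF-8 names are converted lossily; fine for a sketch.
        let mut name = entry.file_name().to_string_lossy().into_owned();
        if name.starts_with('.') {
            continue;
        }
        // Path::is_dir follows symlinks, like Perl's -d test.
        if entry.path().is_dir() {
            name.push('/');
        }
        names.push(name);
    }
    names.sort();
    Ok(names)
}
```

A directory that cannot be opened comes back as an `Err` rather than an empty list, so the caller can decide how to report it.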

In Perl:

Signatures (i.e. named function parameters). They didn't have those when I were a lad.

sub kdd(@dirlist0) {
  my @dirlist = sort @dirlist0;
  my %fx;
  foreach my $d (@dirlist) {

Modern Perl puts dirhandles in proper variables.

    opendir (my $dh,$d) or die "Cannot open $d: $!";

We don't want dotfiles (I arbitrarily assume), but we do want to detect subdirectories and note them.

    foreach my $entry (grep !/^\./,readdir $dh) {
      my $nn = $entry;
      if (-d "$d/$entry") {
        $nn .= '/';
      }
      $fx{$nn}{$d} = 1;
    }
    closedir $dh;
  }

%fx is an inside-out version of the data: a hash of filenames, each of which contains a hash (set, in languages that support it) indicating which directories it turns up in.

  my $mm=scalar @dirlist;
  my @out=(\@dirlist);

For each file, skip it if it's in all the directories.

  foreach my $f (sort keys %fx) {
    unless (scalar keys %{$fx{$f}} == $mm) {

Otherwise build up an output line: the filename if it's present, a blank if it's not.

      my @l;
      foreach my $d (@dirlist) {
        if (exists $fx{$f}{$d}) {
          push @l,$f;
        } else {
          push @l,'';
        }
      }
      push @out,\@l;
    }
  }
  return \@out;
}
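
The same inside-out approach can be sketched in Rust with a map of sets. This assumes the directory listings have already been read; `kdd_rows` and its `listings` parameter are illustrative names, not taken from the repository.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Build the diff table from directory listings that have already been read.
/// `listings` pairs each directory name with its entry names.
fn kdd_rows(listings: &[(String, Vec<String>)]) -> Vec<Vec<String>> {
    let mut dirs: Vec<&String> = listings.iter().map(|(d, _)| d).collect();
    dirs.sort();
    // Inside-out index: entry name -> set of directories containing it.
    let mut fx: BTreeMap<&String, BTreeSet<&String>> = BTreeMap::new();
    for (d, names) in listings {
        for n in names {
            fx.entry(n).or_default().insert(d);
        }
    }
    // Header row first, then one row per entry missing from some directory.
    let mut out = vec![dirs.iter().map(|d| d.to_string()).collect::<Vec<String>>()];
    for (name, present) in &fx {
        if present.len() == dirs.len() {
            continue;
        }
        out.push(
            dirs.iter()
                .map(|d| if present.contains(d) { name.to_string() } else { String::new() })
                .collect(),
        );
    }
    out
}
```

Using `BTreeMap` keeps the entry names sorted without a separate sort step, which plays the role of `sort keys %fx` in the Perl.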

That gives a data structure, which then gets printed in a fixed-width format. I already had code to do this in Perl:

sub tabular($d) {
  my @columnlength;
  foreach my $row (@{$d}) {
    foreach my $colno (0..$#{$row}) {
      if (!defined($columnlength[$colno]) ||
          $columnlength[$colno] < length($row->[$colno])) {
        $columnlength[$colno]=length($row->[$colno]);
      }
    }
  }
  my $format=join(' | ',map {"%-${_}s"} @columnlength);
  my $result='';
  foreach my $row (@{$d}) {
    $result .= sprintf($format,@{$row})."\n";
  }
  return $result;
}
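
The two-pass idea (measure the widest cell in each column, then pad every cell to that width) carries over directly to other languages. A sketch in Rust, assuming every row has the same number of cells as the first; this is not the repository's code.

```rust
/// Render rows as fixed-width columns separated by " | ".
/// Widths are counted in characters; every row is assumed to have the
/// same number of cells as the first.
fn tabular(rows: &[Vec<String>]) -> String {
    let ncols = rows.first().map_or(0, |r| r.len());
    // First pass: find the widest cell in each column.
    let mut widths = vec![0usize; ncols];
    for row in rows {
        for (w, cell) in widths.iter_mut().zip(row) {
            *w = (*w).max(cell.chars().count());
        }
    }
    // Second pass: left-justify each cell to its column width.
    let mut result = String::new();
    for row in rows {
        let cells: Vec<String> = row
            .iter()
            .zip(&widths)
            .map(|(cell, w)| format!("{:<width$}", cell, width = *w))
            .collect();
        result.push_str(&cells.join(" | "));
        result.push('\n');
    }
    result
}
```

As in the Perl, the last column is padded too, so lines may end in trailing spaces.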

I didn't think PostScript could do this at all, but it seems that it can, in a rather baroque way. (If it weren't baroque, I wouldn't love it so.) Look up filenameforall in the Red Book…

Full code on github.

