RogerBW's Blog: The Weekly Challenge 350: A Good Shuffle

I’ve been doing the Weekly Challenges. The latest involved string fragments and numerical anagrams. (Note that this ends today.)

Task 1: Good Substrings

You are given a string.

Write a script to return the number of good substrings of length three in the given string.

A string is good if there are no repeated characters.

So we take each three-character slice and check for repeated characters. In Perl:

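use v5.36;               # enables subroutine signatures (plus strict and warnings)
use List::Util qw(max);  # max() is used in goodsubstrings below

sub counterify($a) {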
  my %m;
  foreach my $i (@{$a}) {
    $m{$i}++;
  }
  return \%m;
}

sub goodsubstrings($a) {
  my $p = 0;
  foreach my $si (0 .. length($a) - 3) {
    my $c = counterify([split '', substr($a, $si, 3)]);
    if (max(values %{$c}) == 1) {
      $p++;
    }
  }
  $p;
}

Most other languages have a sliding-window function to produce the slices, and some also have a counter class, as in Rust (where Counter comes from the counter crate rather than the standard library):

use counter::Counter; // external crate, not part of std

fn goodsubstrings(a: &str) -> usize {
    let mut p = 0;
    for s in a.chars().collect::<Vec<_>>().windows(3) {
        let c = s.iter().collect::<Counter<_>>();
        if *c.values().max().unwrap() == 1 {
            p += 1;
        }
    }
    p
}
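
For a window of only three characters, a counter is more machinery than strictly needed. As a minimal sketch (not the solution above, and with the hypothetical name good_substrings), the three characters can simply be compared pairwise using only the standard library:

```rust
/// Count the length-3 windows of `a` whose three characters are all distinct.
fn good_substrings(a: &str) -> usize {
    // Collect into a Vec<char> so windows are taken over characters, not bytes.
    let chars: Vec<char> = a.chars().collect();
    chars
        .windows(3)
        // A window is good when no two of its three characters are equal.
        .filter(|w| w[0] != w[1] && w[0] != w[2] && w[1] != w[2])
        .count()
}
```

For instance, good_substrings("xyzzabc") is 3: the good windows are "xyz", "zab" and "abc". A string shorter than three characters has no windows, so the result is 0.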

Task 2: Shuffle Pairs

If two integers A <= B have the same digits but in different orders, we say that they belong to the same shuffle pair if and only if there is an integer k such that B = A * k. k is called the witness of the pair.

For example, 1359 and 9513 belong to the same shuffle pair, because 1359 * 7 = 9513.

Interestingly, some integers belong to several different shuffle pairs. For example, 123876 forms one shuffle pair with 371628, and another with 867132, as 123876 * 3 = 371628, and 123876 * 7 = 867132.

Write a function that for a given $from, $to, and $count returns the number of integers $i in the range $from <= $i <= $to that belong to at least $count different shuffle pairs.

PS: Inspired by a conversation between Mark Dominus and Simon Tatham at Mastodon.

This ambiguous question has to be resolved from the examples: what we're actually counting is integers that are the lowest members of at least $count shuffle pairs. And there are two ways to do this: work out all permutations of the integer and see if they're multiples, or work out all multiples and see if they're permutations. The latter is faster, since there are only eight multipliers to try per candidate. In Crystal:

Standard counterifying routine.

def counterify(a)
  cc = Hash(Char, Int32).new(default_value: 0)
  a.each do |x|
    cc[x] += 1
  end
  return cc
end

Do the type conversion to counterify an integer.

def countdigits(a)
  counterify(a.to_s.chars)
end

def shufflepairs(low, high, pairs)
  total = 0

Iterate over the range.

  low.upto(high) do |candidate|

Count digits in the candidate.

    candidatec = countdigits(candidate)
    cnt = 0

Iterate over possible multipliers. (A multiplier of 10 or more always gives a number with more digits than the candidate, so it can't be a rearrangement. And because we only want the lowest in a pair, we don't have to worry about numbers smaller than the candidate.)

    2.upto(9) do |mul|

Get the potential pair, and count its digits.

      test = candidate * mul
      testc = countdigits(test)

If we have a match, increment the count of pairs for this number. If we have enough, drop out. (This probably doesn't save much, but one might as well.)

      if testc == candidatec
        cnt += 1
        if cnt >= pairs
          break
        end
      end
    end

If we have enough pairs, increment the total.

    if cnt >= pairs
      total += 1
    end
  end
  total
end
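
For comparison, here is a minimal Rust sketch of the same multiples-based approach. It is not the post's own Rust solution: the names digit_counts, witnesses and shuffle_pairs are made up for illustration, and it swaps the hash-based counter for a fixed ten-slot array, which can be compared with plain ==. It assumes positive inputs small enough that n * 9 fits in a u64.

```rust
/// Digit histogram of `n` (assumed > 0): slot d holds how often digit d occurs.
fn digit_counts(mut n: u64) -> [u8; 10] {
    let mut counts = [0u8; 10];
    while n > 0 {
        counts[(n % 10) as usize] += 1;
        n /= 10;
    }
    counts
}

/// Number of multipliers k in 2..=9 for which n * k uses exactly n's digits.
/// A product with more digits than n has a different histogram, so it is
/// rejected without a separate length check.
fn witnesses(n: u64) -> usize {
    let base = digit_counts(n);
    (2..=9u64)
        .filter(|&k| digit_counts(n * k) == base)
        .count()
}

/// Count the integers in low..=high that are the lowest member of at least
/// `pairs` shuffle pairs.
fn shuffle_pairs(low: u64, high: u64, pairs: usize) -> usize {
    (low..=high).filter(|&n| witnesses(n) >= pairs).count()
}
```

With the numbers from the task text, witnesses(1359) is 1 (only ×7 works) and witnesses(123876) is 2 (×3 and ×7). Unlike the Crystal version, this sketch does not break out early once enough pairs are found; with at most eight multipliers per candidate that should make little difference.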

Time for another lot of timings, since I haven't done one of these for a while. I measured three successive runs of all test cases on an unloaded system; the fastest is given here. (I think Rust parallelises tests, but within the same process; measured load was fairly solidly at 1.0 anyway.) Where possible, I'm leaving out compilation time; I assume a program will be run significantly more often than it's written. My Scala setup is frankly fragile, and I wasn't able to do that for Scala, so it should potentially be further up the chart.

Language functionality may be a consideration here too; the languages I've noted with "#" don't have a direct hash equality comparator (or their documentation doesn't make it clear that they do; Raku in particular has all sorts of things I simply can't find in the docs), so I had to write a function for it in that language. Only a shallow comparison is needed.

(In PostScript I found a bug in my existing deepeq deep comparator, which was leaking values onto the stack. But a custom shallow comparator function, which I wrote as part of the bug-finding process and which doesn't recurse to compare individual values, is significantly faster.)
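
To illustrate what a shallow comparison of two counters involves, here is a rough sketch in Rust. Rust's HashMap already supports == directly, so this is purely illustrative and is not one of the per-language functions mentioned above; shallow_eq and char_counts are hypothetical names.

```rust
use std::collections::HashMap;

/// Count how often each character occurs in `s`.
fn char_counts(s: &str) -> HashMap<char, u32> {
    let mut m = HashMap::new();
    for c in s.chars() {
        *m.entry(c).or_insert(0) += 1;
    }
    m
}

/// Shallow equality: both maps have the same number of keys, and every key
/// in `a` maps to the same count in `b`. Values are compared directly, with
/// no recursion into nested structures.
fn shallow_eq(a: &HashMap<char, u32>, b: &HashMap<char, u32>) -> bool {
    a.len() == b.len() && a.iter().all(|(k, v)| b.get(k) == Some(v))
}
```

The length check matters: without it, a map whose keys are a strict subset of the other's could wrongly compare equal.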

Language	time/s	RSS/kB
Rust (release)	2.22	2820
Kotlin	3.08	211552
Crystal (release)	5.44	5328
JavaScript (node) #	10.68	111704
Scala (w/ compile)	19.61	279716
Python	24.51	15076
Rust (debug)	39.97	2708
Crystal (debug)	40.91	7372
Perl #	51.57	13612
Ruby	57.29	23748
Lua #	89.66	3056
PostScript # custom	132.37	30808
PostScript # deepeq	151.81	30724
Raku #	214.85	179656

A note for future Roger:

/usr/bin/time --format="| %C | %e | %M |" /path/to/binary

Typst failed to produce a result and went into a runaway high load state, though the code works for the smaller and more manageable test cases.

I'm particularly impressed with Rust's memory usage, given the compromises Lua makes to its language in the name of low footprint: here's something much faster and more capable, and (in this admittedly unusual case) it uses less memory.

Full code on GitHub.
