I’ve been doing the Weekly Challenges. The latest involved string
fragments and numerical anagrams. (Note that this ends today.)
Task 1: Good Substrings
You are given a string.
Write a script to return the number of good substrings of length
three in the given string.
A string is good if there are no repeated characters.
So we take each three-character slice and check for repeated
characters. In Perl:
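    use v5.36;                 # for subroutine signatures
    use List::Util qw(max);

    # Count occurrences of each element of an array reference.
    sub counterify($a) {
        my %m;
        foreach my $i (@{$a}) {
            $m{$i}++;
        }
        return \%m;
    }

    sub goodsubstrings($a) {
        my $p = 0;
        # Take each three-character slice in turn.
        foreach my $si (0 .. length($a) - 3) {
            my $c = counterify([split '', substr($a, $si, 3)]);
            # Good if no character occurs more than once.
            if (max(values %{$c}) == 1) {
                $p++;
            }
        }
        $p;
    }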
In most other languages, there's a sliding window function to get the
substrings, and there may even be a counter class, as in Rust (here
via the counter crate):
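    use counter::Counter;

    fn goodsubstrings(a: &str) -> usize {
        let mut p = 0;
        // Slide a three-character window over the string.
        for s in a.chars().collect::<Vec<_>>().windows(3) {
            let c = s.iter().collect::<Counter<_>>();
            // Good if no character occurs more than once.
            if *c.values().max().unwrap() == 1 {
                p += 1;
            }
        }
        p
    }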
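Since a window of three characters is good exactly when all three are
distinct, the counting step can also be replaced by a set-size check.
A minimal Python sketch of that variant follows; the function name
good_substrings is made up for this illustration and the set-based
test is a swapped-in technique, not the counter approach shown above.

```python
def good_substrings(s: str) -> int:
    """Count length-3 windows of s whose characters are all distinct."""
    count = 0
    # range() is empty when len(s) < 3, so short strings give 0.
    for i in range(len(s) - 2):
        window = s[i:i + 3]
        # A set drops duplicates, so size 3 means no repeats.
        if len(set(window)) == 3:
            count += 1
    return count


print(good_substrings("abcaefg"))  # 5: abc, bca, cae, aef, efg
```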
Task 2: Shuffle Pairs
If two integers A <= B have the same digits but in different
orders, we say that they belong to the same shuffle pair if and only
if there is an integer k such that A = B * k. k is called the
witness of the pair.
For example, 1359 and 9513 belong to the same shuffle pair, because
1359 * 7 = 9513.
Interestingly, some integers belong to several different shuffle
pairs. For example, 123876 forms one shuffle pair with 371628, and
another with 867132, as 123876 * 3 = 371628, and 123876 * 7 = 867132.
Write a function that for a given $from, $to, and $count
returns the number of integers $i in the range $from <= $i <= $to that belong to at least $count different shuffle pairs.
PS: Inspired by a conversation between Mark Dominus and Simon Tatham at Mastodon.
This ambiguous question has to be resolved from the examples: what
we're actually counting is integers that are the lowest members of
at least $count shuffle pairs. And there are two ways to do this: work
out all permutations of the integer and see if they're multiples, or
work out all multiples and see if they're permutations. The latter is
faster, since only the multipliers 2 to 9 need checking. In Crystal:
Standard counterifying routine.

    def counterify(a)
      cc = Hash(Char, Int32).new(default_value: 0)
      a.each do |x|
        cc[x] += 1
      end
      return cc
    end

Do the type conversion to counterify an integer.

    def countdigits(a)
      counterify(a.to_s.chars)
    end

    def shufflepairs(low, high, pairs)
      total = 0

Iterate over the range.

      low.upto(high) do |candidate|

Count digits in the candidate.

        candidatec = countdigits(candidate)
        cnt = 0

Iterate over possible multipliers. (×10 or more will necessarily be
longer than the original. And because we only want the lowest in a
pair, we don't have to worry about numbers smaller than the
candidate.)

        2.upto(9) do |mul|

Get the potential pair, and count its digits.

          test = candidate * mul
          testc = countdigits(test)

If we have a match, increment the count of pairs for this number. If
we have enough, drop out. (This probably doesn't save much, but one
might as well.)

          if testc == candidatec
            cnt += 1
            if cnt >= pairs
              break
            end
          end
        end

If we have enough pairs, increment the total.

        if cnt >= pairs
          total += 1
        end
      end
      total
    end
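The same multiples-first approach can be sketched in Python, using a
sorted digit string as the comparison key instead of a counter hash.
This is an illustrative variant under that assumption rather than a
translation of the code above; the names digit_key and shuffle_pairs
are made up here.

```python
def digit_key(n: int) -> str:
    """Return the digits of n in sorted order; equal keys mean same digits."""
    return "".join(sorted(str(n)))


def shuffle_pairs(low: int, high: int, count: int) -> int:
    """Count integers in low..high that are the lowest member of at
    least `count` shuffle pairs."""
    total = 0
    for candidate in range(low, high + 1):
        key = digit_key(candidate)
        found = 0
        # Multipliers of 10 or more always add a digit, so 2..9 suffice.
        for mul in range(2, 10):
            if digit_key(candidate * mul) == key:
                found += 1
                # Stop early once enough witnesses are known.
                if found >= count:
                    break
        if found >= count:
            total += 1
    return total


print(shuffle_pairs(1359, 1359, 1))      # 1, since 1359 * 7 = 9513
print(shuffle_pairs(123876, 123876, 2))  # 1, witnesses 3 and 7
```

Sorting the digits costs a little more than counting them, but it
gives a plain string that every language can compare directly.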
Time for another lot of timings, since I haven't done one of these for
a while. I measured three successive runs of all test cases on an
unloaded system; the fastest is given here. (I think Rust parallelises
tests, but within the same process; measured load was fairly solidly
at 1.0 anyway.) Where possible, I'm leaving out compilation time; I
assume a program will be run significantly more often than it's
written. My Scala setup is frankly fragile, and I wasn't able to do
that for Scala, so it should potentially be further up the chart.
Language functionality may be a consideration here too; the languages
I've noted with "#" don't have a direct hash equality comparator (or
their documentation doesn't make it clear that they do; Raku in
particular has all sorts of things I simply can't find in the docs),
so I had to write a function for it in each of those languages. Only a shallow
comparison is needed.
(In PostScript I found a bug in my existing deepeq deep comparator,
which was leaking values onto the stack. But a custom shallow
comparator function, which I wrote as part of the bug-finding process
and which doesn't recurse to compare individual values, is
significantly faster.)
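For languages without a built-in hash equality test, a shallow
comparison only needs to check that both hashes have the same number
of keys and that each key maps to an equal value. A minimal Python
sketch of the idea follows. Python's own == already does this for
dicts, so shallow_eq is a hypothetical helper for illustration only,
not the code used in any of the timed solutions.

```python
def shallow_eq(a: dict, b: dict) -> bool:
    """Compare two flat dicts key by key; values are assumed to be
    plain integers such as digit counts."""
    # Different sizes can never match; this also rules out extra keys in b.
    if len(a) != len(b):
        return False
    # Every key of a must be present in b with an equal value.
    return all(k in b and b[k] == v for k, v in a.items())


print(shallow_eq({"1": 1, "3": 1}, {"3": 1, "1": 1}))  # True
print(shallow_eq({"1": 2}, {"1": 1}))                  # False
```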
| Language | time/s | RSS/kB |
|---|---|---|
| Rust (release) | 2.22 | 2820 |
| Kotlin | 3.08 | 211552 |
| Crystal (release) | 5.44 | 5328 |
| JavaScript (node) # | 10.68 | 111704 |
| Scala (w/ compile) | 19.61 | 279716 |
| Python | 24.51 | 15076 |
| Rust (debug) | 39.97 | 2708 |
| Crystal (debug) | 40.91 | 7372 |
| Perl # | 51.57 | 13612 |
| Ruby | 57.29 | 23748 |
| Lua # | 89.66 | 3056 |
| PostScript # custom | 132.37 | 30808 |
| PostScript # deepeq | 151.81 | 30724 |
| Raku # | 214.85 | 179656 |
A note for future Roger:
    /usr/bin/time --format="| %C | %e | %M |" /path/to/binary
Typst failed to produce a result and went into a runaway high load
state, though the code works for the smaller and more manageable
test cases.
I'm particularly impressed with Rust's memory usage, given the
compromises Lua makes to its language in the name of low footprint:
here's something much faster and more capable, and (in this
admittedly unusual case) it uses less memory.
Full code on github.