I’ve been doing the Weekly Challenges. The latest involved string
mangling and regular expressions. (Note that this ends today.)
Task 1: Alphabet Index Digit Sum
You are given a string $str consisting of lowercase English letters,
and an integer $k.
Write a script to convert a lowercase string into numbers using
alphabet positions (a=1 — z=26), concatenate them to form an
integer, then compute the sum of its digits repeatedly $k times,
returning the final value.
Getting character codes is one of those things that varies hugely
across languages. JavaScript:
function alphabetindexdigitsum(a, k) {
Start my working string.
let st = "";
Look at each character in the input string
for (let c of a.split("")) {
Calculate its alphabetic code, and append the ASCII representation of
the base-10 representation of that code to the working string.
st += (c.charCodeAt(0) - 96);
}
Convert that to an integer. (Not strictly necessary here, I think,
since floppy types will probably treat it as a string anyway, but I
solve these first in Rust, and its type enforcement has been so good
for spotting the kind of trivial error I used to make a lot in Perl
that I tend to do it explicitly elsewhere too.)
let v = Number(st);
Run through a number of cycles.
for (let _dummy = 0; _dummy < k; _dummy++) {
Of course I could convert the number back to a string, split it into
digit characters and add them together. But I like to avoid type
conversions where that's possible, so I do it mathematically instead.
(This would conveniently also work for base 2, base 327, or any other
base.)
let j = 0;
while (v > 0) {
j += v % 10;
v = Math.floor(v / 10);
}
v = j;
}
Return the final result.
return v;
}
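As a rough sketch of the two asides above (the string-splitting alternative, and the point about other bases), something like the following should work. The names digitSum and alphabetIndexDigitSumViaString are made up for this illustration and aren't part of my actual solution. One side effect of the string version, as far as I can tell, is that it never holds the full concatenated value as a number, so inputs whose concatenated digits run past Number.MAX_SAFE_INTEGER (about 16 digits) don't lose precision.

```javascript
// Hypothetical helper: sum the digits of a non-negative integer in any base >= 2.
function digitSum(v, base = 10) {
  let sum = 0;
  while (v > 0) {
    sum += v % base;
    v = Math.floor(v / base);
  }
  return sum;
}

// Hypothetical variant of the task: sum the digit characters of the working
// string directly, rather than converting the whole thing to a number first.
function alphabetIndexDigitSumViaString(a, k) {
  // Build the concatenated alphabet positions, e.g. "abc" -> "123".
  let st = "";
  for (const c of a) {
    st += c.charCodeAt(0) - 96;
  }
  // Replace the string with the decimal form of its digit sum, k times.
  for (let i = 0; i < k; i++) {
    let sum = 0;
    for (const d of st) {
      sum += d.charCodeAt(0) - 48; // "0" is character code 48
    }
    st = String(sum);
  }
  return Number(st);
}

console.log(digitSum(126));                           // 9
console.log(digitSum(5, 2));                          // 2 (5 is 101 in binary)
console.log(alphabetIndexDigitSumViaString("abc", 1)); // 6
console.log(alphabetIndexDigitSumViaString("az", 2));  // 9
```

With k = 0 the string version still converts the whole string at the end, so the precision caveat comes back in that one case.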
Task 2: Valid Token Counter
You are given a sentence.
Write a script to split the given sentence into space-separated
tokens and count how many are valid words. A token is valid if it
contains no digits, has at most one hyphen surrounded by lowercase
letters, and at most one punctuation mark (!, ., ,) appearing only
at the end.
Since this is essentially a ladder of regular expressions connected by
simple logic, it looks very much the same in every language, so I
didn't bother for most of them. (And Raku's weird divergent syntax for
its "regular expressions" just irks me.)
Perl:
sub validtokencounter($a) {
Initialise the counter for the final result.
my $count = 0;
Look at each word-token.
foreach my $k (split ' ', $a) {
Check that it contains no digits.
if ($k =~ /[0-9]/) {
next;
}
Check that it doesn't have multiple dashes.
if ($k =~ /-.*-/) {
next;
}
Check that, if it does have a dash, that dash is surrounded by letters.
if ($k =~ /-/ &&
$k !~ /[a-z]-[a-z]/) {
next;
}
Check that there is no punctuation mark followed by another character.
(This combines "at most one punctuation mark" and "appearing only at
the end".)
if ($k =~ /[.,!]./) {
next;
}
We've passed all the tests, so increment the counter.
$count += 1;
}
$count;
}
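To illustrate the "looks very much the same in every language" point, here is a rough JavaScript port of the same ladder of checks. It is a sketch rather than something I submitted: the name validTokenCounter and the sample sentences are invented for this example, and I'm assuming that trimming and splitting on runs of whitespace is a close enough stand-in for Perl's split ' '.

```javascript
// Hypothetical JavaScript port of the Perl function above.
function validTokenCounter(sentence) {
  let count = 0;
  // Split on runs of whitespace, ignoring leading and trailing space.
  for (const token of sentence.trim().split(/\s+/)) {
    // An empty input yields a single empty token; skip it.
    if (token === "") continue;
    // No digits allowed.
    if (/[0-9]/.test(token)) continue;
    // No more than one dash.
    if (/-.*-/.test(token)) continue;
    // If there is a dash, it must sit between lowercase letters.
    if (/-/.test(token) && !/[a-z]-[a-z]/.test(token)) continue;
    // No punctuation mark followed by another character.
    if (/[.,!]./.test(token)) continue;
    count++;
  }
  return count;
}

console.log(validTokenCounter("cat and dog"));        // 3
console.log(validTokenCounter("a-b c! d,e"));         // 2
console.log(validTokenCounter("ab- cd-ef gh- ij!"));  // 2
```

In the second sentence "d,e" fails the punctuation check, and in the third "ab-" and "gh-" fail because the dash has no letter after it.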
Full code on Codeberg.