I’ve been doing the Weekly
Challenges. The
latest
involved checking capitalisation and decoding an ambiguous string.
(Note that this is open until 13 November 2022.)
Task 1: Capital Detection
You are given a string with alphabetic characters only: A..Z and a..z.
Write a script to find out if the usage of Capital is appropriate if
it satisfies at least one of the following rules:
1) Only first letter is capital and all others are small.
2) Every letter is small.
3) Every letter is capital.
Although in Perl I reach for them anyway, I think this is very clearly
a job for regular expressions. Each formulation can easily be coded
into a regexp:
- [A-Z][a-z]+
- [a-z]+
- [A-Z]+
then it's just a matter of anchoring them (i.e. the whole input string
must match the pattern) and testing.
sub capitaldetection($s) {
if ($s =~ /^([A-Z]+|[a-z]+|[A-Z][a-z]+)$/) {
return 1;
} else {
return 0;
}
}
Lua patterns are regexp-like but don't have alternation, so there I
check each option separately. And PostScript of course doesn't have a
regexp library at all, so… I start with a crude upper-case tester (I
don't care about characters which aren't in A-Z or a-z):
/uppercase {
(_) 0 get lt {
true
} {
false
} ifelse
} bind def
and then looking at the first two characters of the string tells me
what cases the rest may be.
/capitaldetection {
4 dict begin
/s exch def
/first s 0 get uppercase def
/second s 1 get uppercase def
first not second and {
false
} {
true
2 1 s length 1 sub {
s exch get uppercase /this exch def
first second and this not and { % ABc
pop false exit
} if
second not this and { % AbC or abC
pop false exit
} if
} for
} ifelse
end
} bind def
Task 2: Decoded List
You are given an encoded string consisting of a sequence of numeric characters: 0..9, $s.
Write a script to find all the valid different decodings in sorted order.
Encoding is simply done by mapping A,B,C,D,... to 1,2,3,4,... etc.
So for example "11" can be (1, 1) = "AA", or (11) = "K".
I used my standard breadth-first search pattern, stuffing valid match
patterns into hash keys to avoid possible duplication (not sure this
can actually happen but it was an excuse for making arrays into hash
keys), and then decoded the results. Most of the complexity is in the
string-slicing (and in Rust I ended up using a Vec<char>
for
simplicity).
Python:
from collections import deque
def decodedlist(s):
stack = deque()
A stack entry consists of a list of strings: the last entry in the
list is the remaining un-parsed part.
stack.append([s])
out = set()
while True:
ent = stack.popleft()
tail = ent.pop()
If there's no more unparsed content, we have a full parse; stick it on
the output list.
if len(tail) == 0:
out.add(tuple(ent))
else:
If we get here there's at least one un-parsed character. "0" isn't a
valid character index, but any other number is.
if tail[0] != "0":
q = ent.copy()
q.append(tail[0])
q.append(tail[1:])
stack.append(q)
If there are two or more un-parsed characters, there may be a
decoding: "26" is valid, but "27" is not. I could do this with regexps
but instead I decode it and take it as a number to check for validity.
if len(tail) >= 2:
v = int(tail[0:2])
if v >= 1 and v <= 26:
q = ent.copy()
q.append(tail[0:2])
q.append(tail[2:])
stack.append(q)
if len(stack) == 0:
break
Now we build the output list: alphazero
is the character offset
(which in ASCII will be 64, but this code can in theory work in
non-ASCII environments as long as the alphabet is sequential).
k = []
alphazero = ord("A") - 1
Decode each character, join them into a string, and put the result in
the list to be returned.
for x in out:
k.append("".join(chr(int(cs)+alphazero) for cs in x))
k.sort()
return k
Full code on
github.
Comments on this post are now closed. If you have particular grounds for adding a late comment, comment on a more recent post quoting the URL of this one.