I’ve been doing the Weekly Challenges. The latest involved combining
lists and squashing Unicode characters. (Note that this is open until
16 October 2022.)
Task 1: Zip List
You are given two list @a and @b of same size.
Create a subroutine sub zip(@a, @b) that merge the two list as shown in the example below.
(In other words, f([1, 3, 5], [2, 4, 6]) == [1, 2, 3, 4, 5, 6].)
I only met this relatively recently; it's not really something I've
tended to want to do in my programming life. Some of the languages I'm
using have a built-in function to do this; others don't. As an example
of one that doesn't, Lua:
function zip(a, b)
   local out = {}
   for i = 1,#a do
      table.insert(out,a[i])
      table.insert(out,b[i])
   end
   return out
end
and similarly with JavaScript and PostScript. Kotlin has a zip but
you can't trivially flatten its List of Pairs into a single List, so I
end up doing it the hard way.
(This is made simpler because we're told the two input lists are the
same length; otherwise we'd have to decide what to do in case of a
length mismatch, whether to stop early or put in something like an
undef.)
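To make the two options for mismatched lengths concrete, here is a minimal Python sketch. It is not part of my challenge solutions; the function names ziplist_truncate and ziplist_pad and the fill parameter are made up for illustration. Python's zip stops at the shorter input, while itertools.zip_longest pads the shorter one.

```python
from itertools import chain, zip_longest


def ziplist_truncate(a, b):
    # zip() stops at the end of the shorter input, so extras are dropped.
    return list(chain.from_iterable(zip(a, b)))


def ziplist_pad(a, b, fill=None):
    # zip_longest() runs to the end of the longer input, substituting
    # `fill` (None by default, the nearest thing to undef) for gaps.
    return list(chain.from_iterable(zip_longest(a, b, fillvalue=fill)))


print(ziplist_truncate([1, 3, 5], [2, 4]))  # [1, 2, 3, 4]
print(ziplist_pad([1, 3, 5], [2, 4]))       # [1, 2, 3, 4, 5, None]
```

Which behaviour is right depends on the caller; a third option would be to treat a mismatch as an error.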
For the other languages, I tried to use the built-in function when it
was available. Perl doesn't have a built-in, but List::MoreUtils has
mesh, which I just wrap for simplicity.
sub zip($a, $b) {
  return [mesh(@{$a}, @{$b})];
}
Rust gets a bit fiddly, and there's probably a better way to do it
with flatten()…
fn ziplist(a: Vec<&str>, b: Vec<&str>) -> Vec<String> {
    a.iter()
        .zip(b.iter())
        .map(|x| vec![x.0.to_string(), x.1.to_string()])
        .collect::<Vec<Vec<String>>>()
        .concat()
}
Python needs itertools for chain.
from itertools import chain

def ziplist(a, b):
    return list(chain(*zip(a,b)))
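As an aside, the same interleaving can be written without itertools by using a nested list comprehension. This is only an alternative sketch, and the name ziplist_comprehension is made up for illustration.

```python
def ziplist_comprehension(a, b):
    # Walk the pairs produced by zip(), then the two items in each pair.
    return [item for pair in zip(a, b) for item in pair]


print(ziplist_comprehension([1, 3, 5], [2, 4, 6]))  # [1, 2, 3, 4, 5, 6]
```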
Ruby is, as often, the most straightforward:
def ziplist(a,b)
  return a.zip(b).flatten
end
Raku:
sub ziplist(@a, @b) {
    return Array(zip(@a,@b)[*;*]);
}
Task 2: Unicode Makeover
You are given a string with possible unicode characters.
Create a subroutine sub makeover($str) that replace the unicode
characters with ascii equivalent. For this task, let us assume it
only contains alphabets.
This is a hard thing to do right. So my answer, as with date software,
is not to do it myself wherever possible. The canonical answer for
years has been Unidecode (for Perl, Text::Unidecode); now there's
also anyascii, which has more glyph coverage. Both of these are
implemented for multiple languages and leave me not writing any code
at all.
Perl:
use utf8;                      # the test strings are UTF-8 literals
use Test::More tests => 2;     # provides is()
use Text::Unidecode;
is(unidecode("ÃÊÍÒÙ"), "AEIOU", 'example 1');
is(unidecode("âÊíÒÙ"), "aEiOU", 'example 2');
Rust:
use any_ascii::any_ascii;

#[test]
fn test_ex1() {
    assert_eq!(any_ascii("ÃÊÍÒÙ"), "AEIOU");
}

#[test]
fn test_ex2() {
    assert_eq!(any_ascii("âÊíÒÙ"), "aEiOU");
}
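For a sense of why this is hard to do right by hand, here is a rough standard-library sketch in Python. It is not how Unidecode or anyascii work, and the name strip_marks is made up for illustration. It decomposes each character with NFD normalisation and drops the combining marks, which is enough for the two examples but nothing like a full transliteration.

```python
import unicodedata


def strip_marks(text):
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # combining() returns 0 for base characters, non-zero for marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_marks("ÃÊÍÒÙ"))  # AEIOU
print(strip_marks("âÊíÒÙ"))  # aEiOU
```

Letters with no decomposition, such as "ø" or "ß", pass through unchanged, and non-Latin scripts are not transliterated at all; those are the cases the libraries exist to handle.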
Full code on GitHub.