I’ve been doing the Weekly
Challenges. The
latest
involved generating SVG and fitting lines to points. (Note that this
is open until 22 May 2022.)
Task 1: Scalable Vector Graphics (SVG)
Your task is to accept a series of points and lines in the following format, one per line, in arbitrary order:
For anything more than these small examples I'd have used a dedicated
XML generator (I've used Perl's SVG
library before but found it too
restrictive, while XML::LibXML
handles the fiddly stuff while still
letting me tweak details, and doesn't simply force me to learn a
different version of the same syntax). But since I don't have to worry
about escaping characters in this case, and since I'll have to write
the raw code anyway for languages that don't have an XML library…
I've borrowed the styles from the task 2 example.
Raku:
my @points;
my @lines;
my @x;
my @y;
Read the format, split on commas; we should have either two or four
numbers. Put all the X and Y coordinates into lists of their own,
which I'll use for bounds calculation, and the lines and points into
lists of lines and points.
for lines() {
.chomp;
my @f=split(",",$_,:skip-empty);
for 0..@f.list -> $i {
if ($i % 2 == 0) {
push @x,@f[$i];
} else {
push @y,@f[$i];
}
}
if (@f.elems == 4) {
push @lines,@f;
} elsif (@f.elems == 2) {
push @points,@f;
} else {
die "Invalid input $_\n";
}
}
Bounds calculation. Minimum and maximum X and Y values.
my $mnx=min(@x);
my $mxx=max(@x);
my $mny=min(@y);
my $mxy=max(@y);
My actual bounds go outside these by 10%.
my @lo=(
$mnx-($mxx-$mnx)/10,
$mny-($mxy-$mny)/10,
);
my @hi=(
$mxx+($mxx-$mnx)/10,
$mxy+($mxy-$mny)/10,
);
Which gives me overall width and height.
my $w=(@hi[0]-@lo[0]);
my $h=(@hi[1]-@lo[1]);
At this point I can emit the preamble, which includes width and
height, and I'm setting up a viewbox as well. (So the "page" will show
everything I'm plotting.)
say '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>';
say '<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN" "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd">';
say "<svg width=\"$w\" height=\"$h\" viewBox=\"@lo[0] @lo[1] $w $h\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:svg=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">";
Plot the lines.
if (@lines) {
say " <g id=\"lines\" stroke=\"#369\" stroke-width=\"4\">";
for @lines -> @p {
say " <line x1=\"@p[0]\" y1=\"@p[1]\" x2=\"@p[2]\" y2=\"@p[3]\" />";
}
say " </g>";
}
Plot the points (because a line can obscure a point more easily than a
point can obscure a line).
if (@points) {
say " <g fill=\"#f73\" id=\"points\">";
for @points -> @p {
say " <circle cx=\"@p[0]\" cy=\"@p[1]\" r=\"3\" />";
}
say " </g>";
}
say "</svg>";
Task 2: Line of Best Fit
When you have a scatter plot of points, a line of best fit is the
line that best describes the relationship between the points, and is
very useful in statistics. Otherwise known as linear regression,
here is an example of what such a line might look like:
The method most often used is known as the least squares method, as
it is straightforward and efficient, but you may use any method that
generates the correct result.
Calculate the line of best fit for the following 48 points:
I produce this on standard output, which can then be fed to the
standard input of task 1.
Given this explicit invitation to try non-standard line fits, I
decided to implement the Theil-Sen
Estimator:
the slope of the best-fit line is the median of the slopes of each
pair of points, while the offset is the median of the deviations from
that line.
I used the naïve method of calculating the median; Hoare gives us the
"quickselect" algorithm which works by partitioning the set, but that
seemed like too much work to implement.
Lua:
function median(s0)
local s = s0
table.sort(s)
return s[math.floor(#s/2)+1]
end
I'll want to store all the points.
local points = {}
Read the lines, break them down, and get the numbers. (This varies a
lot between languages; this input is in a nasty format, particularly
considering the multiple spaces.)
while true do
local line = io.read()
if line == nil then break end
local f = {}
for x,y in string.gmatch(line,'(%d+),(%d+)') do
print(string.format('%d,%d',x,y))
table.insert(points,{x,y})
end
end
Calculate the slope between each pair of points (that don't have
identical X-values).
local slope = {}
for i = 1,#points-1 do
for j = i+1,#points do
if points[i][1] ~= points[j][1] then
table.insert(slope,
(points[j][2]-points[i][2])/(points[j][1]-points[i][1])
)
end
end
end
local m = median(slope)
Calculate offsets to get the Y-intercept. (Also populate x
that I'll
use later.)
local offset = {}
local x = {}
for i,p in pairs(points) do
table.insert(offset,p[2] - m * p[1])
table.insert(x,p[1]+0)
end
local c = median(offset)
m
and c
are the actual useful values, but we want to plot a graph
with them. For minimum and maximum X values, calculate Y values, and
output the set as line coordinates for the plotter.
local l = {}
table.sort(x)
for i,xb in pairs({x[1],x[#x]}) do
table.insert(l,xb)
table.insert(l,xb * m + c)
end
print(string.format('%f,%f,%f,%f',table.unpack(l)))
Full code on
github.
Comments on this post are now closed. If you have particular grounds for adding a late comment, comment on a more recent post quoting the URL of this one.