I did Advent of Code again, and had a
really good time. Spoilers.
Language choices:
- 2018: very beginner Rust, floundered a lot, gave up after five days
- 2019: Perl, completed
- 2020: Rust, but switched to Perl after two days; completed
- 2021: Rust, completed
- 2022: PostScript, not quite completed
- 2023: Rust, completed
I've been trying to get better at Rust (for six years now!), and it
seemed like a good choice; it's the language I most enjoy writing in,
especially when a program is more than trivial. My particular goal
this time, apart from improving my fluency, was to find crates that
would be widely useful elsewhere.
This year's problems seemed to start off harder than usual,
particularly on the parsing side; in previous years several early
puzzles have been "one number per line" inputs, but none this time. On
the other hand the difficulty by the end seemed fairly in keeping with
previous years.
I started parsing with regexps as I normally would in Perl, but I've
been trying to get away from my Perlish mindset (which makes regexps
and/or hashes the answer to almost everything). I used the nom parser for a couple of days, but apparently its fork winnow is more actively bugfixed as well as rather faster, so I switched to that from day 6. Yeah, all right, I'm copying working code rather than fully understanding it, but I can at least produce working code.
I also ended up building plenty of custom structs and enums. This
seems to be the Rusty way to do things, even if one isn't going to go
fully OO.
I'll talk here about each day and my solutions. Should you want to
follow along, my code is at
Codeberg.
Day 1

Hardest day 1 I remember, and I think it would have been improved by
annotating line 2 of the example input:
eightwothree -> [8, 2, 3] -> 83
Still, it's not actively deceptive, there's always a conflict between
making the puzzle too easy and making it too hard, and I got it on the
second try.
Day 2

The parsing was the hard bit here. I should have used a parser
library.
One trick I noticed: I don't need to distinguish between comma and semicolon pair separators (you'll see the stubs of that in the regex design), because all I need is the doublets of (number, colour).
Day 3

This is another one that's mostly parsing, but the two things I want
are most easily parsed in different ways.
Numbers are hooked out with a regex, because matches give me a
convenient start and end position. They go in a list.
Symbols I snag with a simple scan across the characters of the string:
anything not a digit or . is a symbol. I could have done this with a
different regex, I guess. The coordinates go in a map of sets, so that
I can quickly check for the existence of a symbol in a specific space.
(I ended up using a lot of maps of sets this year.)
All coordinates start at 1, so that I can slop over the edge (e.g.
when scanning a number at the top left) without breaking the bounds of
an unsigned integer.
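In sketch form (my reconstruction, not the actual solution code), the symbol map looks something like this, with the 1-based offset making the over-the-edge scans safe:

```rust
use std::collections::{HashMap, HashSet};

fn main() {
    let grid = ["467..114..", "...*......", "..35..633."];
    // symmap: row number -> set of columns holding a symbol, so adjacency
    // tests are a cheap set lookup per cell.
    let mut symmap: HashMap<usize, HashSet<usize>> = HashMap::new();
    for (y, row) in grid.iter().enumerate() {
        for (x, c) in row.chars().enumerate() {
            if !c.is_ascii_digit() && c != '.' {
                // Coordinates offset by 1 so scans can slop over the edges
                // without underflowing an unsigned integer.
                symmap.entry(y + 1).or_default().insert(x + 1);
            }
        }
    }
    // Is there a symbol at row 2, column 4 (1-based)?
    let hit = symmap.get(&2).map_or(false, |s| s.contains(&4));
    println!("{hit}"); // true: the '*'
}
```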
With these two sets of data (numbers and symmap) I start the
actual problem. I look through the row above and the row below the
number, scanning for a symbol across each adjacent character position,
breaking out if I find one; if not, also look to immediate left and
right of the number. If a symbol has been found, add it to the total.
For part 2, the map of sets becomes a map of maps, because the only symbol I care about is the *, and each one I find gets a unique ID.
All the short-cuts are removed from the symbol scanning logic because
I want to find all symbols adjacent to each number. That symbol's
number list gets the number appended to it.
Then I look at each symbol's number list and hook out the ones with
length 2.
Day 4

This was my first "real world" nom code, which was why it took a
while. Guess how the input differs from the test file? More spaces
between "Card" and the card number. But now I know a little more
about how to find errors while using nom.
Of course I built a struct to represent the data on each card. Of
course I ended up not actually ever needing the ID field.
Sets (in Rust, HashSets) are clearly the right approach: they have a built-in intersection function, which I haven't often had a use for, but it's jolly handy here.
That's pretty much part 1 done, apart from a bitshift. Part 2 looks as
though it might be iterative, but in the end I store a list of card
values and counts (the latter starting at 1 each), then use the value
to see how many counts I should increment (by the count of the current
card).
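The intersection-plus-bitshift core of part 1 might look something like this sketch (my reconstruction, not the published code):

```rust
use std::collections::HashSet;

// Score one card: the match count comes from HashSet::intersection,
// and the score doubles per match, i.e. a single bit shift.
fn score(winning: &[u32], have: &[u32]) -> u32 {
    let w: HashSet<u32> = winning.iter().copied().collect();
    let h: HashSet<u32> = have.iter().copied().collect();
    let matches = w.intersection(&h).count() as u32;
    if matches == 0 { 0 } else { 1 << (matches - 1) }
}

fn main() {
    // Card 1 from the example: 4 matches, so 1 << 3 = 8 points.
    println!("{}", score(&[41, 48, 83, 86, 17], &[83, 86, 6, 31, 17, 9, 48, 53]));
}
```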
Day 5

This was a hard one, at least in part 2. Parsing with nom was reasonably straightforward, and I put together a custom data structure. (Is there a better way of mapping a list of strings to an enum? Looks as if the answer is in strum.)
Then it's just a matter of rattling down the chain remapping each value in turn. Easy enough.
Some people brute-forced part 2, but the OOM killer got me and I gave
up for the day: I could see how to do it, but didn't have the
enthusiasm to continue. Specifically, for each range in the current
input, check it against the ranges of the map to which it's being
subjected, and potentially break it into multiple ranges (up to three
per map line, if the map range is entirely included within the input
range) which will each be subject to different deltas.
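A sketch of that splitting step, under my own simplifications (one map line at a time, the standard library's Range):

```rust
use std::ops::Range;

// Split an input range against one map line's source range: the overlap is
// shifted by the delta, and the unshifted left/right remainders go back for
// further processing against the other map lines.
fn split(input: Range<i64>, src: Range<i64>, delta: i64) -> (Option<Range<i64>>, Vec<Range<i64>>) {
    let lo = input.start.max(src.start);
    let hi = input.end.min(src.end);
    if lo >= hi {
        // No overlap: the whole input passes through untouched.
        return (None, vec![input]);
    }
    let mut rest = Vec::new();
    if input.start < lo { rest.push(input.start..lo); }
    if hi < input.end { rest.push(hi..input.end); }
    (Some(lo + delta..hi + delta), rest)
}

fn main() {
    // Seeds 79..93 against the test map line "52 50 48" (src 50..98, delta +2).
    let (mapped, rest) = split(79..93, 50..98, 2);
    println!("{mapped:?} {rest:?}"); // Some(81..95) []
}
```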
Fiddly detail was made easier by the range_ext traits which let me casually test a range of numbers against the ranges in a series of maps—not to mention encouraging me to use the built-in Range class rather than mucking about with my own. (And while bottom-inclusive top-exclusive offends my sense of symmetry, it did involve fewer "+1"s and "-1"s, which are always easy to spray into the wrong place.)
Still a bit of a slog, but done!
Day 6

Today I shifted to winnow (because the author said I should), and did
the crucial thing of building up the parser step by step with known
good data so that it worked at each stage and so when it didn't work I
knew which bit was going wrong.
Clearly I don't want to count options. But the distance travelled for a race of time and a button press of press is

(time - press) * press

and I beat a distance record if

(time - press) * press - distance > 0

i.e.

-press² + press * time - distance > 0

a quadratic, which I can solve for its zero crossing points.
a = -1
b = time
c = -distance
roots are ( -time ± sqrt(time² - 4 * distance) ) / -2
Sadly, I'll have to use floating point for this. Which means I have to
consider a special case: what if a zero-crossing point falls exactly
on an integer? (I.e. I'm exactly matching the record.) Test race 3
provokes this: 10 and 20 seconds of pressing both produce a distance
of 200. So for the low root I round it up, then add one if the
rounded-up value equals the original value; for the high root, round
down and potentially subtract 1.
Then the number of record-beating runs is highroot - lowroot + 1.
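Putting the roots and the rounding together, something like this sketch (variable names are mine):

```rust
// Count record-beating button presses via the roots of
// -press^2 + press*time - distance = 0.
fn ways(time: f64, distance: f64) -> u64 {
    let disc = (time * time - 4.0 * distance).sqrt();
    let (r1, r2) = ((time - disc) / 2.0, (time + disc) / 2.0);
    // An exact-integer root only ties the record, so nudge it inwards.
    let mut lo = r1.ceil();
    if lo == r1 { lo += 1.0; }
    let mut hi = r2.floor();
    if hi == r2 { hi -= 1.0; }
    (hi - lo + 1.0) as u64
}

fn main() {
    // Test race 3: time 30, record 200 -> presses 11..=19 win.
    println!("{}", ways(30.0, 200.0)); // 9
}
```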
Because I've used this approach, part 2 is just a matter of tweaking
the parser (and jumping up to 64-bit ints, since my input at least was
a 48-bit number for distances). I get the impression that quite a few
people who didn't take the quadratic approach were able to brute-force
this.
Day 7

I got very Rusty with this one (by my standards).
Hand strengths go into an enum (low to high), which gets a bunch of
auto-derived traits so that I can sort on it later.
The actual hand is just a vec of u32s, found by indexing into the
string of card values. (I could have done another enum for them, but I
didn't feel like it.)
As I parse each line, I also derive the hand strength by throwing the card values into a Counter (basically a HashMap that auto-increments, so [a, a, b, b, b] would become (a: 2, b: 3)).
Then there's a bit of an ugly match where I switch on the number of
distinct values (i.e. count of items in that Counter): 1 is obviously
five of a kind, 4 is one pair, 5 is high card. 2 is four of a kind if
there are 4 of any one card, otherwise full house; similarly, 3 is
three of a kind if there are 3 of any one card, otherwise two pair.
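That match might look something like this sketch, using a plain HashMap in place of the Counter crate (my reconstruction, not the published code):

```rust
use std::collections::HashMap;

// Low to high, with derived traits so hands sort correctly later.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Strength { HighCard, OnePair, TwoPair, ThreeOfAKind, FullHouse, FourOfAKind, FiveOfAKind }

fn strength(hand: &[u32; 5]) -> Strength {
    let mut counts: HashMap<u32, u32> = HashMap::new();
    for &c in hand {
        *counts.entry(c).or_insert(0) += 1;
    }
    let max = *counts.values().max().unwrap();
    // Switch on the number of distinct card values.
    match counts.len() {
        1 => Strength::FiveOfAKind,
        2 => if max == 4 { Strength::FourOfAKind } else { Strength::FullHouse },
        3 => if max == 3 { Strength::ThreeOfAKind } else { Strength::TwoPair },
        4 => Strength::OnePair,
        _ => Strength::HighCard,
    }
}

fn main() {
    println!("{:?}", strength(&[10, 5, 5, 10, 5])); // FullHouse
}
```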
So at the end of parsing I have a vec of hands… which I then sort.
(There's probably a better way to sort on all the members of a vec in
order, but I don't know it, and these are always five entries long.)
The hands are now in ranked order.
So all I need to do is iterate through them and multiply the bid by
the index plus one.
Part 2: well, it turns out to be fairly straightforward. Drop the jack
to the start of the card list for ranking purposes; then make the
counter mutable, and strip out the number of jacks to a separate
variable.
Since in this hand system more matching cards is always better, I just
set the maximum number of any single card to the true maximum number
plus the number of jacks. (There's an all-jacks line in my puzzle
input, so a bit of extra logic is needed to calculate the highest counter value, as .max() on an empty list returns None.)
Then run it through the same logic (plus a new line for special case
zero, that all-jacks hand again) and all is done.
Day 8

(Deferred by Airecon NW.)

Wrestling with winnow today: couldn't get repeat() to work for me. I'll have it eventually.
Part 1 is basically trivial with the right data structures (I used a
HashMap of tuples).
For part 2 I was clearly looking for cycles, but this could have been
more complex than it was: I might take 300 steps to get from _A to _Z,
then only 200 steps to get from _Z back to _Z again forming a loop,
but as it turned out all the paths had these two values the same. I
ran each one for long enough to establish these cycles (30-40K each,
twice as long as the actual cycle, because I exited when I got to a
position of (node, state of cycle) that I'd previously seen), then
calculated the lcm of the results.
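The final combination is a simple lcm fold over the cycle lengths; a quick sketch with a hand-rolled gcd (a crate such as num-integer would do the same job):

```rust
fn gcd(a: u64, b: u64) -> u64 {
    if b == 0 { a } else { gcd(b, a % b) }
}

fn lcm(a: u64, b: u64) -> u64 {
    a / gcd(a, b) * b
}

fn main() {
    // The two ghost cycle lengths from the part 2 test data:
    let cycles = [2u64, 3];
    println!("{}", cycles.iter().copied().reduce(lcm).unwrap()); // 6
}
```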
There's a lot more cloning and copying than I'm happy with. I should
probably have converted the node strings into numerical ID values.
Day 9

Got a sequence parser thingy working in winnow, so now I can pull the space-separated list without manually wrangling the string parsing. Yay.
Pretty straightforward part 1, enough for me to use recursion, which
isn't usually my style. Basically, if all the values are equal, return
the same value again; otherwise, work out the differences and go down
another level.
Part 2: huh. I am actually a little shocked, given that weekend puzzles are usually a bit tougher. Well, it turned out that this was very simple indeed: in altering find_next to find_prev, all I have to do is change the recursing return.
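Both variants in one sketch (my reconstruction of the shape described, not the published code):

```rust
fn find_next(vals: &[i64]) -> i64 {
    // If all values are equal, the sequence continues with the same value.
    if vals.iter().all(|&v| v == vals[0]) { return vals[0]; }
    let diffs: Vec<i64> = vals.windows(2).map(|w| w[1] - w[0]).collect();
    vals.last().unwrap() + find_next(&diffs) // part 1
}

fn find_prev(vals: &[i64]) -> i64 {
    if vals.iter().all(|&v| v == vals[0]) { return vals[0]; }
    let diffs: Vec<i64> = vals.windows(2).map(|w| w[1] - w[0]).collect();
    vals.first().unwrap() - find_prev(&diffs) // part 2: only this line changes
}

fn main() {
    println!("{}", find_next(&[0, 3, 6, 9, 12, 15])); // 18
    println!("{}", find_prev(&[10, 13, 16, 21, 30, 45])); // 5
}
```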
Day 10

Now this got a bit meaty.
For part 1, a bit flags struct for the various directions. (Could have
done it with an int, but this way I'm constantly reminded of what
means what.) And a path note struct for x, y, length to here, and
direction I entered from.
Parse with winnow, repeat and one_of, then match the character to get a pair of directions. The start node gets all directions in a (vain) hope of avoiding special-casing later.
I probably should have extended the parser to report the Start position too, but ended up doing this with a plain string find.
Then it's a breadth-first search using a VecDeque. If there's no access into the current node in the direction we came from, delete it (fortunately there were no false trails longer than 1). Distance to a specific node gets recorded in depth, and when I find a node with an existing depth value I've got my midpoint.
All that stays in place for part 2. Then I process the rows by
changing any node that's not part of the critical path to one with no
exits.
To work out the squeezes, I double up the grid. Within each new row I
calculate open squeezes if:
- this spot I'm looking at is on the path
- this spot doesn't have an exit east
- the next spot to the east doesn't have an exit west
Between rows, I calculate similarly: the spot above is on the path, it
doesn't have an exit south, and the next spot south doesn't have an
exit north. Spots that are offset from both row and column originals
can be moved through freely (because we have no diagonal moves).
Accessible spots, whether they are in the original or not, are
represented by a '.'; other characters represent different sorts of
blockage.
Then I do an exhaustive search to tag every spot accessible from the
outside (each of which is turned to '#').
Finally, scan my doubled rows (only the odd numbers) to find and count
any remaining accessible spots.
Day 11

A fortuitous optimisation stood me in good stead.
I looked at the "expand the universe" phrasing, and it smelled bad. So
instead I build three sets:
- stars, (x, y) star coordinates
- somex, columns that contain at least one star
- somey, rows that contain at least one star
Then for each pair of stars (A, B), for x and y coordinates, start with a distance component of the difference between A and B, then add one for each value between them which doesn't appear in the relevant some set (somex or somey).
Then part 2 is simply a matter of replacing the "add one" with "add
999,999". Runs in about 3.8 seconds without optimisation for the
440-odd stars in my puzzle input.
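The per-axis distance might look like this sketch (axis_dist is my name; the sets are as described above):

```rust
use std::collections::HashSet;

// Distance along one axis: every empty row/column between the pair
// adds `extra` (1 for part 1, 999_999 for part 2).
fn axis_dist(a: i64, b: i64, some: &HashSet<i64>, extra: i64) -> i64 {
    let (lo, hi) = (a.min(b), a.max(b));
    let empties = (lo + 1..hi).filter(|v| !some.contains(v)).count() as i64;
    hi - lo + empties * extra
}

fn main() {
    let somex: HashSet<i64> = [0, 1, 4, 7].into_iter().collect();
    println!("{}", axis_dist(0, 7, &somex, 1)); // 11: columns 2, 3, 5, 6 are empty
    println!("{}", axis_dist(0, 7, &somex, 999_999)); // the part 2 scale
}
```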
Day 12

Phew! Toughest yet, and I did 13 and 14 while sitting on this.
I got part 1 working with code that couldn't be effectively optimised
for part 2, so ended up reworking it all from scratch.
I was inspired by, and effectively reused, Bill Mill's Python code, which I translated into Rust for a new part 1 solver. (In Python you can slice at the end of a string or list to get an empty string or list, which you can't in Rust, so I ended up rearranging the logic a bit.)
Then for part 2 I already had an input expander from my first attempt, and just had to find out the constraints on the memoize crate. (Yeah, I could have used a HashMap and done it by hand, but one of my goals this year is to learn more crates.)
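For illustration, here's the memoize crate's attribute macro on a toy recursion (not the day 12 solver; assumes memoize in Cargo.toml). The constraints are, roughly, that arguments must be owned, hashable values:

```rust
use memoize::memoize;

// The macro wraps the function in a cache keyed on its arguments,
// which must be Clone + Eq + Hash (so owned values, not references).
#[memoize]
fn collatz_len(n: u64) -> u64 {
    if n == 1 {
        return 0;
    }
    1 + if n % 2 == 0 { collatz_len(n / 2) } else { collatz_len(3 * n + 1) }
}

fn main() {
    println!("{}", collatz_len(27)); // 111; repeated subproblems come from the cache
}
```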
Day 13

(Back on schedule for this one.)
Relatively straightforward apart from the bounds checking. Duplicated
code, which is a shame, but it's the clearer way of doing it I think.
In either case, I build a range of x values that'll be valid in
reflection, then check each pair, dropping out if I get a failed
match. Then do the same for y. My input has only one symmetry line in each, but this could cope with more—just remove the breaks after if valid { and the if summary == 0 { wrapper.
For part 2, rather than trying every flip, I drop the break on first error and count the errors in this symmetry line. If it's zero, that's the answer to part 1; if it's one, that's right for part 2.
Day 14

Moving all movable rocks one step is a fairly straightforward row
traversal (skipping the top row). Calculating the load is similarly
fairly easy.
Part 2: well, whenever I see a stupidly large number in AoC I know I'm
not supposed to iterate that many times. So instead after each cycle I
store a copy of the state of the platform, and stop when I get to one
I've seen before. (In the test data, cycle 10 matched cycle 3.) That
means cycle silly_number will be the same as cycle
(silly_number - now) mod (cycle_length) + previous_matching_cycle
so I search through the cache and return the state from that cycle.
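A sketch of that cache logic with stand-in states (in the real solver the key is the whole platform state after each spin cycle):

```rust
use std::collections::HashMap;

fn main() {
    let silly_number: u64 = 1_000_000_000;
    // Stand-in states: as in the test data, cycle 10 repeats cycle 3.
    let states = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "c"];
    let mut seen: HashMap<&str, u64> = HashMap::new();
    for (i, &state) in states.iter().enumerate() {
        let now = i as u64 + 1; // 1-based cycle count
        if let Some(&prev) = seen.get(state) {
            // The formula from the text: (silly_number - now) mod len + prev.
            let len = now - prev;
            let target = (silly_number - now) % len + prev;
            println!("cycle {silly_number} matches cycle {target}"); // cycle 6
            break;
        }
        seen.insert(state, now);
    }
}
```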
Day 15

Wrestling with winnow took up a lot of time today, to the point that I ended up just parsing via string splitting.
The hash function is easy enough and that's most of the work for part
1.
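The puzzle's HASH function is small enough to quote as a one-liner:

```rust
// Add each character's ASCII value, multiply by 17, keep mod 256.
fn hash(s: &str) -> u32 {
    s.bytes().fold(0, |acc, b| (acc + b as u32) * 17 % 256)
}

fn main() {
    println!("{}", hash("HASH")); // 52, matching the puzzle's worked example
}
```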
For part 2, it was mainly a matter of picking a data structure.
Ideally I'd have liked a hash which would retain the order of keys
even if they were updated; I didn't find one, so given the small size
of the problem I just search through a vector in each box, then add or
remove elements in the right place.
A little fighting with lifetimes later (thus the change from &str to String) and it's done; it worked as soon as I'd finished writing it.
Day 16

I did this as a depth-first search (since I don't care about the time taken to reach a particular point) with two hashsets: energised, keyed on position, when the cell is struck by a beam, and beamseen, keyed on a (position, velocity) tuple, when the cell is struck by a beam going in a particular direction. (All velocities in this problem have magnitude 1, so I could just have called this "direction", but I didn't.)
To avoid worries about borders, I surround the character grid with a
blocking character. If a beam hits that, it's ignored and not
propagated further.
Then there's a switch on the character encountered in the new cell,
which may produce one new beam (open space, mirror, or splitter in
the wrong orientation) or two (splitter in the right orientation).
(Probably it would have been cleaner to set a new velocity then use
that to calculate the new position. Never mind.)
For part 2, that gets wrapped up in a function, and I simply run the
start coordinates (with appropriate velocity) along each edge, then
take the maximum value.
Day 17

This would have been quicker but I wanted to do a thing. Specifically I wanted to use the Rust pathfinding module.
Lots of functions today. A Dir enum (including Dir::None for the starting space); delta_x and delta_y functions to get the actual x and y movements from a Dir, and turn_left and turn_right to unify those calculations in one place.
Then the Pos structure gets not only x and y, but current direction (i.e. the direction from which this cell was entered), and the number of moves in a straight line. The same coordinates in a different direction, or with a different straight line length, will have a different set of successor positions.
Most of the work gets done in the successors method to Pos. We build a list:
- If we came from Dir::None, a move in each possible direction.
- If we've gone fewer than three in a straight line, a move in the same direction.
- In any case, two 90 degree turns.
Each of these results in one or more new Poses. Then we take a second pass to check that the new Pos is within the grid, and if so stick it on the actual output list with its cooling value.
From there I can just call dijkstra with the starting position, the successors function, and the goal condition—which is that the x and y coordinates match, because we don't care about direction and straightness at the end.
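The call shape looks like this, here on a toy one-dimensional "grid" rather than the day 17 solver (assuming the pathfinding crate):

```rust
use pathfinding::prelude::dijkstra;

fn main() {
    let costs = [0u32, 4, 1, 2, 9, 3];
    let goal = costs.len() - 1;
    // dijkstra takes a start node, a successors closure returning
    // (node, cost) pairs, and a goal predicate.
    let result = dijkstra(
        &0usize,
        |&i| {
            // Successors: step one cell left or right, paying the cost
            // of the cell we enter.
            let mut next = Vec::new();
            if i + 1 < costs.len() {
                next.push((i + 1, costs[i + 1]));
            }
            if i > 0 {
                next.push((i - 1, costs[i - 1]));
            }
            next
        },
        |&i| i == goal,
    );
    println!("{result:?}"); // Some(([0, 1, 2, 3, 4, 5], 19))
}
```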
The hardest bit, in fact, is that row value lookup. I'm establishing the values in the parse stage, but I want to be able to reference them from down in the guts of Pos. This in turn means I need a mutable global variable; which Rust hates, for good and sufficient reasons, but that's why there's a sprinkling of unsafe all over the place. (I was sent a trick for fixing that, which I used on day 21.)
Because I've laid all this out in a fairly straightforward way, part 2 just needs me to change the latter two list conditions:
- If we've gone less than 10 in a straight line, a move in the same direction.
- If we've gone at least 4, a 90 degree turn.
Day 18

Well done, Eric, you fooled me.
My immediate thought on part 1 was the shoelace formula. But no, I thought, that'll be too hard when it comes to considering which pit segments are coloured how and I'll have to write it all again. So I did a long and over-complex grid layout calculation.
Then part 2 wasn't "consider the colour of each pit", it was "waaaay
bigger numbers".
After that I went back and rewrote part 1 with the same code.
(Different parser modules, same processing.)
The complicated part here is that the pits are of non-zero width. The shoelace formula is in effect counting the area enclosed by the midpoints of each pit; so I add to that half the perimeter length, plus one (Pick's theorem).
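A sketch of that calculation, assuming the dig plan has been reduced to signed (dx, dy) steps:

```rust
// Shoelace area plus half the perimeter plus one (Pick's theorem
// rearranged: total = interior + boundary = A + B/2 + 1).
fn lagoon_size(steps: &[(i64, i64)]) -> i64 {
    let (mut x, mut y) = (0i64, 0i64);
    let (mut area2, mut perim) = (0i64, 0i64);
    for &(dx, dy) in steps {
        let (nx, ny) = (x + dx, y + dy);
        area2 += x * ny - nx * y; // shoelace cross product (doubled area)
        perim += dx.abs() + dy.abs();
        (x, y) = (nx, ny);
    }
    area2.abs() / 2 + perim / 2 + 1
}

fn main() {
    // A 2x2 loop of trench: a 3x3 block of pits, i.e. 9.
    println!("{}", lagoon_size(&[(2, 0), (0, 2), (-2, 0), (0, -2)]));
}
```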
Day 19

Another Tuesday, another tough one. Part 1 was great fun, made easy of
course by the parsing library and lots of custom structures.
Part 2 was, well, day 5 all over again, though without the same
complexity of range overlaps. I bashed against this for a while, but
ended up settling on a depth-first search carrying a node name and a
structure of ranges. Inequality testing is replaced by partitioning
one of those ranges.
Day 20

As with day 19, I found part 1 more fun than part 2. I got to build lots of custom data structures, and then run a BFS queue to move all
the pulses around and set input states.
(This time I converted the alphabetic IDs into numeric values for ease
of indexing.)
For part 2, I took a punt and looked at the network layout.
Specifically, the thing that feeds rx is a Conjunction module, so I
look for cycles in the things that feed it. I count cycles between
instances of the high pulse being sent to each input of that upstream
Conjunction, assume those cycle lengths are stable (the same
optimisation as day 8), and lcm the lengths together. (Thus taking
about 1 second to run rather than several hundred years.)
Day 21

Part 1 was good fun - mostly a matter of setting up data structures so that I could use Dijkstra. dijkstra_all gives me the shortest path to every cell on the grid, which is the number of cycles it's taken to reach it.
This is a hard-coded answer: the number of cycles has to be even
(otherwise the test for cycle number would have to select odd ones,
and I wouldn't add 1 for the starting cell; see part 2 for a fuller
version which has to cope with both odd and even numbers).
Part 2 won't work with the Dijkstra module, because the dijkstra_all function solves for the shortest path to every cell on the grid, which doesn't work well on an infinite grid.
So this needs some geometrical insight:
- the wavefront can advance no faster than 1 per cycle;
- because of the empty row and column aligned with the start point, and the diamond-shaped gap in the input grid, it will advance that fast (this is not true of the test data!);
- every filled subgrid will contribute the same number of cells, and that will increase quadratically each "great cycle", i.e. cycles equal to the grid height.
So I need to take three measurements, the number of visited cells half-way through each of the first three great cycles (i.e. when the wavefront is on the edge of a new cell), which means the supergrid is a more manageable 7×7 subgrids. Dijkstra once, and count the cells present at each of the cycle values. From those measurements I build a quadratic equation with great cycle values (1, 2, 3), work out the great cycle value of the total number of steps (again, it's half way through a great cycle), and insert that to read off the answer.
(After a lot of off-by-1 errors which got magnified in the
calculation…)
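The extrapolation itself is tiny; a sketch with made-up measurements (the fit relies on a quadratic's constant second differences):

```rust
// Fit a*n^2 + b*n + c through (1, y1), (2, y2), (3, y3) and evaluate at n.
fn extrapolate(y1: i64, y2: i64, y3: i64, n: i64) -> i64 {
    // Second differences of a quadratic are constant and equal to 2a.
    let a = (y3 - 2 * y2 + y1) / 2;
    let b = y2 - y1 - 3 * a;
    let c = y1 - a - b;
    a * n * n + b * n + c
}

fn main() {
    // With f(n) = 2n^2 + 3n + 1: f(1)=6, f(2)=15, f(3)=28, so f(10)=231.
    println!("{}", extrapolate(6, 15, 28, 10));
}
```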
Day 22

A bit of a relief after 20 and 21. Parsing with winnow of course, and (perhaps informed by 21) I have a vec of blocks rather than trying to lay out the whole grid. Each block has an x, y and z range, and a list (well, a hashset) of things it rests on and another of things it supports (both initially blank).
Then it's a matter of dropping each brick, and my secret weapon is to
sort them by Z value before dropping each one as far as it will go.
(Because they're all cuboid, we won't get any shapes that interfere
with each other.) Here's where day 5 comes in handy again, because I've already found the range_ext crate that gives me a nice easy intersection test. So the drop procedure will lower each block until either it hits the ground or its ranges (all of them!) intersect with another block, then take one step back up; if the ranges intersect I store those blocks in its restson field.
Then look up all the restson values to populate the corresponding supports values.
Then, for each block, look at all its supports. If any of them has only one restson, i.e. this block, it'll fall if this block is removed. That's part 1.
For part 2 I run a queue. For each brick I clone the structure, then add that specific brick to the removal queue. As an item comes off the queue, I go through its supports and remove it from their restson fields; if there aren't any left, that brick will fall, and it gets added to the queue.
I thought this might need memoising, given that part 1 was a bit slow,
but it only took 7 seconds in debug mode Rust, 1.2 in release mode. It
would probably still help, but I don't actually generate a definitive
fall value for anything except the brick at the root of the collapse.
Perhaps if I sorted them in descending post-fall Z order?
Day 23

Here I fell a bit behind after an extremely tiring Saturday; I
finished this one on Christmas Day, and 24 and 25 on Boxing Day.
Well, I quite enjoyed part 1.
Thought about winnow parsing, but just collecting the chars into a grid structure was fine. Slightly unusual search constraint but a BFS worked all right. Done.
Then part 2 came along, and the BFS doesn't work as well when there are possible optional loops. This time I built Coord and Pathgrid structs containing valid nodes (still no winnow, just the lovely match_indices). And a multi-pass approach:
- Find "intersections", defined as path points with at least 3 exits.
- Group those intersections, i.e. find pairs which can be visited without going through any other intersection in between. (And since my BFS here gives directional results, set up reverse paths too. Yeah, should have put this part under the next point.)
- Evaluate maximum path costs between each set of intersections. Lots more BFSes, each one basically a part 1 solver in miniature.
- Use the network of intersections in a final BFS to generate the most expensive path from start to goal.
Day 24

Part 1 was a basic intersecting lines problem, with the small additional constraint of the intersection needing to happen at time > 0. Fair enough, quite fun.
Then came part 2, and while I remember doing all this algebra at school it's pretty rusty, so I rather lost enthusiasm. Thanks to cideM for the Gaussian elimination code at github, which I reused more or less wholesale.
Day 25

Well this was a bit perverse, and I don't really count it as a solution in Rust. Instead, I converted each dataset into GraphViz format (via program a, but it's pretty trivial), then ran it through neato, one of the many ways of getting a graph that's arranged automatically.
Looking at the output (hint: use one of the vector-mode output formats so that you can zoom in effectively!), it was very easy to see where the links should be cut.
Then I wrote program b, which hard-codes the link cuts (the line in the published code is for the test dataset), builds a map out of the input and traverses it from an arbitrary node to see how many nodes are left in its partition—combine that with the total map size to get the answer.
Conclusions
What I learned:
- Option values and how to handle them
- custom enums and structs, a bit more
- references, a tiny bit more
- the built-in Range class (and range_ext if intersecting them)
- the winnow parser crate (a hidden benefit of this was that I knew I had a working parser before I tackled the actual problem, and the library makes it difficult to build a parser that works but returns wrong values)
- the pathfinding crate
- the memoize and bitflags crates (just once each)
My revised list of things to be able to do in your language of choice
if preparing for an AoC:
- string parsing, with regexps or combinators or whatever works, especially breaking down a string into specific arguments;
- multidimensional arrays (or whatever your language uses to substitute for them);
- more generally than that, coordinates. Yeah, a basic list of rows each containing a list of cells will work, but it would have been handy to have a prepared way of dealing with directions, direction changes, bounds checking, and so on, to the extent that I may write a Grid class for next year;
- depth-first and breadth-first search;
- Dijkstra or A* pathfinding (Dijkstra has always been enough for me but A* would probably be handy to know); it might well be worth finding or writing or learning a generic pathfinding module with lots of hooks so that you can have that separate from the specifics of the problem;
- Conway's Life (it'll probably be back next year);
- modular arithmetic at least as far as the Chinese Remainder Theorem (again, not this year but it'll probably be back).
Puzzles sorted by Part 2 difficulty: this percentage is the proportion of people who have successfully solved part 1, but not part 2, which is the only public information about completion rates. (Based on a snapshot at the time of writing, so values may have changed.)
- 6: 2%
- 9: 2%
- 16: 3%
- 11: 4%
- 2: 4%
- 17: 5%
- 22: 7%
- 7: 8%
- 15: 9%
- 13: 12%
- 4: 12%
- 3: 13%
- 14: 17%
- 8: 17%
- 18: 18%
- 20: 21%
- 23: 21%
- 1: 24%
- 10: 26%
- 19: 26%
- 5: 28%
- 12: 33%
- 24: 37%
- 21: 45%
(Day 25 doesn't count, as part 2 is essentially "have the other 49
stars".)
Thanks as always to Eric Wastl and the AoC team.