I did Advent of Code again, and once again had a really good time.
Spoilers follow.
Carrying on from where I left off last post, I'll talk here about
each day and my solutions. Should you want to follow along, my code is
at Codeberg.
Part 1 screams "HashSet" to me. (In fact there might have been an even
easier way to do it, just count all the splitters and assume each one
gets used, but I'm glad to say this isn't the case even in the test
data.)
I keep a set of active beam X-coordinates (initialised when I find the
S). For each row, I make a new copy of that set; then each time I find
a splitter that overlaps with an active beam, I increment the splits
count, delete that beam and add two more at (x - 1) and (x + 1). (The
set takes care of deduplication.) There are some potential undefined
behaviours: what happens if a splitter is at the edge of the field,
and what happens if it's next to another splitter? But neither of
these happens in either the test or the live data. Then at the end of
the row calculations I move the new set back to the old to be ready
for the next row. It's a sort of one-dimensional cellular automaton.
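In outline, the per-row update might look something like this (a
sketch only; the function and its argument names are illustrative, not
my actual code):

```rust
use std::collections::HashSet;

// One row of the automaton: `beams` is the set of active beam
// X-coordinates, `splitters` the splitter X-coordinates on this row.
fn step_row(beams: &HashSet<i64>, splitters: &[i64], splits: &mut u64) -> HashSet<i64> {
    // Work on a copy so every splitter sees the beams as they were
    // at the start of the row.
    let mut next = beams.clone();
    for &x in splitters {
        if beams.contains(&x) {
            *splits += 1;
            next.remove(&x);
            // The set deduplicates if two split products coincide.
            next.insert(x - 1);
            next.insert(x + 1);
        }
    }
    next
}
```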
For part 2 I upgrade the set to a map. Start with a value of 1 for the
initial beam; then each split adds the splitting beam's count to the
(x - 1) and (x + 1) entries, with a default initial value of zero.
Someone talked about memoisation and if I were trying to trace every
path (e.g. with BFS/DFS) I'd probably have to do something like that;
but there is no meaningful distinction between paths once they arrive
at a given point, so a count is all I need.
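The map version of the row update might be sketched like this
(illustrative names again; the key point is that the whole count
arriving at a splitter carries through to both children, which is what
makes the totals path counts):

```rust
use std::collections::HashMap;

// `beams` maps beam X-coordinate to the number of paths arriving there.
fn step_row(beams: &HashMap<i64, u64>, splitters: &[i64]) -> HashMap<i64, u64> {
    let mut next = beams.clone();
    for &x in splitters {
        if let Some(&n) = beams.get(&x) {
            next.remove(&x);
            // All n paths continue both left and right; missing
            // entries default to zero.
            *next.entry(x - 1).or_insert(0) += n;
            *next.entry(x + 1).or_insert(0) += n;
        }
    }
    next
}
```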
I thought this was going to be a spanning tree, but it turned out to
be connected components. Quite a lot of data wrangling to get there,
though.
Step 1, get the coordinates. Step 2, generate a list of distances
between each pair.
It's good to see an actual straight line distance in one of these;
usually they use Manhattan distance, which I find rather dull. Of
course, because I'm comparing the distances, I don't need to take the
square root.
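Step 2 might be sketched like this, assuming 3-D coordinates parsed
into tuples (the function name is illustrative):

```rust
/// Squared straight-line distance between every pair of points;
/// squared values sort the same way as true distances, so no sqrt.
fn all_pairs(points: &[(i64, i64, i64)]) -> Vec<(i64, usize, usize)> {
    let mut pairs = Vec::new();
    for i in 0..points.len() {
        for j in i + 1..points.len() {
            let (ax, ay, az) = points[i];
            let (bx, by, bz) = points[j];
            let (dx, dy, dz) = (ax - bx, ay - by, az - bz);
            pairs.push((dx * dx + dy * dy + dz * dz, i, j));
        }
    }
    pairs
}
```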
Step 3, sort that list and take the N shortest (10 for the test case,
1000 for the live input). Build a map of edges (ignoring distances now,
but bidirectional, for greater efficiency at the cost of memory, since
I need to look up neighbours from either end). Also build a list of
live nodes; I probably didn't need to do this.
Then throw the lists at connected_components from the pathfinding
crate, segment what I have, and pull off the three highest values.
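Sketching that whole stage (the connected_components call is from the
pathfinding crate; the wrapper function and its names are mine):

```rust
use std::collections::{HashMap, HashSet};
use pathfinding::prelude::connected_components;

// `pairs` is the output of all_pairs above; keep the n shortest,
// build a bidirectional edge map, and take the three largest
// component sizes.
fn three_largest(mut pairs: Vec<(i64, usize, usize)>, n: usize) -> Vec<usize> {
    pairs.sort_unstable();
    pairs.truncate(n);
    let mut edges: HashMap<usize, HashSet<usize>> = HashMap::new();
    for &(_, i, j) in &pairs {
        edges.entry(i).or_default().insert(j);
        edges.entry(j).or_default().insert(i);
    }
    let nodes: Vec<usize> = edges.keys().copied().collect();
    let components = connected_components(&nodes, |node| {
        edges.get(node).cloned().unwrap_or_default()
    });
    let mut sizes: Vec<usize> = components.iter().map(|c| c.len()).collect();
    sizes.sort_unstable_by(|a, b| b.cmp(a));
    sizes.truncate(3);
    sizes
}
```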
For part 2, the start is similar, though I explicitly don't want to
keep the list of live nodes because I care about what happens to all
of them. connected_components now moves inside the loop, and I have
to consider the possibility of a node with no neighbours. (I suspect
I'm losing a lot of speed here, but it's fast enough for today's
problem.) Once all nodes are connected, I look at the nodes making up
the latest connection to get my answer.
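That loop, sketched (illustrative names; the empty default neighbour
set covers the node-with-no-neighbours case):

```rust
use std::collections::{HashMap, HashSet};
use pathfinding::prelude::connected_components;

// Add edges shortest-first until everything is one component; `pairs`
// is already sorted ascending by squared distance.
fn last_link(pairs: &[(i64, usize, usize)], node_count: usize) -> Option<(usize, usize)> {
    let nodes: Vec<usize> = (0..node_count).collect();
    let mut edges: HashMap<usize, HashSet<usize>> = HashMap::new();
    for &(_, i, j) in pairs {
        edges.entry(i).or_default().insert(j);
        edges.entry(j).or_default().insert(i);
        let comps = connected_components(&nodes, |n| {
            edges.get(n).cloned().unwrap_or_default()
        });
        if comps.len() == 1 {
            return Some((i, j)); // the link that finished the job
        }
    }
    None
}
```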
(Might spanning tree be the answer after all? Perhaps, but I don't
think it would readily give me the latest link.)
First one that's needed serious effort this year. (No complaints.)
Part 1 is more or less just the coordinate parsing, fairly trivial.
Part 2 is quite crunchy. The number of cells to check is large. And
winding numbers don't work reliably because the border is of non-zero
width.
Something similar came up in an Everybody Codes problem this year and
I ended up compressing the coordinates, i.e. paying attention only to
the rows and columns that contain at least one corner, so that for
example the test case
..............
.......#XXX#..
.......X...X..
..#XXXX#...X..
..X........X..
..#XXXXXX#.X..
.........X.X..
.........#X#..
..............
ends up looking like
.#X#
##.X
#X#X
..##
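The compression itself is just a matter of ranking the interesting
coordinates; a sketch, with `corners` as an illustrative input:

```rust
use std::collections::{BTreeSet, HashMap};

// Map each coordinate that contains a corner to its rank in sorted
// order; everything between two consecutive ranks behaves identically.
fn compress(corners: &[(i64, i64)]) -> (HashMap<i64, usize>, HashMap<i64, usize>) {
    let xs: BTreeSet<i64> = corners.iter().map(|&(x, _)| x).collect();
    let ys: BTreeSet<i64> = corners.iter().map(|&(_, y)| y).collect();
    let xmap = xs.iter().enumerate().map(|(i, &x)| (x, i)).collect();
    let ymap = ys.iter().enumerate().map(|(i, &y)| (y, i)).collect();
    (xmap, ymap)
}
```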
I still need to mark the "inside" spaces, so I ended up first marking
the edges ("X" above), then looking for an easy-to-spot inside space
(first occurrence on a line of a filled space followed by an empty
space followed by a filled space, which might not always occur but if
it does it's definitely inside the shape). Then I flood-fill from
there.
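The flood fill is nothing fancy; a sketch, assuming the compressed
grid is held as rows of bytes:

```rust
// Iterative flood fill from a cell known to be inside: b'.' is empty,
// anything else is an edge or already marked.
fn flood(grid: &mut [Vec<u8>], sx: usize, sy: usize) {
    let (h, w) = (grid.len(), grid[0].len());
    let mut stack = vec![(sx, sy)];
    while let Some((x, y)) = stack.pop() {
        if grid[y][x] != b'.' {
            continue;
        }
        grid[y][x] = b'I'; // mark as inside
        // wrapping_sub makes 0-1 fail the bounds test rather than panic
        for (nx, ny) in [
            (x + 1, y), (x.wrapping_sub(1), y),
            (x, y + 1), (x, y.wrapping_sub(1)),
        ] {
            if nx < w && ny < h {
                stack.push((nx, ny));
            }
        }
    }
}
```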
Then it's back to looking at each pair of corners, and evaluating all
spaces contained in the rectangle they define to ensure that they are
in the red/green tile set. (I only do the evaluation if the area would
be larger than one I've already validated, which probably saves a bit.)
Those of you who've followed my Everybody Codes entries this year will
notice that I got the coordinate compression a bit neater, by defining
a separate struct for CoordCompressed so that the type system would
warn me if I tried to use one as the other.
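The newtype idea in miniature (field names illustrative):

```rust
// Distinct types for raw and compressed coordinates, so the compiler
// rejects any attempt to mix them up.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct Coord { x: i64, y: i64 }

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct CoordCompressed { x: usize, y: usize }

// A function taking Coord now can't silently be handed a
// CoordCompressed, and vice versa.
```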
This still isn't as fast as I'd like but it's at least credible.
I had a lot of fun with part 1, rendering everything as BigInt
bitfields and then doing a breadth-first search with XORs. (Clearly,
it's only meaningful to press each button once at most.)
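A sketch of that search, using BigUint from the num crate (the
function shape is illustrative, and the puzzle specifics are elided):

```rust
use num::BigUint;
use std::collections::{HashSet, VecDeque};

// Each button is a bitmask; pressing XORs it into the state, and
// pressing twice cancels out, so BFS over distinct states finds the
// minimum number of presses.
fn min_presses(start: &BigUint, target: &BigUint, buttons: &[BigUint]) -> Option<usize> {
    let mut seen: HashSet<BigUint> = HashSet::new();
    let mut queue: VecDeque<(BigUint, usize)> = VecDeque::new();
    seen.insert(start.clone());
    queue.push_back((start.clone(), 0));
    while let Some((state, presses)) = queue.pop_front() {
        if &state == target {
            return Some(presses);
        }
        for b in buttons {
            let next = &state ^ b;
            if seen.insert(next.clone()) {
                queue.push_back((next, presses + 1));
            }
        }
    }
    None
}
```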
Then it became clear that part 2 was (a) a system of linear equations and
(b) not susceptible to short cuts (e.g. eliminating one variable at a
time), so I had to find a suitable solver (i.e. one documented for
normal people rather than mathematicians who already fully understand
the problem domain and what needs to be tweaked). ndarray-linalg
made no sense to me at all and the example code doesn't even work in
the current version, but microlp got the job done even if it does
have to work in floating point.
I can constrain the variables slightly, though I probably don't need
to; for example, a button that increments field 0 can't be pressed
more than (final value of field 0) times.
Then I run through the goal fields and for each one add an equation
based on the variables that affect it.
Finally fire up the library to do the hard part.
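Putting that together with microlp looks roughly like this (the API as
I understand it from its documentation; the variables and numbers are
invented for illustration):

```rust
use microlp::{ComparisonOp, OptimizationDirection, Problem};

fn main() {
    // Minimise the total number of presses.
    let mut problem = Problem::new(OptimizationDirection::Minimize);
    // One variable per button, with the loose bound mentioned above:
    // no button need be pressed more than five times if the field it
    // feeds must finish at 5.
    let a = problem.add_var(1.0, (0.0, 5.0));
    let b = problem.add_var(1.0, (0.0, 5.0));
    // Field 0 is incremented by both buttons and must finish at 5.
    problem.add_constraint(&[(a, 1.0), (b, 1.0)], ComparisonOp::Eq, 5.0);
    let solution = problem.solve().unwrap();
    // Everything comes back in floating point, so round to get counts.
    println!("a: {}, b: {}", solution[a], solution[b]);
}
```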
I got a step up on this by knowing about the pathfinding crate, so
that I had an efficient path counter available. From there it was a
matter of massaging the data into an appropriate shape.
And one trick I've used in AoC before is to take short strings and
convert them to base 36 numbers. (Base 26 from A to Z would work if,
as in this case, there are no digits, but base 36 is built into the
language.) An unsigned 16-bit integer will fit three letters, as we
have here, and a 32-bit one will fit six. (If I needed to get the
values out again, e.g. for debugging, I'd need to use the radix_fmt
crate, because formatting in an arbitrary base isn't a built-in
feature.) Granted, most languages aren't
as fiddly as Rust when it comes to passing strings about the place,
but I suspect all of them are faster with simple numbers.
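The packing trick in miniature (the function name is mine):

```rust
// u16 fits three base-36 digits: 36^3 - 1 = 46655 < 65535.
fn pack(name: &str) -> u16 {
    // from_str_radix accepts 0-9 and a-z (case-insensitively) in base 36.
    u16::from_str_radix(name, 36).expect("up to three alphanumerics")
}
```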
Anyway, I build that into my standard directed graph structure, a
HashMap of HashSets, and then pour it into the path counter.
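pathfinding provides count_paths, which is what I mean by the path
counter; a sketch of the wrapper (the helper name is mine):

```rust
use std::collections::{HashMap, HashSet};
use pathfinding::prelude::count_paths;

// Count distinct paths from start to end through a directed graph
// held as a HashMap of HashSets of packed node ids.
fn paths(graph: &HashMap<u16, HashSet<u16>>, start: u16, end: u16) -> usize {
    count_paths(
        start,
        |n| graph.get(n).into_iter().flatten().copied(),
        |&n| n == end,
    )
}
```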
For part 2 I was concerned about possible loops, which will obviously
break a path counter, but I hoped for the best and it worked. Each
overall path can be broken down into three steps: start to first
internode, first internode to second internode, second internode to
end; first internode will be either dac or fft, and second internode
will be the other one.
So for each ordering of the internodes I calculate the number of paths
in each leg, then multiply them together. For example, in this
network:
  B   J
 / \ / \
A-C-I-K-Z
 \ / \ /
  D   L
there are three paths from A to I, and three paths from I to Z, so
there are nine distinct paths overall. This has the additional
advantage that if there are no paths between a particular pair of
nodes - as in the test data from dac back up to fft - I can skip
any further calculations for that ordering of internodes.
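In code, roughly (reusing the paths helper above; start, end, fft and
dac hold the packed node ids):

```rust
// Try both orderings of the internodes, bailing out of an ordering as
// soon as any leg has no paths at all.
fn total_paths(
    graph: &HashMap<u16, HashSet<u16>>,
    start: u16, end: u16, fft: u16, dac: u16,
) -> usize {
    let mut total = 0;
    for &(first, second) in &[(fft, dac), (dac, fft)] {
        let a = paths(graph, start, first);
        if a == 0 { continue; }
        let b = paths(graph, first, second);
        if b == 0 { continue; }
        total += a * b * paths(graph, second, end);
    }
    total
}
```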
Indeed, inspecting the network diagram (generated with graphviz and
neato), it's clear how the solution falls out: in the test data,
there's only one path from svr to fft, only one from fft to
dac, and then two from dac to out. (The Codeberg repository
includes the quick Perl script d2dot used to generate the .dot file, and the
testB.svg result from feeding the test data to dot.)
The same pattern shows up in my full data: there are six distinct
clumps of nodes with smaller connecting groups between them. fft is
in the second and dac in the fifth, and there's no way of getting
back from dac to fft.

(This was probably inevitable. A path counter will break if there
are any possible loops, and if each of dac and fft can reach the
other there must be at least one.)
This is clearly a Really Hard problem. I don't just mean for me, I
mean (having read a bit about packing problems) probably NP-hard. I
could obviously parse the shapes and write a search, but I think the
search space would be a Bit Huge. (8 potential configurations of each
shape under flipping and rotation, though symmetry might reduce that;
300-ish total shapes to fit under each tree, 100+ top-left positions
in each tree grid… 8 ^ 300 is rather a lot, even if each shape can be
expressed as a 9-bit number.)
So maybe Eric knows this is a hard problem. I mean, Eric's a smart
guy, right? Let's try winnowing it down a bit. Maybe some
precalculated rectangles?
There's an obvious upper bound on what I can fit into one tree: if
it's X×Y, I can't put more than X×Y # marks into it even if I chop them
up and pour them in individually (i.e. they all fit together
perfectly). Let's try that to cut down on the calculation time by
eliminating the clearly impossible. That's basically a parsing
problem, and I reach for regexps here, dispatching based on what
matches: the start of a shape (get the index number, make sure there
are enough values in my area-counting Vec), a line in the body of a
shape (count the #s), a tree (work out maximum area, work out area
used, add to the count if it might fit).
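The tree check, as a sketch (I'm assuming the line format from my
description above; names are illustrative):

```rust
use regex::Regex;

// shape_area[i] holds the number of '#' cells counted for shape i;
// a tree line is assumed to look like "WxH: n0 n1 n2 ...", one count
// per shape. Even perfect packing can't beat the raw W*H area.
fn might_fit(line: &str, shape_area: &[u64], tree_re: &Regex) -> bool {
    let caps = tree_re.captures(line).expect("a tree line");
    let w: u64 = caps[1].parse().unwrap();
    let h: u64 = caps[2].parse().unwrap();
    let needed: u64 = caps[3]
        .split_whitespace()
        .map(|n| n.parse::<u64>().unwrap())
        .zip(shape_area)
        .map(|(n, a)| n * a)
        .sum();
    needed <= w * h
}
// built with: Regex::new(r"^(\d+)x(\d+): (.+)$").unwrap()
```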
Well, the total is a bit less than half the number of trees in my
input. Let's submit it before I start writing the actual packing code.
Turns out it's the answer.
If the intended lesson here is that part of being a good programmer is
recognising when a problem is intractable, I'm not going to argue.
Conclusions
Nothing stretched my Rust, though several days stretched my algorithm
design skills. That seems like a good thing.
Crates used:
- itertools (day 9 for an efficient MinMax)
- microlp (day 10)
- num (day 10 for BigInt)
- pathfinding (days 8 and 11)
- permutator (day 9 for "every pair")
- range-ext (days 2 and 5)
- regex (days 1 and 12)
Interesting to see basically no actual pathfinding this year, though
there were several grid-related problems. (Also no cellular automata
or Chinese Remainder Theorem.) I think I will have to say, with some
reluctance, that solving linear equation systems may well now be
considered a standard possibility.
In particular there wasn't anything that needed fiddly parsing this
time.
Puzzles sorted by Part 2 difficulty: this percentage is the proportion
of people who have successfully solved part 1, but not part 2, which
is the only public information about completion rates. (Based on a
snapshot at the time of writing, so values may have changed.) As
usual, I omit the last day, because the qualification for part 2 is
"have done all the other days".
4: 4%
8: 5%
5: 11%
2: 13%
7: 14%
3: 15%
6: 15%
11: 22%
1: 23%
9: 46%
10: 49%
I'd agree that 9 and 10 were the hardest (though if I'd had a linear
equations library in mind from the start 10 would have been trivial).
I suspect 11 suffers from path counting being quite hard and not
everyone having code available to do it. 1 was indeed unusually
fiddly for a first day problem.
6 and 7 were the weekend problems, and clearly I'm not the only person
who found them relatively easy.
Thanks as always to Eric Wastl and the AoC team.