|
|
|
The problem's phase space consists of pairs ("current step (modulo number of directions)", "current node"),
|
|
|
|
with the total of ~200 thousands states for the puzzle input.
|
|
|
|
|
|
|
|
For every state, there is a defined transition to the single next state.
|
|
|
|
|
|
|
|
There are six starting states (all pairs with "current step" being zero and "current node" ending with 'A').
|
|
|
|
And 263*6 ending states (all pairs with any "current step", and "current node" ending with Z).
|
|
|
|
|
|
|
|
Transitions are periodic; since successor of every state is clearly defined, and there are finite number of states,
|
|
|
|
this means that no matter at what state we start, we will eventually find ourselves in a loop with the length lower than 200k.
|
|
|
|
There might be several non-intersecting loops.
|
|
|
|
|
|
|
|
## Solving with brute force
|
|
|
|
|
|
|
|
One way to solve the problem would be to use some complicated math in order to compute the result.
|
|
|
|
Another, to brute force the result naively, by doing what the puzzle describes:
|
|
|
|
running several "ghosts", one from each starting state, and on every step checking if all the current states are "ending".
|
|
|
|
|
|
|
|
In order for brute force to work as fast as possible,
|
|
|
|
we need to reduce the number of conditions, dereferences and computations within the loop.
|
|
|
|
|
|
|
|
There is only so much that we can do regarding the storage
|
|
|
|
(200k states means at least 18 bits per state to store the next state, times 200k that's 450KB,
|
|
|
|
way larger than any L1 cache).
|
|
|
|
|
|
|
|
For simplicity, here I store states in array of 270*1024 u32 (i.e. one megabyte),
|
|
|
|
still just a bit more than a modern L2 cache per core;
|
|
|
|
and the array layout is optimized for access: index is "current step" * 270 + "current node",
|
|
|
|
so on every step we stay more or less in the same region of the array
|
|
|
|
(we traverse 1k entries, or 4KB of memory, on average for every step).
|
|
|
|
|
|
|
|
For simplicity, in order to check that the state is "final", I slightly renumber the list of nodes;
|
|
|
|
nodes that end with Z get the high three bits of their 10-bit index set to 1
|
|
|
|
(since the total number of nodes in the sample input is 770).
|
|
|
|
Unfortunately, the puzzle input contains collisions
|
|
|
|
(there are "final" nodes on lines 320 and 694, with the same last seven bits),
|
|
|
|
so I had to manually reorder the puzzle input;
|
|
|
|
it was easier to move all nodes ending with Z to the end of the file,
|
|
|
|
to make sure that there will be no collisions.
|
|
|
|
This way, the state is final iff it has its eight, ninth and tenth bits set.
|
|
|
|
It's also easy enough to check all six current states at once
|
|
|
|
(just bitwise-and them all, bitwise-and the result with a `0b1110000000` mask, and check that the result matches the mask).
|
|
|
|
|
|
|
|
So ultimately, every step is just six bitwise-ands, one comparison
|
|
|
|
(which is only true once we found the result, meaning that there is no performance penalty for branch misprediction),
|
|
|
|
and six dereferences and assignments.
|
|
|
|
|
|
|
|
The resulting performance (single-threaded) on my laptop is
|
|
|
|
over 100 million steps per second, or 500 billions per hour.
|
|
|
|
|
|
|
|
The correct result for the puzzle input I got is around 15 trillion, I got it in ~30 hours.
|
|
|
|
|
|
|
|
## Solving with math
|
|
|
|
|
|
|
|
Another option, with math, would be to iterate over all possible direction numbers,
|
|
|
|
and for every direction number (out of 270), and for each permutation of final nodes (6^6~=47k) compute:
|
|
|
|
For each one out of the six starting states, how many steps does it take to get to this node? And to get to it again?
|
|
|
|
(Answering that question with brute-forcing would require on the order of 200k operations for every starting state and final state,
|
|
|
|
and another 200k for every final state, so that's about 200k * (270 * 6 + 270 * 6^6) ~= 2 billion operations
|
|
|
|
to precompute all ~10k values,
|
|
|
|
but it can be optimized if we would identify the shape of transitions,
|
|
|
|
and untangle the transition matrix into a set of loops, and of paths leading to these loops).
|
|
|
|
|
|
|
|
The answer to such a question would have a form of a_i+b_i*k, for some a and b, for every integer k>=0.
|
|
|
|
Knowing a and b, for each of the six questions, we could use arithmetic to find A and B such that
|
|
|
|
for every k>=0, A+Bk steps from the starting states produce exactly this configuration.
|
|
|
|
With A being the first time when we reach this configuration.
|
|
|
|
|
|
|
|
And then we would just need to find the smallest A for all ~10 million configurations.
|
|
|
|
|
|
|
|
But I can't be bothered to do this now.
|
|
|
|
|
|
|
|
## Solving as everybody else did
|
|
|
|
|
|
|
|
It seems that most people computed "steps to the first xxZ" for every starting xxA position,
|
|
|
|
and then just computed lowest common multiple of these six numbers.
|
|
|
|
|
|
|
|
Such an answer would only be correct if all of the following assumptions hold:
|
|
|
|
|
|
|
|
1. From any xxA point, only one xxZ point can be reached
|
|
|
|
(counterexample, even with a single direction: `AAA -> XXZ -> YYZ -> XXZ`);
|
|
|
|
2. From that Z point, time until next entry into this point does not depend on the current step number
|
|
|
|
(i.e. we always return to that point in `N` steps for some `N`, and never return to it in less than `N`);
|
|
|
|
3. Number of steps between Z points is the same as number of steps from A point to Z point
|
|
|
|
(counterexamples, even with a single direction: `AAA -> BBB -> CCC -> ZZZ -> CCC -> ZZZ`;
|
|
|
|
`AAA -> BBB -> ZZZ -> CCC -> AAA -> BBB -> ZZZ`).
|
|
|
|
|
|
|
|
Turns out that all puzzle inputs are carefully crafted in such a way that all three assumptions hold,
|
|
|
|
so that a naive "lowest common multiple" answer happens to be correct for these inputs.
|
|
|
|
|
|
|
|
Yet nowhere does the text of the puzzle say that this should be the case.
|
|
|
|
|
|
|
|
And it seems that a lot of people did not even explicitly made these assumptions.
|
|
|
|
They just had an incorrect idea that LCM should be the solution, without understanding why...
|
|
|
|
and this incorrect idea just happened to produce the correct answer for this specific input.
|
|
|
|
|
|
|
|
I don't like making assumptions like these, and I prefer to make more or less universal solutions,
|
|
|
|
those that work on everybody's input (and even if my input had these magic properties,
|
|
|
|
there is no guarantee other inputs do).
|
|
|
|
|
|
|
|
The only assumptions I made in my solution are:
|
|
|
|
|
|
|
|
1. That there is an answer;
|
|
|
|
2. That the puzzle input contains at most 270 directions, at most 896 lines, and at most six entry points;
|
|
|
|
3. That the answer is below ~100 trillions (the most dangerous assumption).
|
|
|
|
|
|
|
|
That said, attempting to solve the task "with math" (as described above)
|
|
|
|
would quickly result in understanding that instead of 6^6 possible "start nodes -> end nodes" combinations,
|
|
|
|
only one is possible, meaning that only 270 configurations (instead of 10 millions) should be checked,
|
|
|
|
and only 3300 values (instead of 20k) have to be precomputed, with ~400k operations for each.
|
|
|
|
|
|
|
|
That would be the assumption 1 listed above ("from any xxA point, only one xxZ point can be reached"),
|
|
|
|
which is also the one easiest to check.
|