Code cleanup

Refactored macros to templates
38 changed files with 1874 additions and 363 deletions
--- a/README.md
+++ b/README.md
@@ -3,6 +3,25 @@ Info

 This is my solution to the challenge: http://followthewhiterabbit.trustpilot.com/

+The task is to find anagrams of the phrase "**poultry outwits ants**" with the required MD5 hashes, using the supplied dictionary:
+
+```
+e4820b45d2277f3844eac66c903e84be # easy
+23170acc097c24edb98fc5488ab033fe # more difficult
+665e5bcb0c20062fe8abaaf4628bb154 # hard
+```
+
+And some more hashes for you to do:
+
+```
+e8a2cbb6206fc937082bb92e4ed9cd3d
+74a613b8c64fb216dc22d4f2bd4965f4
+ccb5ed231ba04d750c963668391d1e61
+d864ae0e66c89cb78345967cb2f3ab6b
+2b56477105d91076030e877c94dd9776
+732442feac8b5013e16a776486ac5447
+```
+
 Usage info
 ==========

@@ -10,15 +29,129 @@ Usage info
 WhiteRabbit.exe < wordlist
 ```

+**Note that this code only works correctly on little-endian x64 systems, due to heavy optimizations of MD5 computation!**
+
 Performance
 ===========

-This solution is not optimized for multi-threading.
+Memory usage is minimal for this kind of task: less than 10MB (25MB for MaxNumberOfWords = 8).
+
+It is also somewhat optimized for likely intended phrases, as anagrams consisting of longer words are generated first.
+That's why the given hashes are solved much sooner than it takes to check all anagrams.
+
+Anagram generation is not parallelized: single-threaded generation is fast enough for 4-word anagrams, and with 5 or more words anagrams are frequent enough that most of the time is spent computing hashes, with full CPU load.
+
+Multi-threaded performance with RyuJIT (.NET 4.6, 64-bit system) on i5-6500 is as follows (excluding initialization time of 0.2 seconds), for different maximum allowed words in an anagram:
+
+Number of words|Time to check all anagrams no longer than that|Time to solve "easy" hash|Time to solve "more difficult" hash|Time to solve "hard" hash|Number of unique anagrams no longer than that
+---------------|----------------------------------------------|-------------------------|-----------------------------------|-------------------------|---------------------------------------------
+3|0.04s||||4,560
+4|0.45s|||0.08s|7,431,984
+5|9.6s|0.15s|0.06s|0.27s|1,347,437,484
+6|4.5 minutes|0.85s|0.17s|2.05s|58,405,904,844
+7|83 minutes|4.7s|0.6s|13.3s|1,070,307,744,114
+8|14 hours|17.6s|1.8s|55s|10,893,594,396,594
+9||45s|4s|2.5 minutes|70,596,864,409,954
+10||80s|5.8s|4.8 minutes|314,972,701,475,754
+
+Note that all measurements were done on a Release build; Debug build is significantly slower.
+
+For comparison, certain other solutions available on GitHub seem to require 3 hours to find all 3-word anagrams. This solution is faster by 6-7 orders of magnitude (it finds and checks all 4-word anagrams in 1/10000th of the time the other solution needs just to find all 3-word anagrams, with no MD5 calculations).
+
+Conditional compilation symbols
+===============================
+
+* Define `DEBUG`, or build in debug mode, to get the total number of anagrams (not optimized).
+
+Implementation notes
+====================
+
+1. We need to limit the number of words in an anagram to some reasonable number, as there are single-letter words in the dictionary, and computing MD5 hashes for all anagrams consisting of single-letter words is computationally infeasible and could not have been intended by the challenge authors.
+In particular, as there is a single-letter word for every letter in the original phrase, there are obvious anagrams consisting exclusively of single-letter words; the number of such anagrams equals the number of all letter permutations of the original phrase, which is too high.
+
+2. Every word or phrase can be thought of as a vector in 26-dimensional space, with every component equal to the number of occurrences of the corresponding letter in that word or phrase.
+That way, the vector corresponding to a phrase equals the sum of the vectors of its words.
+We can reduce the problem of finding anagrams (words which add up to a phrase containing the same letters in the same quantity as the original phrase) to the problem of finding sequences of vectors which add up to the vector corresponding to the original phrase.
+Of course, several words could be represented by a single vector.
+So the first step is: convert words to vectors; find all sequences of vectors which add up to the required sum; convert sequences of vectors back to the sequences of words (with every sequence of vectors potentially generating many sequences of words).
+
+3. Of course, we can ignore all words that contain any letter not contained in the original phrase, or that contain too many copies of some letter.
+Basically, we only need to consider words represented by vectors whose components are all not greater than those of the vector corresponding to the original phrase.
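Notes 2 and 3 can be illustrated with a minimal C++ sketch. The names `LetterVector`, `letter_counts`, `add` and `fits` are hypothetical and are not taken from the repository; the sketch also assumes lowercase ASCII input and simply skips any other character (such as the spaces in the phrase).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// One counter per letter 'a'..'z' (hypothetical type, for illustration only).
using LetterVector = std::array<std::uint8_t, 26>;

// Counts lowercase letters; any other character (e.g. a space) is skipped.
LetterVector letter_counts(const std::string& text) {
    LetterVector counts{};  // zero-initialized
    for (char ch : text) {
        if (ch >= 'a' && ch <= 'z') {
            ++counts[static_cast<std::size_t>(ch - 'a')];
        }
    }
    return counts;
}

// Component-wise sum: the vector of a phrase is the sum of its words' vectors.
LetterVector add(const LetterVector& a, const LetterVector& b) {
    LetterVector sum{};
    for (std::size_t i = 0; i < sum.size(); ++i) {
        sum[i] = static_cast<std::uint8_t>(a[i] + b[i]);
    }
    return sum;
}

// Note 3: a word is usable only if no component exceeds the target's component.
bool fits(const LetterVector& word, const LetterVector& target) {
    for (std::size_t i = 0; i < word.size(); ++i) {
        if (word[i] > target[i]) {
            return false;
        }
    }
    return true;
}
```

With this sketch, the vectors of "poultry", "outwits" and "ants" add up to the vector of the whole phrase, while "b" and "aa" are rejected by `fits`.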
+
+4. Vector arithmetic could be done manually, but all modern processors have SIMD support of some sort, which allows for fast vector operations (addition, comparison, etc.).
+It seems that modern instruction sets allow one to work with 128-bit vectors, and System.Numerics.Vectors lets us tap into this feature by offering vectors with byte components in 16-dimensional space.
+As the original phrase only contains 12 different characters, it's more than enough for us.
+
+5. Any permutation of the words gives us another anagram; any permutation of vectors does not change their sum.
+So we need only consider sequences of vectors that go in the order specified in the original dictionary (that is, their position numbers are in ascending order), and then consider all permutations of the sequences that have the required sum.
+As sequences having the required sum are quite rare, that gives us a speedup by a factor of n!, where n is the allowed number of vectors (see note 1).
+
+6. So far, the generation of sequences of vectors is quite simple.
+We recursively go through the dictionary, starting with the position of the previous word, and checking whether all the vectors so far add up to the target sum, until the maximum allowed number of vectors is reached.
+One obvious optimization is: if some component of the partial sum is larger than the corresponding component of the target, there is no need to process this partial sequence further.
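The search described in notes 5 and 6 could look roughly like the following C++ sketch. It is an illustration under assumptions, not the repository's code: `Vec`, `Sequence`, `subtract`, `search` and `find_sequences` are made-up names, plain `int` vectors stand in for SIMD vectors, and sequences are reported as dictionary indices.

```cpp
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;               // letter counts (hypothetical type)
using Sequence = std::vector<std::size_t>;  // indices into the dictionary

// Computes a - b; returns false if some component would become negative.
bool subtract(const Vec& a, const Vec& b, Vec& out) {
    out = a;
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] -= b[i];
        if (out[i] < 0) {
            return false;
        }
    }
    return true;
}

bool is_zero(const Vec& v) {
    for (int x : v) {
        if (x != 0) {
            return false;
        }
    }
    return true;
}

// Tries dictionary entries from `start` onwards, so indices never decrease
// and each unordered combination is visited once (note 5).
void search(const std::vector<Vec>& dictionary, const Vec& remainder,
            std::size_t start, std::size_t words_left,
            Sequence& current, std::vector<Sequence>& found) {
    if (words_left == 0) {
        return;
    }
    for (std::size_t i = start; i < dictionary.size(); ++i) {
        Vec next;
        if (!subtract(remainder, dictionary[i], next)) {
            continue;  // partial sum overshoots the target: prune (note 6)
        }
        current.push_back(i);
        if (is_zero(next)) {
            found.push_back(current);
        } else {
            search(dictionary, next, i, words_left - 1, current, found);
        }
        current.pop_back();
    }
}

std::vector<Sequence> find_sequences(const std::vector<Vec>& dictionary,
                                     const Vec& target, std::size_t max_words) {
    std::vector<Sequence> found;
    Sequence current;
    search(dictionary, target, 0, max_words, current, found);
    return found;
}
```

For a toy dictionary `{1,0}`, `{0,1}`, `{1,1}` and target `{2,1}`, the sketch finds the index sequences `[0, 0, 1]` and `[0, 2]`; their permutations are left to a later step.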
+
+7. Next question is, should we reorder the original dictionary?
+It is to be expected that, if longer (in a certain sense) words go first, we'll have fewer variants to check, as we'll sooner reach a partial sum that can be discarded (see note 6).
+It turns out that we get a pretty noticeable speedup this way: the total number of processed sequences goes down from 62 million to 29 million in the three-word case, and from 1468 million to 311 million in the four-word case.
+The ordering we use is as follows: every letter is assigned a weight inversely proportional to its number of occurrences in the original phrase.
+This way, every component of the original phrase is weighted equally.
+Then, words are ordered by weight, in descending order.
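The weighting from note 7 can be sketched in C++ as follows. The function names are hypothetical; the scaling constant 720 is the one used in the removed `GetVectorWeight` method shown later in this diff, and the sketch assumes that no target component exceeds 6, so that every division is exact.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;  // letter counts (hypothetical type)

// Scaled weight: each letter contributes 720 / (its count in the target).
// 720 = 6!, so the division is exact while no target component exceeds 6.
int weight(const Vec& v, const Vec& target) {
    int total = 0;
    for (std::size_t i = 0; i < target.size(); ++i) {
        if (target[i] != 0) {
            total += 720 * v[i] / target[i];
        }
    }
    return total;
}

// Heaviest vectors first, so that overshooting partial sums are reached sooner.
void sort_by_weight_descending(std::vector<Vec>& dictionary, const Vec& target) {
    std::stable_sort(dictionary.begin(), dictionary.end(),
                     [&target](const Vec& a, const Vec& b) {
                         return weight(a, target) > weight(b, target);
                     });
}
```

In this scaling the target itself weighs 720 times the number of distinct letters, which is what "every component is weighted equally" amounts to.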
+
+8. Note that such a weight acts like a norm on our non-negative pseudospace.
+What's more, it is a linear function, meaning that the weight of a sum of vectors equals the sum of their weights.
+It means that, if we're now checking a vector whose weight, multiplied by the number of words we're still ready to allow in the sequence, is less than the distance between the current partial sum and the target, there is no point in checking sequences containing this word (or anything smaller) for this partial sequence.
+As we have ordered the words by weight, when we're looping over the dictionary, we can check the weight of the current item, and if it's lower than our threshold, we can just break the loop.
+
+9. Another possible optimization with such an ordering is binary search.
+There is no need to process all the words that are too large to be useful at this moment; we can start with the first word whose weight does not exceed the distance between the current partial sum and the target.
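Notes 8 and 9 together bound the part of the weight-ordered dictionary that is worth scanning. A hedged C++ sketch follows: `candidate_range` is a made-up helper, it works on a plain list of precomputed weights sorted in descending order, it assumes `words_left >= 1`, and it uses a binary search for both ends, whereas note 8 describes breaking out of the loop for the upper end.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Returns the half-open index range [first, last) of dictionary entries worth
// trying, given weights sorted in descending order.
//  - Entries before `first` weigh more than what is left to fill (note 9).
//  - Entries from `last` on are too light: even `words_left` copies of them
//    cannot reach the remaining weight (note 8).
std::pair<std::size_t, std::size_t> candidate_range(
        const std::vector<int>& weights_desc, int remaining_weight, int words_left) {
    auto first = std::partition_point(
        weights_desc.begin(), weights_desc.end(),
        [=](int w) { return w > remaining_weight; });
    auto last = std::partition_point(
        first, weights_desc.end(),
        [=](int w) { return w * words_left >= remaining_weight; });
    return {static_cast<std::size_t>(first - weights_desc.begin()),
            static_cast<std::size_t>(last - weights_desc.begin())};
}
```

For weights 1000, 800, 500, 300, 100, a remaining weight of 900 and two words left, only the entries at indices 1 and 2 remain candidates.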
+
+10. And then, all that remains are implementation optimizations: precomputing weights, optimizing memory usage and loops, using byte arrays instead of strings, etc. Some of the optimizations that hurt code readability:
+    * Words are stored as byte arrays (one byte per character, as we're working with ASCII), with a trailing space (to make concatenating words into an anagram easier);
+    * Anagrams are stored in a way optimized for MD5: as an MD5 message (i.e. with a trailing "128" byte, as an array of 8 uints, with the last uint set to anagram length * 8). For example, "poultry outwits ants" is stored as a fixed 32-byte memory area containing "poultry outwits ants" + 0x80 + (0x00)x7 + (uint)0xA0 (160 bits, for 20 characters).
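The 32-byte layout from note 10 can be reproduced with a small C++ sketch. `pack_message` is a hypothetical name; the sketch assumes a phrase of at most 27 ASCII characters and writes the length field byte by byte in little-endian order, matching the little-endian requirement stated in the usage notes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Packs a phrase into the compact 32-byte form: text, 0x80 terminator,
// zero padding up to byte 27, then the length in bits as a 32-bit value.
std::array<std::uint8_t, 32> pack_message(const std::string& phrase) {
    assert(phrase.size() <= 27);            // text plus the 0x80 byte must fit in 28 bytes
    std::array<std::uint8_t, 32> buffer{};  // zero-filled
    for (std::size_t i = 0; i < phrase.size(); ++i) {
        buffer[i] = static_cast<std::uint8_t>(phrase[i]);
    }
    buffer[phrase.size()] = 0x80;
    const std::uint32_t bits = static_cast<std::uint32_t>(phrase.size() * 8);
    for (int b = 0; b < 4; ++b) {
        buffer[28 + b] = static_cast<std::uint8_t>(bits >> (8 * b));  // little-endian
    }
    return buffer;
}
```

For the 20-character phrase "poultry outwits ants", byte 20 is 0x80, bytes 21 to 27 are zero, and the length field holds 160 (0xA0).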
+
+11. Filtering the original dictionary (e.g. throwing away all single-letter words) does not really improve the performance, thanks to the optimizations mentioned in notes 7-9.
+This solution finds all anagrams, including those with single-letter words.
+
+12. Computing the entire MD5, and then comparing it to the target MD5s, makes little sense. Each MD5 component is a `uint`, which means that the chance of the first components of two different hashes matching is one in 4 billion.
+It's more efficient to compute only the first component (which is 5% faster, since we don't need to perform rounds 62-64 of MD5), and to use only the first component for the lookup (which makes the lookup 4x faster).
+To prevent false positives, we could compute the entire MD5 again if there is a match.
+As that will only happen once in 4 billion hashes, the efficiency of this computation does not matter at all.
+Right now, this additional check is not implemented, which means that about once a minute (if there are 3 target hashes) the program will produce a false positive, which allows one to monitor progress.
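The first-component lookup from note 12 can be sketched in C++ like this. The helper names are hypothetical, and the sketch relies on the fact that an MD5 digest is the four state words written out in little-endian order, so the first 8 hex digits of a target hash encode its first component.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Extracts the first 32-bit MD5 component from a hex digest string.
// The digest stores that word in little-endian byte order.
std::uint32_t first_component(const std::string& hex_digest) {
    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i) {
        const unsigned long byte = std::stoul(hex_digest.substr(2 * i, 2), nullptr, 16);
        value |= static_cast<std::uint32_t>(byte) << (8 * i);
    }
    return value;
}

// Cheap pre-check: only a hit here would justify computing the full MD5.
bool is_candidate(std::uint32_t first, const std::vector<std::uint32_t>& targets) {
    for (std::uint32_t target : targets) {
        if (target == first) {
            return true;
        }
    }
    return false;
}
```

For the "easy" hash `e4820b45d2277f3844eac66c903e84be`, the first component is `0x450b82e4`.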
+
+13. MD5 computation is further optimized by leveraging CPU extensions.
+For example, one could compute MD5 more efficiently by using the `rotl` instruction to rotate numbers (which is currently done with two bitshifts and one `or` / `xor`).
+What's more important, one could compute 4 hashes at once (on a single core) using SSE, 8 hashes at once using AVX2, or 16 hashes at once using AVX512 (AVX lacks enough instructions to make computing hashes feasible).
+.NET/RyuJIT does not support some of the required intrinsics (`rotl` for plain MD5 implementation, `psrld` and `pslld` for SSE, and similar intrinsics for AVX2).
+Although `rotl` support is expected in the next release of RyuJIT (see https://github.com/dotnet/coreclr/pull/1830), no support for bitshift SIMD/AVX2 instructions is currently expected (see https://github.com/dotnet/coreclr/issues/3226).
+However, one can move MD5 computations to unmanaged C++ code, where all the intrinsics are available.
+To make this work efficiently, I had to store anagrams in chunks of 8 anagrams (so that unmanaged code receives a chunk and produces 8 hashes).
+And to make this efficient, I had to make every permutation count a multiple of 8 by filling in some additional permutation copies.
+It slows down processing anagrams of 1, 2, and 3 words (as for every set of words, the number of anagrams is increased to 8 from 1, 2 and 6, respectively); however, these are relatively rare for the given phrase and dictionary.
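Two small pieces of note 13 can be written down directly in C++: the rotation expressed as two shifts and an `or`, and the rounding of permutation counts up to a multiple of 8. Both function names are hypothetical and the code is a sketch, not the repository's implementation.

```cpp
#include <cstdint>

// Rotate left by n bits (0 < n < 32), written as two shifts and an OR.
// Compilers commonly turn this pattern into a single rotate instruction.
std::uint32_t rotate_left(std::uint32_t x, unsigned n) {
    return (x << n) | (x >> (32 - n));
}

// Rounds a permutation count up to the next multiple of 8, i.e. the number
// of slots used once duplicate permutations fill the last chunk.
unsigned padded_count(unsigned permutations) {
    return (permutations + 7) / 8 * 8;
}
```

With this rounding, word sets of 1, 2 and 3 words (1, 2 and 6 permutations) each occupy one chunk of 8, while 4 and 5 words (24 and 120 permutations) need no padding.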
+
+Implementation details
+======================
+
+Given all the above, the implementation is as follows:
+
+1. Words from the dictionary are converted into arrays of bytes with a trailing space.
+
+2. Words that cannot be part of an anagram (e.g. "b" or "aa") and duplicates are removed from the dictionary.
+
+3. Words are converted into vectors, and grouped by vector.
+
+4. Vectors are ordered by their norm, in descending order.
+
+5. All sequences of non-decreasing vector indices adding up to a target vector are found.
+
+6. For every sequence, a sequence of word arrays corresponding to these vectors is generated.
+
+7. For every sequence of word arrays, all sequences of word combinations are generated (e.g. for [[ab, ba], [cd, dc]], we generate [ab, cd], [ab, dc], [ba, cd], [ba, dc]).
+
+8. For every sequence of words, all permutations are generated (in chunks of 8).

-Nevertheless, the performance on Sandy Bridge @2.8GHz is as follows:
+9. For every 8 permuted sequences of words, a `uint[64]` message is generated (8 uints = 28 bytes with a trailing `128` byte, plus a length in bits, for every sequence).

-* If only phrases of at most 3 words are allowed, then it takes 2.5 seconds to find and check all anagrams; all relevant hashes are solved in first 0.4 seconds;
+10. For every `uint[64]` message, 8 `uint`s corresponding to the first components of MD5 hashes for `uint[8]` messages are generated.

-* If phrases of 4 words are allowed as well, then it takes 70 seconds to find and check all anagrams; all hashes are solved in first 5 seconds;
+11. Every resulting `uint` is checked against the targets; if a match is found, both the sequence of words and the full MD5 hash are printed to the output.

-For comparison, certain other solutions available on GitHub seem to require 3 hours to find all 3-word anagrams (i.e. this solution is faster by a factor of 4000 in 3-word case).
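Step 7 of the implementation details above (expanding a sequence of word arrays into all word combinations) is a Cartesian product. A minimal C++ sketch, with the hypothetical name `combinations` and `std::string` standing in for the byte arrays the README describes:

```cpp
#include <string>
#include <vector>

using Words = std::vector<std::string>;

// Expands e.g. [[ab, ba], [cd, dc]] into [ab, cd], [ab, dc], [ba, cd], [ba, dc].
std::vector<Words> combinations(const std::vector<Words>& variants) {
    std::vector<Words> result(1);  // start with a single empty sequence
    for (const Words& options : variants) {
        std::vector<Words> next;
        for (const Words& prefix : result) {
            for (const std::string& word : options) {
                Words extended = prefix;
                extended.push_back(word);
                next.push_back(extended);
            }
        }
        result = next;
    }
    return result;
}
```

Each resulting sequence would then be permuted (step 8) before hashing.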
--- a/TrustPilotChallenge.sln
+++ b/TrustPilotChallenge.sln
@@ -1,22 +0,0 @@
-
-Microsoft Visual Studio Solution File, Format Version 12.00
-# Visual Studio 14
-VisualStudioVersion = 14.0.24720.0
-MinimumVisualStudioVersion = 10.0.40219.1
-Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "WhiteRabbit", "WhiteRabbit\WhiteRabbit.csproj", "{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}"
-EndProject
-Global
-	GlobalSection(SolutionConfigurationPlatforms) = preSolution
-		Debug|Any CPU = Debug|Any CPU
-		Release|Any CPU = Release|Any CPU
-	EndGlobalSection
-	GlobalSection(ProjectConfigurationPlatforms) = postSolution
-		{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
-		{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Debug|Any CPU.Build.0 = Debug|Any CPU
-		{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Release|Any CPU.ActiveCfg = Release|Any CPU
-		{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Release|Any CPU.Build.0 = Release|Any CPU
-	EndGlobalSection
-	GlobalSection(SolutionProperties) = preSolution
-		HideSolutionNode = FALSE
-	EndGlobalSection
-EndGlobal
--- a/WhiteRabbit/App.config
+++ b/WhiteRabbit/App.config
@@ -1,6 +0,0 @@
-<?xml version="1.0" encoding="utf-8" ?>
-<configuration>
-    <startup> 
-        <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.6" />
-    </startup>
-</configuration>
--- a/WhiteRabbit/PrecomputedPermutationsGenerator.cs
+++ b/WhiteRabbit/PrecomputedPermutationsGenerator.cs
@@ -1,37 +0,0 @@
-namespace WhiteRabbit
-{
-    using System.Collections.Generic;
-    using System.Linq;
-
-    internal static class PrecomputedPermutationsGenerator
-    {
-        private static PermutationsGenerator.Permutation[] Permutations1 { get; } = PermutationsGenerator.HamiltonianPermutations(1).ToArray();
-
-        private static PermutationsGenerator.Permutation[] Permutations2 { get; } = PermutationsGenerator.HamiltonianPermutations(2).ToArray();
-
-        private static PermutationsGenerator.Permutation[] Permutations3 { get; } = PermutationsGenerator.HamiltonianPermutations(3).ToArray();
-
-        private static PermutationsGenerator.Permutation[] Permutations4 { get; } = PermutationsGenerator.HamiltonianPermutations(4).ToArray();
-
-        private static PermutationsGenerator.Permutation[] Permutations5 { get; } = PermutationsGenerator.HamiltonianPermutations(5).ToArray();
-
-        public static IEnumerable<PermutationsGenerator.Permutation> HamiltonianPermutations(int n)
-        {
-            switch (n)
-            {
-                case 1:
-                    return Permutations1;
-                case 2:
-                    return Permutations2;
-                case 3:
-                    return Permutations3;
-                case 4:
-                    return Permutations4;
-                case 5:
-                    return Permutations5;
-                default:
-                    return PermutationsGenerator.HamiltonianPermutations(n);
-            }
-        }
-    }
-}
--- a/WhiteRabbit/Program.cs
+++ b/WhiteRabbit/Program.cs
@@ -1,76 +0,0 @@
-namespace WhiteRabbit
-{
-    using System;
-    using System.Collections.Generic;
-    using System.Diagnostics;
-    using System.Linq;
-    using System.Numerics;
-    using System.Security.Cryptography;
-    using System.Text;
-
-    /// <summary>
-    /// Main class
-    /// </summary>
-    public static class Program
-    {
-        /// <summary>
-        /// Main entry point
-        /// </summary>
-        public static void Main()
-        {
-            var stopwatch = new Stopwatch();
-            stopwatch.Start();
-
-            var processor = new StringsProcessor("poultry outwits ants", 4);
-            var expectedHashes = new[]
-            {
-                "e4820b45d2277f3844eac66c903e84be",
-                "23170acc097c24edb98fc5488ab033fe",
-                "665e5bcb0c20062fe8abaaf4628bb154",
-            };
-
-            var expectedHashesAsVectors = new HashSet<Vector<byte>>(expectedHashes.Select(hash => new Vector<byte>(StringToByteArray(hash))));
-
-            foreach (var result in AddHashes(processor.GeneratePhrases(ReadInput())))
-            {
-                if (expectedHashesAsVectors.Contains(result.Item2))
-                {
-                    Console.WriteLine($"Found phrase: {result.Item1} (spent {stopwatch.Elapsed})");
-                }
-            }
-
-            stopwatch.Stop();
-            Console.WriteLine($"Total time spent: {stopwatch.Elapsed}");
-        }
-
-        // Code taken from http://stackoverflow.com/a/321404/831314
-        private static byte[] StringToByteArray(string hex)
-        {
-            return Enumerable.Range(0, hex.Length)
-                             .Where(x => x % 2 == 0)
-                             .Select(x => Convert.ToByte(hex.Substring(x, 2), 16))
-                             .ToArray();
-        }
-
-        private static IEnumerable<Tuple<string, Vector<byte>>> AddHashes(IEnumerable<string> input)
-        {
-            using (MD5 hasher = MD5.Create())
-            {
-                foreach (var line in input)
-                {
-                    var data = hasher.ComputeHash(Encoding.ASCII.GetBytes(line));
-                    yield return Tuple.Create(line, new Vector<byte>(data));
-                }
-            }
-        }
-
-        private static IEnumerable<string> ReadInput()
-        {
-            string line;
-            while ((line = Console.ReadLine()) != null)
-            {
-                yield return line;
-            }
-        }
-    }
-}
--- a/WhiteRabbit/StringsProcessor.cs
+++ b/WhiteRabbit/StringsProcessor.cs
@@ -1,59 +0,0 @@
-namespace WhiteRabbit
-{
-    using System.Collections.Generic;
-    using System.Collections.Immutable;
-    using System.Linq;
-
-    internal class StringsProcessor
-    {
-        public StringsProcessor(string sourceString, int maxWordsCount)
-        {
-            var filteredSource = new string(sourceString.Where(ch => ch != ' ').ToArray());
-            this.VectorsConverter = new VectorsConverter(filteredSource);
-            this.VectorsProcessor = new VectorsProcessor(
-                this.VectorsConverter.GetVector(filteredSource).Value,
-                maxWordsCount,
-                this.VectorsConverter.GetString);
-        }
-
-        private VectorsConverter VectorsConverter { get; }
-
-        private VectorsProcessor VectorsProcessor { get; }
-
-        public IEnumerable<string> GeneratePhrases(IEnumerable<string> words)
-        {
-            // Dictionary of vectors to array of words represented by this vector
-            var formattedWords = words
-                .Distinct()
-                .Select(word => new { word, vector = this.VectorsConverter.GetVector(word) })
-                .Where(tuple => tuple.vector != null)
-                .Select(tuple => new { tuple.word, vector = tuple.vector.Value })
-                .GroupBy(tuple => tuple.vector)
-                .ToDictionary(group => group.Key, group => group.Select(tuple => tuple.word).ToArray());
-
-            // task of finding anagrams could be reduced to the task of finding sequences of dictionary vectors with the target sum
-            var sums = this.VectorsProcessor.GenerateSequences(formattedWords.Keys);
-
-            // converting sequences of vectors to the sequences of words...
-            var anagramsWords = sums
-                .Select(sum => ImmutableStack.Create(sum.Select(vector => formattedWords[vector]).ToArray()))
-                .SelectMany(this.Flatten)
-                .Select(stack => stack.ToArray());
-
-            return anagramsWords.Select(list => string.Join(" ", list));
-        }
-
-        // Converts e.g. pair of variants [[a, b, c], [d, e]] into all possible pairs: [[a, d], [a, e], [b, d], [b, e], [c, d], [c, e]]
-        private IEnumerable<ImmutableStack<T>> Flatten<T>(ImmutableStack<T[]> phrase)
-        {
-            if (phrase.IsEmpty)
-            {
-                return new[] { ImmutableStack.Create<T>() };
-            }
-
-            T[] wordVariants;
-            var newStack = phrase.Pop(out wordVariants);
-            return this.Flatten(newStack).SelectMany(remainder => wordVariants.Select(word => remainder.Push(word)));
-        }
-    }
-}
--- a/WhiteRabbit/VectorsProcessor.cs
+++ b/WhiteRabbit/VectorsProcessor.cs
@@ -1,139 +0,0 @@
-namespace WhiteRabbit
-{
-    using System;
-    using System.Collections.Generic;
-    using System.Collections.Immutable;
-    using System.Diagnostics;
-    using System.Linq;
-    using System.Numerics;
-
-    internal class VectorsProcessor
-    {
-        public VectorsProcessor(Vector<byte> target, int maxVectorsCount, Func<Vector<byte>, string> vectorToString)
-        {
-            this.Target = target;
-            this.MaxVectorsCount = maxVectorsCount;
-            this.VectorToString = vectorToString;
-        }
-
-        /// <summary>
-        /// Negative sign bit.
-        /// (byte)b &amp; (byte)128 equals zero for non-negative (0..127) bytes and equals (byte)128 for negative (128..255) bytes.
-        /// Similarly, vector &amp; Negative equals zero if all bytes are non-negative, and does not equal zero if some bytes are negative.
-        /// Use <code>(vector &amp; Negative) == Vector&lt;byte&gt;.Zero</code> to determine if all components are non-negative.
-        /// </summary>
-        private static Vector<byte> Negative { get; } = new Vector<byte>(Enumerable.Repeat((byte)128, 16).ToArray());
-
-        private Vector<byte> Target { get; }
-
-        private int MaxVectorsCount { get; }
-
-        private Func<Vector<byte>, string> VectorToString { get; }
-
-        private long Iterations { get; set; } = 0;
-
-        // Produces all sequences of vectors with the target sum
-        public IEnumerable<Vector<byte>[]> GenerateSequences(IEnumerable<Vector<byte>> vectors)
-        {
-            var filteredVectors = this.FilterVectors(vectors);
-            var dictionary = ImmutableStack.Create(filteredVectors.ToArray());
-            var unorderedSequences = this.GenerateUnorderedSequences(this.Target, ImmutableStack.Create<Vector<byte>>(), dictionary);
-            var allSequences = unorderedSequences.SelectMany(this.GeneratePermutations);
-
-            return allSequences;
-        }
-
-        // We want words with more letters (and among these, words with more "rare" letters) to appear first, to reduce the searching time somewhat.
-        // Applying such a sort, we reduce the total number of triplets to check for anagrams from ~62M to ~29M.
-        // Total number of quadruplets is reduced from 1468M to mere 311M.
-        // Also, it produces the intended results faster (as these are more likely to contain longer words - e.g. "poultry outwits ants" is more likely than "p o u l t r y o u t w i t s a n t s").
-        // This method basically gives us the 1-norm of the vector in the space rescaled so that the target is [1, 1, ..., 1].
-        private int GetVectorWeight(Vector<byte> vector)
-        {
-            var weight = 0;
-            for (var i = 0; this.Target[i] != 0; i++)
-            {
-                weight += (720 * vector[i]) / this.Target[i]; // 720 = 6!, so that the result will be a whole number (unless Target[i] > 6)
-            }
-
-            return weight;
-        }
-
-        private IEnumerable<Vector<byte>> FilterVectors(IEnumerable<Vector<byte>> vectors)
-        {
-            return vectors
-                .Where(vector => ((this.Target - vector) & Negative) == Vector<byte>.Zero)
-                .OrderBy(GetVectorWeight);
-        }
-
-        [Conditional("DEBUG")]
-        private void DebugState(ImmutableStack<Vector<byte>> partialSumStack, Vector<byte> currentVector)
-        {
-            this.Iterations++;
-            if (this.Iterations % 1000000 == 0)
-            {
-                Console.WriteLine($"Iteration #{this.Iterations}: {string.Join(" ", partialSumStack.Push(currentVector).Reverse().Select(vector => this.VectorToString(vector)))}");
-            }
-        }
-
-        // This method takes most of the time, so everything related to it must be optimized.
-        // In every sequence, next vector always goes after the previous one from dictionary.
-        // E.g. if dictionary is [x, y, z], then only [x, y] sequence could be generated, and [y, x] will never be generated.
-        // That way, the complexity of search goes down by a factor of MaxVectorsCount! (as if [x, y] does not add up to a required target, there is no point in checking [y, x])
-        private IEnumerable<Vector<byte>[]> GenerateUnorderedSequences(Vector<byte> remainder, ImmutableStack<Vector<byte>> partialSumStack, ImmutableStack<Vector<byte>> dictionaryStack)
-        {
-            var count = partialSumStack.Count() + 1;
-            if (count < this.MaxVectorsCount)
-            {
-                var dictionaryTail = dictionaryStack;
-                while (!dictionaryTail.IsEmpty)
-                {
-                    Vector<byte> currentVector;
-                    var nextDictionaryTail = dictionaryTail.Pop(out currentVector);
-
-                    this.DebugState(partialSumStack, currentVector);
-
-                    var newRemainder = remainder - currentVector;
-                    if (newRemainder == Vector<byte>.Zero)
-                    {
-                        yield return partialSumStack.Push(currentVector).Reverse().ToArray();
-                    }
-                    else if ((newRemainder & Negative) == Vector<byte>.Zero)
-                    {
-                        foreach (var result in this.GenerateUnorderedSequences(newRemainder, partialSumStack.Push(currentVector), dictionaryTail))
-                        {
-                            yield return result;
-                        }
-                    }
-
-                    dictionaryTail = nextDictionaryTail;
-                }
-            }
-            else if (count == this.MaxVectorsCount)
-            {
-                var dictionaryTail = dictionaryStack;
-                while (!dictionaryTail.IsEmpty)
-                {
-                    Vector<byte> currentVector;
-                    dictionaryTail = dictionaryTail.Pop(out currentVector);
-
-                    this.DebugState(partialSumStack, currentVector);
-
-                    var newRemainder = remainder - currentVector;
-                    if (newRemainder == Vector<byte>.Zero)
-                    {
-                        yield return partialSumStack.Push(currentVector).Reverse().ToArray();
-                    }
-                }
-            }
-        }
-
-        private IEnumerable<T[]> GeneratePermutations<T>(T[] original)
-        {
-            foreach (var permutation in PrecomputedPermutationsGenerator.HamiltonianPermutations(original.Length))
-            {
-                yield return permutation.Select(i => original[i]).ToArray();
-            }
-        }
-    }
-}
--- a/dotnet/.gitattributes
+++ b/dotnet/.gitattributes
--- a/dotnet/.gitignore
+++ b/dotnet/.gitignore
--- a/dotnet/TrustPilotChallenge.sln
+++ b/dotnet/TrustPilotChallenge.sln
@@ -0,0 +1,47 @@
+
+Microsoft Visual Studio Solution File, Format Version 12.00
+# Visual Studio 15
+VisualStudioVersion = 15.0.26403.3
+MinimumVisualStudioVersion = 10.0.40219.1
+Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "WhiteRabbit", "WhiteRabbit\WhiteRabbit.csproj", "{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}"
+EndProject
+Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "WhiteRabbit.UnmanagedBridge", "WhiteRabbit.UnmanagedBridge\WhiteRabbit.UnmanagedBridge.vcxproj", "{039F03A0-7E8F-415D-8180-969D24479B44}"
+EndProject
+Global
+	GlobalSection(SolutionConfigurationPlatforms) = preSolution
+		Debug|Any CPU = Debug|Any CPU
+		Debug|x64 = Debug|x64
+		Debug|x86 = Debug|x86
+		Release|Any CPU = Release|Any CPU
+		Release|x64 = Release|x64
+		Release|x86 = Release|x86
+	EndGlobalSection
+	GlobalSection(ProjectConfigurationPlatforms) = postSolution
+		{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
+		{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Debug|Any CPU.Build.0 = Debug|Any CPU
+		{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Debug|x64.ActiveCfg = Debug|Any CPU
+		{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Debug|x64.Build.0 = Debug|Any CPU
+		{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Debug|x86.ActiveCfg = Debug|Any CPU
+		{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Debug|x86.Build.0 = Debug|Any CPU
+		{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Release|Any CPU.ActiveCfg = Release|Any CPU
+		{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Release|Any CPU.Build.0 = Release|Any CPU
+		{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Release|x64.ActiveCfg = Release|Any CPU
+		{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Release|x64.Build.0 = Release|Any CPU
+		{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Release|x86.ActiveCfg = Release|Any CPU
+		{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Release|x86.Build.0 = Release|Any CPU
+		{039F03A0-7E8F-415D-8180-969D24479B44}.Debug|Any CPU.ActiveCfg = Debug|Win32
+		{039F03A0-7E8F-415D-8180-969D24479B44}.Debug|x64.ActiveCfg = Debug|x64
+		{039F03A0-7E8F-415D-8180-969D24479B44}.Debug|x64.Build.0 = Debug|x64
+		{039F03A0-7E8F-415D-8180-969D24479B44}.Debug|x86.ActiveCfg = Debug|Win32
+		{039F03A0-7E8F-415D-8180-969D24479B44}.Debug|x86.Build.0 = Debug|Win32
+		{039F03A0-7E8F-415D-8180-969D24479B44}.Release|Any CPU.ActiveCfg = Release|x64
+		{039F03A0-7E8F-415D-8180-969D24479B44}.Release|Any CPU.Build.0 = Release|x64
+		{039F03A0-7E8F-415D-8180-969D24479B44}.Release|x64.ActiveCfg = Release|x64
+		{039F03A0-7E8F-415D-8180-969D24479B44}.Release|x64.Build.0 = Release|x64
+		{039F03A0-7E8F-415D-8180-969D24479B44}.Release|x86.ActiveCfg = Release|Win32
+		{039F03A0-7E8F-415D-8180-969D24479B44}.Release|x86.Build.0 = Release|Win32
+	EndGlobalSection
+	GlobalSection(SolutionProperties) = preSolution
+		HideSolutionNode = FALSE
+	EndGlobalSection
+EndGlobal
--- a/dotnet/WhiteRabbit.UnmanagedBridge/AssemblyInfo.cpp
+++ b/dotnet/WhiteRabbit.UnmanagedBridge/AssemblyInfo.cpp
@@ -0,0 +1,38 @@
+#include "stdafx.h"
+
+using namespace System;
+using namespace System::Reflection;
+using namespace System::Runtime::CompilerServices;
+using namespace System::Runtime::InteropServices;
+using namespace System::Security::Permissions;
+
+//
+// General Information about an assembly is controlled through the following
+// set of attributes. Change these attribute values to modify the information
+// associated with an assembly.
+//
+[assembly:AssemblyTitleAttribute(L"WhiteRabbitUnmanagedBridge")];
+[assembly:AssemblyDescriptionAttribute(L"")];
+[assembly:AssemblyConfigurationAttribute(L"")];
+[assembly:AssemblyCompanyAttribute(L"")];
+[assembly:AssemblyProductAttribute(L"WhiteRabbitUnmanagedBridge")];
+[assembly:AssemblyCopyrightAttribute(L"Copyright (c)  2017")];
+[assembly:AssemblyTrademarkAttribute(L"")];
+[assembly:AssemblyCultureAttribute(L"")];
+
+//
+// Version information for an assembly consists of the following four values:
+//
+//      Major Version
+//      Minor Version
+//      Build Number
+//      Revision
+//
+// You can specify all the values or you can default the Revision and Build Numbers
+// by using the '*' as shown below:
+
+[assembly:AssemblyVersionAttribute("1.0.*")];
+
+[assembly:ComVisible(false)];
+
+[assembly:CLSCompliantAttribute(true)];
--- a/dotnet/WhiteRabbit.UnmanagedBridge/Stdafx.cpp
+++ b/dotnet/WhiteRabbit.UnmanagedBridge/Stdafx.cpp
@ -0,0 +1,5 @@
+// stdafx.cpp : source file that includes just the standard includes
+// WhiteRabbit.UnmanagedBridge.pch will be the pre-compiled header
+// stdafx.obj will contain the pre-compiled type information
+
+#include "stdafx.h"
--- a/dotnet/WhiteRabbit.UnmanagedBridge/Stdafx.h
+++ b/dotnet/WhiteRabbit.UnmanagedBridge/Stdafx.h
@ -0,0 +1,7 @@
+// stdafx.h : include file for standard system include files,
+// or project specific include files that are used frequently,
+// but are changed infrequently
+
+#pragma once
+
+
--- a/dotnet/WhiteRabbit.UnmanagedBridge/WhiteRabbit.UnmanagedBridge.cpp
+++ b/dotnet/WhiteRabbit.UnmanagedBridge/WhiteRabbit.UnmanagedBridge.cpp
@ -0,0 +1,32 @@
+// This is the main DLL file.
+
+#include "stdafx.h"
+
+#include "WhiteRabbit.UnmanagedBridge.h"
+#include "md5.h"
+#include "phraseset.h"
+
+void WhiteRabbitUnmanagedBridge::MD5Unmanaged::ComputeMD5(unsigned __int32 * input, unsigned __int32 * expected)
+{
+#if AVX2
+    md5(input + 0 * 8 * 8, expected);
+#elif SIMD
+    md5(input + 0 * 8 * 4);
+    md5(input + 1 * 8 * 4);
+    if (input[2 * 8 * 4] != 0)
+    {
+        md5(input + 2 * 8 * 4);
+        md5(input + 3 * 8 * 4);
+    }
+#else
+    for (int i = 0; i < 16; i++)
+    {
+        md5(input + i * 8);
+    }
+#endif
+}
+
+void WhiteRabbitUnmanagedBridge::MD5Unmanaged::FillPhraseSet(unsigned __int64* initialBufferPointer, unsigned __int64* bufferPointer, unsigned __int64* allWordsPointer, __int32* wordIndexes, unsigned __int64* permutationsPointer, int numberOfWords)
+{
+    fillPhraseSet(initialBufferPointer, bufferPointer, allWordsPointer, wordIndexes, permutationsPointer, numberOfWords);
+}
--- a/dotnet/WhiteRabbit.UnmanagedBridge/WhiteRabbit.UnmanagedBridge.h
+++ b/dotnet/WhiteRabbit.UnmanagedBridge/WhiteRabbit.UnmanagedBridge.h
@ -0,0 +1,18 @@
+// WhiteRabbit.UnmanagedBridge.h
+
+#pragma once
+
+#include "constants.h"
+
+using namespace System;
+
+namespace WhiteRabbitUnmanagedBridge {
+
+	public ref class MD5Unmanaged
+	{
+        public:
+            literal int PhrasesPerSet = PHRASES_PER_SET;
+            static void ComputeMD5(unsigned int* input, unsigned __int32 * expected);
+            static void FillPhraseSet(unsigned __int64* initialBufferPointer, unsigned __int64* bufferPointer, unsigned __int64* allWordsPointer, __int32* wordIndexes, unsigned __int64* permutationsPointer, int numberOfWords);
+	};
+}
--- a/dotnet/WhiteRabbit.UnmanagedBridge/WhiteRabbit.UnmanagedBridge.vcxproj
+++ b/dotnet/WhiteRabbit.UnmanagedBridge/WhiteRabbit.UnmanagedBridge.vcxproj
@ -0,0 +1,158 @@
+<?xml version="1.0" encoding="utf-8"?>
+<Project DefaultTargets="Build" ToolsVersion="14.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
+  <ItemGroup Label="ProjectConfigurations">
+    <ProjectConfiguration Include="Debug|Win32">
+      <Configuration>Debug</Configuration>
+      <Platform>Win32</Platform>
+    </ProjectConfiguration>
+    <ProjectConfiguration Include="Release|Win32">
+      <Configuration>Release</Configuration>
+      <Platform>Win32</Platform>
+    </ProjectConfiguration>
+    <ProjectConfiguration Include="Debug|x64">
+      <Configuration>Debug</Configuration>
+      <Platform>x64</Platform>
+    </ProjectConfiguration>
+    <ProjectConfiguration Include="Release|x64">
+      <Configuration>Release</Configuration>
+      <Platform>x64</Platform>
+    </ProjectConfiguration>
+  </ItemGroup>
+  <PropertyGroup Label="Globals">
+    <ProjectGuid>{039F03A0-7E8F-415D-8180-969D24479B44}</ProjectGuid>
+    <TargetFrameworkVersion>v4.7</TargetFrameworkVersion>
+    <Keyword>ManagedCProj</Keyword>
+    <RootNamespace>WhiteRabbitUnmanagedBridge</RootNamespace>
+    <WindowsTargetPlatformVersion>10.0.10586.0</WindowsTargetPlatformVersion>
+  </PropertyGroup>
+  <Import Project="$(VCTargetsPath)\Microsoft.Cpp.Default.props" />
+  <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'" Label="Configuration">
+    <ConfigurationType>DynamicLibrary</ConfigurationType>
+    <UseDebugLibraries>true</UseDebugLibraries>
+    <PlatformToolset>v141</PlatformToolset>
+    <CLRSupport>true</CLRSupport>
+    <CharacterSet>Unicode</CharacterSet>
+  </PropertyGroup>
+  <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'" Label="Configuration">
+    <ConfigurationType>DynamicLibrary</ConfigurationType>
+    <UseDebugLibraries>false</UseDebugLibraries>
+    <PlatformToolset>v141</PlatformToolset>
+    <CLRSupport>true</CLRSupport>
+    <CharacterSet>Unicode</CharacterSet>
+  </PropertyGroup>
+  <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'" Label="Configuration">
+    <ConfigurationType>DynamicLibrary</ConfigurationType>
+    <UseDebugLibraries>true</UseDebugLibraries>
+    <PlatformToolset>v141</PlatformToolset>
+    <CLRSupport>true</CLRSupport>
+    <CharacterSet>Unicode</CharacterSet>
+  </PropertyGroup>
+  <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'" Label="Configuration">
+    <ConfigurationType>DynamicLibrary</ConfigurationType>
+    <UseDebugLibraries>false</UseDebugLibraries>
+    <PlatformToolset>v141</PlatformToolset>
+    <CLRSupport>true</CLRSupport>
+    <CharacterSet>Unicode</CharacterSet>
+  </PropertyGroup>
+  <Import Project="$(VCTargetsPath)\Microsoft.Cpp.props" />
+  <ImportGroup Label="ExtensionSettings">
+  </ImportGroup>
+  <ImportGroup Label="Shared">
+  </ImportGroup>
+  <ImportGroup Label="PropertySheets" Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
+    <Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" />
+  </ImportGroup>
+  <ImportGroup Label="PropertySheets" Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
+    <Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" />
+  </ImportGroup>
+  <ImportGroup Label="PropertySheets" Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
+    <Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" />
+  </ImportGroup>
+  <ImportGroup Label="PropertySheets" Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
+    <Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" />
+  </ImportGroup>
+  <PropertyGroup Label="UserMacros" />
+  <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
+    <LinkIncremental>true</LinkIncremental>
+  </PropertyGroup>
+  <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
+    <LinkIncremental>true</LinkIncremental>
+  </PropertyGroup>
+  <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
+    <LinkIncremental>false</LinkIncremental>
+  </PropertyGroup>
+  <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
+    <LinkIncremental>false</LinkIncremental>
+  </PropertyGroup>
+  <ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
+    <ClCompile>
+      <WarningLevel>Level3</WarningLevel>
+      <Optimization>Disabled</Optimization>
+      <PreprocessorDefinitions>WIN32;_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>
+      <PrecompiledHeader>Use</PrecompiledHeader>
+    </ClCompile>
+    <Link>
+      <AdditionalDependencies />
+    </Link>
+  </ItemDefinitionGroup>
+  <ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
+    <ClCompile>
+      <WarningLevel>Level3</WarningLevel>
+      <Optimization>Disabled</Optimization>
+      <PreprocessorDefinitions>_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>
+      <PrecompiledHeader>Use</PrecompiledHeader>
+    </ClCompile>
+    <Link>
+      <AdditionalDependencies />
+    </Link>
+  </ItemDefinitionGroup>
+  <ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
+    <ClCompile>
+      <WarningLevel>Level3</WarningLevel>
+      <PreprocessorDefinitions>WIN32;NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>
+      <PrecompiledHeader>Use</PrecompiledHeader>
+    </ClCompile>
+    <Link>
+      <AdditionalDependencies />
+    </Link>
+  </ItemDefinitionGroup>
+  <ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
+    <ClCompile>
+      <WarningLevel>Level3</WarningLevel>
+      <PreprocessorDefinitions>SIMD=true;NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>
+      <PrecompiledHeader>Use</PrecompiledHeader>
+      <Optimization>Full</Optimization>
+      <InlineFunctionExpansion>AnySuitable</InlineFunctionExpansion>
+      <IntrinsicFunctions>true</IntrinsicFunctions>
+      <FavorSizeOrSpeed>Speed</FavorSizeOrSpeed>
+      <AssemblerOutput>AssemblyAndSourceCode</AssemblerOutput>
+      <EnableEnhancedInstructionSet>StreamingSIMDExtensions2</EnableEnhancedInstructionSet>
+    </ClCompile>
+    <Link>
+      <AdditionalDependencies />
+    </Link>
+  </ItemDefinitionGroup>
+  <ItemGroup>
+    <ClInclude Include="constants.h" />
+    <ClInclude Include="md5.h" />
+    <ClInclude Include="phraseset.h" />
+    <ClInclude Include="resource.h" />
+    <ClInclude Include="Stdafx.h" />
+    <ClInclude Include="WhiteRabbit.UnmanagedBridge.h" />
+  </ItemGroup>
+  <ItemGroup>
+    <ClCompile Include="AssemblyInfo.cpp" />
+    <ClCompile Include="md5.cpp" />
+    <ClCompile Include="phraseset.cpp" />
+    <ClCompile Include="Stdafx.cpp">
+      <PrecompiledHeader Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">Create</PrecompiledHeader>
+      <PrecompiledHeader Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">Create</PrecompiledHeader>
+      <PrecompiledHeader Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">Create</PrecompiledHeader>
+      <PrecompiledHeader Condition="'$(Configuration)|$(Platform)'=='Release|x64'">Create</PrecompiledHeader>
+    </ClCompile>
+    <ClCompile Include="WhiteRabbit.UnmanagedBridge.cpp" />
+  </ItemGroup>
+  <Import Project="$(VCTargetsPath)\Microsoft.Cpp.targets" />
+  <ImportGroup Label="ExtensionTargets">
+  </ImportGroup>
+</Project>
--- a/dotnet/WhiteRabbit.UnmanagedBridge/WhiteRabbit.UnmanagedBridge.vcxproj.filters
+++ b/dotnet/WhiteRabbit.UnmanagedBridge/WhiteRabbit.UnmanagedBridge.vcxproj.filters
@ -0,0 +1,50 @@
+<?xml version="1.0" encoding="utf-8"?>
+<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
+  <ItemGroup>
+    <Filter Include="Source Files">
+      <UniqueIdentifier>{4FC737F1-C7A5-4376-A066-2A32D752A2FF}</UniqueIdentifier>
+      <Extensions>cpp;c;cc;cxx;def;odl;idl;hpj;bat;asm;asmx</Extensions>
+    </Filter>
+    <Filter Include="Header Files">
+      <UniqueIdentifier>{93995380-89BD-4b04-88EB-625FBE52EBFB}</UniqueIdentifier>
+      <Extensions>h;hh;hpp;hxx;hm;inl;inc;xsd</Extensions>
+    </Filter>
+  </ItemGroup>
+  <ItemGroup>
+    <ClInclude Include="Stdafx.h">
+      <Filter>Header Files</Filter>
+    </ClInclude>
+    <ClInclude Include="resource.h">
+      <Filter>Header Files</Filter>
+    </ClInclude>
+    <ClInclude Include="WhiteRabbit.UnmanagedBridge.h">
+      <Filter>Header Files</Filter>
+    </ClInclude>
+    <ClInclude Include="md5.h">
+      <Filter>Header Files</Filter>
+    </ClInclude>
+    <ClInclude Include="constants.h">
+      <Filter>Header Files</Filter>
+    </ClInclude>
+    <ClInclude Include="phraseset.h">
+      <Filter>Header Files</Filter>
+    </ClInclude>
+  </ItemGroup>
+  <ItemGroup>
+    <ClCompile Include="AssemblyInfo.cpp">
+      <Filter>Source Files</Filter>
+    </ClCompile>
+    <ClCompile Include="Stdafx.cpp">
+      <Filter>Source Files</Filter>
+    </ClCompile>
+    <ClCompile Include="WhiteRabbit.UnmanagedBridge.cpp">
+      <Filter>Source Files</Filter>
+    </ClCompile>
+    <ClCompile Include="md5.cpp">
+      <Filter>Source Files</Filter>
+    </ClCompile>
+    <ClCompile Include="phraseset.cpp">
+      <Filter>Source Files</Filter>
+    </ClCompile>
+  </ItemGroup>
+</Project>
--- a/dotnet/WhiteRabbit.UnmanagedBridge/constants.h
+++ b/dotnet/WhiteRabbit.UnmanagedBridge/constants.h
@ -0,0 +1,3 @@
+#pragma once
+
+#define PHRASES_PER_SET 16
--- a/dotnet/WhiteRabbit.UnmanagedBridge/md5.cpp
+++ b/dotnet/WhiteRabbit.UnmanagedBridge/md5.cpp
@ -0,0 +1,236 @@
+#include "stdafx.h"
+#include "md5.h"
+
+#include "intrin.h"
+
+#pragma unmanaged
+
+struct MD5Vector
+{
+    __m256i m_V0;
+    __m256i m_V1;
+    __forceinline MD5Vector() {}
+    __forceinline MD5Vector(__m256i C0, __m256i C1) :m_V0(C0), m_V1(C1) {}
+
+    __forceinline MD5Vector MXor(MD5Vector R) const
+    {
+        return MD5Vector(_mm256_xor_si256(m_V0, R.m_V0), _mm256_xor_si256(m_V1, R.m_V1));
+    }
+
+    __forceinline MD5Vector MAnd(MD5Vector R) const
+    {
+        return MD5Vector(_mm256_and_si256(m_V0, R.m_V0), _mm256_and_si256(m_V1, R.m_V1));
+    }
+
+    __forceinline MD5Vector MAndNot(MD5Vector R) const
+    {
+        return MD5Vector(_mm256_andnot_si256(m_V0, R.m_V0), _mm256_andnot_si256(m_V1, R.m_V1));
+    }
+
+    __forceinline const MD5Vector MOr(const MD5Vector R) const
+    {
+        return MD5Vector(_mm256_or_si256(m_V0, R.m_V0), _mm256_or_si256(m_V1, R.m_V1));
+    }
+
+    __forceinline const MD5Vector MAdd(const MD5Vector R) const
+    {
+        return MD5Vector(_mm256_add_epi32(m_V0, R.m_V0), _mm256_add_epi32(m_V1, R.m_V1));
+    }
+
+    __forceinline const MD5Vector MShiftLeft(const int shift) const
+    {
+        return MD5Vector(_mm256_slli_epi32(m_V0, shift), _mm256_slli_epi32(m_V1, shift));
+    }
+
+    __forceinline const MD5Vector MShiftRight(const int shift) const
+    {
+        return MD5Vector(_mm256_srli_epi32(m_V0, shift), _mm256_srli_epi32(m_V1, shift));
+    }
+
+    template<int imm8>
+    __forceinline const MD5Vector Permute() const
+    {
+        return MD5Vector(_mm256_permute4x64_epi64(m_V0, imm8), _mm256_permute4x64_epi64(m_V1, imm8));
+    }
+
+    __forceinline const MD5Vector CompareEquality32(const __m256i other) const
+    {
+        return MD5Vector(_mm256_cmpeq_epi32(m_V0, other), _mm256_cmpeq_epi32(m_V1, other));
+    }
+
+    __forceinline void WriteMoveMask8(__int32 * output) const
+    {
+        output[0] = _mm256_movemask_epi8(m_V0);
+        output[1] = _mm256_movemask_epi8(m_V1);
+    }
+};
+
+__forceinline const MD5Vector OP_XOR(const MD5Vector a, const MD5Vector b) { return a.MXor(b); }
+__forceinline const MD5Vector OP_AND(const MD5Vector a, const MD5Vector b) { return a.MAnd(b); }
+__forceinline const MD5Vector OP_ANDNOT(const MD5Vector a, const MD5Vector b) { return a.MAndNot(b); }
+__forceinline const MD5Vector OP_OR(const MD5Vector a, const MD5Vector b) { return a.MOr(b); }
+__forceinline const MD5Vector OP_ADD(const MD5Vector a, const MD5Vector b) { return a.MAdd(b); }
+template<int r>
+__forceinline const MD5Vector OP_ROT(const MD5Vector a) { return OP_OR(a.MShiftLeft(r), a.MShiftRight(32 - (r))); }
+__forceinline const MD5Vector OP_BLEND(const MD5Vector a, const MD5Vector b, const MD5Vector x) { return OP_OR(OP_AND(x, b), OP_ANDNOT(x, a)); }
+
+__forceinline const MD5Vector CREATE_VECTOR(const int a) { return MD5Vector(_mm256_set1_epi32(a), _mm256_set1_epi32(a)); }
+__forceinline const MD5Vector CREATE_VECTOR_FROM_INPUT(const unsigned __int32* input, const size_t offset)
+{
+    return MD5Vector(
+        _mm256_i32gather_epi32((int*)(input + offset), _mm256_set_epi32(7 * 8, 6 * 8, 5 * 8, 4 * 8, 3 * 8, 2 * 8, 1 * 8, 0 * 8), 4),
+        _mm256_i32gather_epi32((int*)(input + offset), _mm256_set_epi32(15 * 8, 14 * 8, 13 * 8, 12 * 8, 11 * 8, 10 * 8, 9 * 8, 8 * 8), 4));
+}
+
+#define WRITE_TO_OUTPUT(a, output, expected) \
+    a.Permute<0 * 0x55>().CompareEquality32(*expected).WriteMoveMask8(output); \
+    a.Permute<1 * 0x55>().CompareEquality32(*expected).WriteMoveMask8(output + 2); \
+    a.Permute<2 * 0x55>().CompareEquality32(*expected).WriteMoveMask8(output + 4); \
+    a.Permute<3 * 0x55>().CompareEquality32(*expected).WriteMoveMask8(output + 6); \
+    output[8] = _mm256_movemask_epi8(_mm256_cmpeq_epi8(*((__m256i*)output), _mm256_setzero_si256()));
+
+__forceinline void WriteToOutput(const MD5Vector a, __int32 * output, __m256i * expected)
+{
+    a.Permute<0 * 0x55>().CompareEquality32(*expected).WriteMoveMask8(output);
+    a.Permute<1 * 0x55>().CompareEquality32(*expected).WriteMoveMask8(output + 2);
+    a.Permute<2 * 0x55>().CompareEquality32(*expected).WriteMoveMask8(output + 4);
+    a.Permute<3 * 0x55>().CompareEquality32(*expected).WriteMoveMask8(output + 6);
+    output[8] = _mm256_movemask_epi8(_mm256_cmpeq_epi8(*((__m256i*)output), _mm256_setzero_si256()));
+}
+
+const MD5Vector Ones = CREATE_VECTOR(0xffffffff);
+__forceinline const MD5Vector OP_NEG(const MD5Vector a) { return OP_ANDNOT(a, Ones); }
+
+__forceinline const MD5Vector Blend(const MD5Vector a, const MD5Vector b, const MD5Vector x) { return OP_BLEND(a, b, x); }
+__forceinline const MD5Vector Xor(const MD5Vector a, const MD5Vector b, const MD5Vector c) { return OP_XOR(a, OP_XOR(b, c)); }
+__forceinline const MD5Vector I(const MD5Vector a, const MD5Vector b, const MD5Vector c) { return OP_XOR(a, OP_OR(b, OP_NEG(c))); }
+
+template<int r>
+__forceinline const MD5Vector StepOuter(const MD5Vector a, const MD5Vector b, const MD5Vector x) { return OP_ADD(b, OP_ROT<r>(x)); }
+
+template<int r, unsigned __int32 k>
+__forceinline const MD5Vector Step1(const MD5Vector a, const MD5Vector b, const MD5Vector c, const MD5Vector d, const MD5Vector w) {
+    return StepOuter<r>(a, b, OP_ADD(Blend(d, c, b), OP_ADD(CREATE_VECTOR(k), OP_ADD(a, w))));
+}
+
+template<int r, unsigned __int32 k>
+__forceinline const MD5Vector Step1(const MD5Vector a, const MD5Vector b, const MD5Vector c, const MD5Vector d) {
+    return StepOuter<r>(a, b, OP_ADD(Blend(d, c, b), OP_ADD(CREATE_VECTOR(k), a)));
+}
+
+template<int r, unsigned __int32 k>
+__forceinline const MD5Vector Step2(const MD5Vector a, const MD5Vector b, const MD5Vector c, const MD5Vector d, const MD5Vector w) {
+    return StepOuter<r>(a, c, OP_ADD(Blend(d, c, b), OP_ADD(CREATE_VECTOR(k), OP_ADD(a, w))));
+}
+
+template<int r, unsigned __int32 k>
+__forceinline const MD5Vector Step2(const MD5Vector a, const MD5Vector b, const MD5Vector c, const MD5Vector d) {
+    return StepOuter<r>(a, c, OP_ADD(Blend(d, c, b), OP_ADD(CREATE_VECTOR(k), a)));
+}
+
+template<int r, unsigned __int32 k>
+__forceinline const MD5Vector Step3(const MD5Vector a, const MD5Vector b, const MD5Vector c, const MD5Vector d, const MD5Vector w) {
+    return StepOuter<r>(a, b, OP_ADD(Xor(b, c, d), OP_ADD(CREATE_VECTOR(k), OP_ADD(a, w))));
+}
+
+template<int r, unsigned __int32 k>
+__forceinline const MD5Vector Step3(const MD5Vector a, const MD5Vector b, const MD5Vector c, const MD5Vector d) {
+    return StepOuter<r>(a, b, OP_ADD(Xor(b, c, d), OP_ADD(CREATE_VECTOR(k), a)));
+}
+
+template<int r, unsigned __int32 k>
+__forceinline const MD5Vector Step4(const MD5Vector a, const MD5Vector b, const MD5Vector c, const MD5Vector d, const MD5Vector w) {
+    return StepOuter<r>(a, b, OP_ADD(I(c, b, d), OP_ADD(CREATE_VECTOR(k), OP_ADD(a, w))));
+}
+
+template<int r, unsigned __int32 k>
+__forceinline const MD5Vector Step4(const MD5Vector a, const MD5Vector b, const MD5Vector c, const MD5Vector d) {
+    return StepOuter<r>(a, b, OP_ADD(I(c, b, d), OP_ADD(CREATE_VECTOR(k), a)));
+}
+
+void md5(unsigned __int32 * input, unsigned __int32 * expected)
+{
+    MD5Vector a = CREATE_VECTOR(0x67452301);
+    MD5Vector b = CREATE_VECTOR(0xefcdab89);
+    MD5Vector c = CREATE_VECTOR(0x98badcfe);
+    MD5Vector d = CREATE_VECTOR(0x10325476);
+
+    MD5Vector inputVector0 = CREATE_VECTOR_FROM_INPUT(input, 0);
+    MD5Vector inputVector1 = CREATE_VECTOR_FROM_INPUT(input, 1);
+    MD5Vector inputVector2 = CREATE_VECTOR_FROM_INPUT(input, 2);
+    MD5Vector inputVector3 = CREATE_VECTOR_FROM_INPUT(input, 3);
+    MD5Vector inputVector4 = CREATE_VECTOR_FROM_INPUT(input, 4);
+    MD5Vector inputVector5 = CREATE_VECTOR_FROM_INPUT(input, 5);
+    MD5Vector inputVector6 = CREATE_VECTOR_FROM_INPUT(input, 6);
+    MD5Vector inputVector7 = CREATE_VECTOR_FROM_INPUT(input, 7);
+
+    a = Step1< 7, 0xd76aa478>(a, b, c, d, inputVector0);
+    d = Step1<12, 0xe8c7b756>(d, a, b, c, inputVector1);
+    c = Step1<17, 0x242070db>(c, d, a, b, inputVector2);
+    b = Step1<22, 0xc1bdceee>(b, c, d, a, inputVector3);
+    a = Step1< 7, 0xf57c0faf>(a, b, c, d, inputVector4);
+    d = Step1<12, 0x4787c62a>(d, a, b, c, inputVector5);
+    c = Step1<17, 0xa8304613>(c, d, a, b, inputVector6);
+    b = Step1<22, 0xfd469501>(b, c, d, a);
+    a = Step1< 7, 0x698098d8>(a, b, c, d);
+    d = Step1<12, 0x8b44f7af>(d, a, b, c);
+    c = Step1<17, 0xffff5bb1>(c, d, a, b);
+    b = Step1<22, 0x895cd7be>(b, c, d, a);
+    a = Step1< 7, 0x6b901122>(a, b, c, d);
+    d = Step1<12, 0xfd987193>(d, a, b, c);
+    c = Step1<17, 0xa679438e>(c, d, a, b, inputVector7);
+    b = Step1<22, 0x49b40821>(b, c, d, a);
+
+    a = Step2< 5, 0xf61e2562>(a, d, b, c, inputVector1);
+    d = Step2< 9, 0xc040b340>(d, c, a, b, inputVector6);
+    c = Step2<14, 0x265e5a51>(c, b, d, a);
+    b = Step2<20, 0xe9b6c7aa>(b, a, c, d, inputVector0);
+    a = Step2< 5, 0xd62f105d>(a, d, b, c, inputVector5);
+    d = Step2< 9, 0x02441453>(d, c, a, b);
+    c = Step2<14, 0xd8a1e681>(c, b, d, a);
+    b = Step2<20, 0xe7d3fbc8>(b, a, c, d, inputVector4);
+    a = Step2< 5, 0x21e1cde6>(a, d, b, c);
+    d = Step2< 9, 0xc33707d6>(d, c, a, b, inputVector7);
+    c = Step2<14, 0xf4d50d87>(c, b, d, a, inputVector3);
+    b = Step2<20, 0x455a14ed>(b, a, c, d);
+    a = Step2< 5, 0xa9e3e905>(a, d, b, c);
+    d = Step2< 9, 0xfcefa3f8>(d, c, a, b, inputVector2);
+    c = Step2<14, 0x676f02d9>(c, b, d, a);
+    b = Step2<20, 0x8d2a4c8a>(b, a, c, d);
+
+    a = Step3< 4, 0xfffa3942>(a, b, c, d, inputVector5);
+    d = Step3<11, 0x8771f681>(d, a, b, c);
+    c = Step3<16, 0x6d9d6122>(c, d, a, b);
+    b = Step3<23, 0xfde5380c>(b, c, d, a, inputVector7);
+    a = Step3< 4, 0xa4beea44>(a, b, c, d, inputVector1);
+    d = Step3<11, 0x4bdecfa9>(d, a, b, c, inputVector4);
+    c = Step3<16, 0xf6bb4b60>(c, d, a, b);
+    b = Step3<23, 0xbebfbc70>(b, c, d, a);
+    a = Step3< 4, 0x289b7ec6>(a, b, c, d);
+    d = Step3<11, 0xeaa127fa>(d, a, b, c, inputVector0);
+    c = Step3<16, 0xd4ef3085>(c, d, a, b, inputVector3);
+    b = Step3<23, 0x04881d05>(b, c, d, a, inputVector6);
+    a = Step3< 4, 0xd9d4d039>(a, b, c, d);
+    d = Step3<11, 0xe6db99e5>(d, a, b, c);
+    c = Step3<16, 0x1fa27cf8>(c, d, a, b);
+    b = Step3<23, 0xc4ac5665>(b, c, d, a, inputVector2);
+
+    a = Step4< 6, 0xf4292244>(a, b, c, d, inputVector0);
+    d = Step4<10, 0x432aff97>(d, a, b, c);
+    c = Step4<15, 0xab9423a7>(c, d, a, b, inputVector7);
+    b = Step4<21, 0xfc93a039>(b, c, d, a, inputVector5);
+    a = Step4< 6, 0x655b59c3>(a, b, c, d);
+    d = Step4<10, 0x8f0ccc92>(d, a, b, c, inputVector3);
+    c = Step4<15, 0xffeff47d>(c, d, a, b);
+    b = Step4<21, 0x85845dd1>(b, c, d, a, inputVector1);
+    a = Step4< 6, 0x6fa87e4f>(a, b, c, d);
+    d = Step4<10, 0xfe2ce6e0>(d, a, b, c);
+    c = Step4<15, 0xa3014314>(c, d, a, b, inputVector6);
+    b = Step4<21, 0x4e0811a1>(b, c, d, a);
+    a = Step4< 6, 0xf7537e82>(a, b, c, d, inputVector4);
+
+    a = OP_ADD(CREATE_VECTOR(0x67452301), a);
+
+    WRITE_TO_OUTPUT(a, ((__int32*)input), ((__m256i*)expected));
+}
+#pragma managed
--- a/dotnet/WhiteRabbit.UnmanagedBridge/md5.h
+++ b/dotnet/WhiteRabbit.UnmanagedBridge/md5.h
@ -0,0 +1,3 @@
+#pragma once
+
+void md5(unsigned int* input, unsigned __int32 * expected);
--- a/dotnet/WhiteRabbit.UnmanagedBridge/phraseset.cpp
+++ b/dotnet/WhiteRabbit.UnmanagedBridge/phraseset.cpp
@ -0,0 +1,86 @@
+#include "stdafx.h"
+#include "phraseset.h"
+#include "constants.h"
+
+#include "intrin.h"
+
+#pragma unmanaged
+
+template<int numberOfWords>
+class Processor
+{
+public:
+    template<int wordNumber>
+    static __forceinline const __m256i ProcessWord(const __m256i phrase, const unsigned __int64 cumulativeWordOffset, const unsigned __int64 permutation, unsigned __int64* allWordsPointer, __int32* wordIndexes)
+    {
+        auto currentWord = allWordsPointer + wordIndexes[_bextr_u64(permutation, 4 * wordNumber, 4)] * 128;
+
+        return ProcessWord<wordNumber + 1>(
+            _mm256_xor_si256(phrase, *(__m256i*)(currentWord + cumulativeWordOffset)),
+            cumulativeWordOffset + currentWord[127],
+            permutation,
+            allWordsPointer,
+            wordIndexes);
+    }
+
+    template<>
+    static __forceinline const __m256i ProcessWord<numberOfWords>(const __m256i phrase, const unsigned __int64 cumulativeWordOffset, const unsigned __int64 permutation, unsigned __int64* allWordsPointer, __int32* wordIndexes)
+    {
+        return phrase;
+    }
+
+    template<int phraseNumber>
+    static __forceinline void ProcessWordsForPhrase(__m256i* avx2initialBuffer, __m256i* avx2buffer, unsigned __int64* allWordsPointer, __int32* wordIndexes, unsigned __int64* permutationsPointer)
+    {
+        avx2buffer[phraseNumber] = ProcessWord<0>(*avx2initialBuffer, 0, permutationsPointer[phraseNumber], allWordsPointer, wordIndexes);
+        ProcessWordsForPhrase<phraseNumber + 1>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
+    }
+
+    template<>
+    static __forceinline void ProcessWordsForPhrase<PHRASES_PER_SET>(__m256i* avx2initialBuffer, __m256i* avx2buffer, unsigned __int64* allWordsPointer, __int32* wordIndexes, unsigned __int64* permutationsPointer)
+    {
+        return;
+    }
+};
+
+void fillPhraseSet(unsigned __int64* initialBufferPointer, unsigned __int64* bufferPointer, unsigned __int64* allWordsPointer, __int32* wordIndexes, unsigned __int64* permutationsPointer, int numberOfWords)
+{
+    auto avx2initialBuffer = (__m256i*)initialBufferPointer;
+    auto avx2buffer = (__m256i*)bufferPointer;
+
+    switch (numberOfWords)
+    {
+    case 1:
+        Processor<1>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
+        break;
+    case 2:
+        Processor<2>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
+        break;
+    case 3:
+        Processor<3>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
+        break;
+    case 4:
+        Processor<4>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
+        break;
+    case 5:
+        Processor<5>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
+        break;
+    case 6:
+        Processor<6>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
+        break;
+    case 7:
+        Processor<7>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
+        break;
+    case 8:
+        Processor<8>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
+        break;
+    case 9:
+        Processor<9>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
+        break;
+    case 10:
+        Processor<10>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
+        break;
+    }
+}
+
+#pragma managed
--- a/dotnet/WhiteRabbit.UnmanagedBridge/phraseset.h
+++ b/dotnet/WhiteRabbit.UnmanagedBridge/phraseset.h
@ -0,0 +1,3 @@
+#pragma once
+
+void fillPhraseSet(unsigned __int64* initialBufferPointer, unsigned __int64* bufferPointer, unsigned __int64* allWordsPointer, __int32* wordIndexes, unsigned __int64* permutationsPointer, int numberOfWords);
--- a/dotnet/WhiteRabbit.UnmanagedBridge/resource.h
+++ b/dotnet/WhiteRabbit.UnmanagedBridge/resource.h
@ -0,0 +1,3 @@
+//{{NO_DEPENDENCIES}}
+// Microsoft Visual C++ generated include file.
+// Used by app.rc
--- a/dotnet/WhiteRabbit/App.config
+++ b/dotnet/WhiteRabbit/App.config
@ -0,0 +1,11 @@
+<?xml version="1.0" encoding="utf-8"?>
+<configuration>
+  <startup>
+    <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.7"/>
+  </startup>
+  <appSettings>
+    <add key="SourcePhrase" value="poultry outwits ants"/>
+    <add key="MaxWordsInPhrase" value="5"/>
+    <add key="ExpectedHashes" value="e4820b45d2277f3844eac66c903e84be,23170acc097c24edb98fc5488ab033fe,665e5bcb0c20062fe8abaaf4628bb154,e8a2cbb6206fc937082bb92e4ed9cd3d,74a613b8c64fb216dc22d4f2bd4965f4,ccb5ed231ba04d750c963668391d1e61,d864ae0e66c89cb78345967cb2f3ab6b,2b56477105d91076030e877c94dd9776,732442feac8b5013e16a776486ac5447"/>
+  </appSettings>
+</configuration>
--- a/dotnet/WhiteRabbit/ByteArrayEqualityComparer.cs
+++ b/dotnet/WhiteRabbit/ByteArrayEqualityComparer.cs
@ -0,0 +1,39 @@
+namespace WhiteRabbit
+{
+    using System.Collections.Generic;
+    using System.Linq;
+
+    internal class ByteArrayEqualityComparer : IEqualityComparer<byte[]>
+    {
+        public bool Equals(byte[] x, byte[] y)
+        {
+            if (object.ReferenceEquals(x, y))
+            {
+                return true;
+            }
+
+            if (x?.Length != y?.Length)
+            {
+                return false;
+            }
+
+            return Enumerable.Range(0, x.Length).All(i => x[i] == y[i]);
+        }
+
+        public int GetHashCode(byte[] obj)
+        {
+            if (obj == null)
+            {
+                return 0;
+            }
+
+            int result = 0;
+            for (var i = 0; i < obj.Length; i++)
+            {
+                result = unchecked(result + (i * obj[i]));
+            }
+
+            return result;
+        }
+    }
+}
--- a/dotnet/WhiteRabbit/Constants.cs
+++ b/dotnet/WhiteRabbit/Constants.cs
@ -0,0 +1,11 @@
+namespace WhiteRabbit
+{
+    internal class Constants
+    {
+        public const int PhrasesPerSet = WhiteRabbitUnmanagedBridge.MD5Unmanaged.PhrasesPerSet;
+
+        public const int MaxNumberOfWords = 8;
+
+        public const int NumberOfThreads = 4;
+    }
+}
--- a/dotnet/WhiteRabbit/Flattener.cs
+++ b/dotnet/WhiteRabbit/Flattener.cs
@ -0,0 +1,198 @@
+namespace WhiteRabbit
+{
+    using System;
+    using System.Collections.Generic;
+    using System.Collections.Immutable;
+    using System.Linq;
+
+    /// <summary>
+    /// Converts e.g. a pair of variants [[a, b, c], [d, e]] into all possible pairs: [[a, d], [a, e], [b, d], [b, e], [c, d], [c, e]]
+    /// </summary>
+    internal static class Flattener
+    {
+        private static IEnumerable<T[]> Flatten3<T>(T[][] phrase)
+        {
+            foreach (var item0 in phrase[0])
+            foreach (var item1 in phrase[1])
+            foreach (var item2 in phrase[2])
+                yield return new T[]
+                {
+                    item0,
+                    item1,
+                    item2,
+                };
+        }
+
+        private static IEnumerable<T[]> Flatten4<T>(T[][] phrase)
+        {
+            foreach (var item0 in phrase[0])
+            foreach (var item1 in phrase[1])
+            foreach (var item2 in phrase[2])
+            foreach (var item3 in phrase[3])
+                yield return new T[]
+                {
+                    item0,
+                    item1,
+                    item2,
+                    item3,
+                };
+        }
+
+        private static IEnumerable<T[]> Flatten5<T>(T[][] phrase)
+        {
+            foreach (var item0 in phrase[0])
+            foreach (var item1 in phrase[1])
+            foreach (var item2 in phrase[2])
+            foreach (var item3 in phrase[3])
+            foreach (var item4 in phrase[4])
+                yield return new T[]
+                {
+                    item0,
+                    item1,
+                    item2,
+                    item3,
+                    item4,
+                };
+        }
+
+        private static IEnumerable<T[]> Flatten6<T>(T[][] phrase)
+        {
+            foreach (var item0 in phrase[0])
+            foreach (var item1 in phrase[1])
+            foreach (var item2 in phrase[2])
+            foreach (var item3 in phrase[3])
+            foreach (var item4 in phrase[4])
+            foreach (var item5 in phrase[5])
+                yield return new T[]
+                {
+                    item0,
+                    item1,
+                    item2,
+                    item3,
+                    item4,
+                    item5,
+                };
+        }
+
+        private static IEnumerable<T[]> Flatten7<T>(T[][] phrase)
+        {
+            foreach (var item0 in phrase[0])
+            foreach (var item1 in phrase[1])
+            foreach (var item2 in phrase[2])
+            foreach (var item3 in phrase[3])
+            foreach (var item4 in phrase[4])
+            foreach (var item5 in phrase[5])
+            foreach (var item6 in phrase[6])
+                yield return new T[]
+                {
+                    item0,
+                    item1,
+                    item2,
+                    item3,
+                    item4,
+                    item5,
+                    item6,
+                };
+        }
+
+        private static IEnumerable<T[]> Flatten8<T>(T[][] phrase)
+        {
+            foreach (var item0 in phrase[0])
+            foreach (var item1 in phrase[1])
+            foreach (var item2 in phrase[2])
+            foreach (var item3 in phrase[3])
+            foreach (var item4 in phrase[4])
+            foreach (var item5 in phrase[5])
+            foreach (var item6 in phrase[6])
+            foreach (var item7 in phrase[7])
+                yield return new T[]
+                {
+                    item0,
+                    item1,
+                    item2,
+                    item3,
+                    item4,
+                    item5,
+                    item6,
+                    item7,
+                };
+        }
+
+        private static IEnumerable<T[]> Flatten9<T>(T[][] phrase)
+        {
+            foreach (var item0 in phrase[0])
+            foreach (var item1 in phrase[1])
+            foreach (var item2 in phrase[2])
+            foreach (var item3 in phrase[3])
+            foreach (var item4 in phrase[4])
+            foreach (var item5 in phrase[5])
+            foreach (var item6 in phrase[6])
+            foreach (var item7 in phrase[7])
+            foreach (var item8 in phrase[8])
+                yield return new T[]
+                {
+                    item0,
+                    item1,
+                    item2,
+                    item3,
+                    item4,
+                    item5,
+                    item6,
+                    item7,
+                    item8,
+                };
+        }
+
+        private static IEnumerable<T[]> Flatten10<T>(T[][] phrase)
+        {
+            foreach (var item0 in phrase[0])
+            foreach (var item1 in phrase[1])
+            foreach (var item2 in phrase[2])
+            foreach (var item3 in phrase[3])
+            foreach (var item4 in phrase[4])
+            foreach (var item5 in phrase[5])
+            foreach (var item6 in phrase[6])
+            foreach (var item7 in phrase[7])
+            foreach (var item8 in phrase[8])
+            foreach (var item9 in phrase[9])
+                yield return new T[]
+                {
+                    item0,
+                    item1,
+                    item2,
+                    item3,
+                    item4,
+                    item5,
+                    item6,
+                    item7,
+                    item8,
+                    item9,
+                };
+        }
+
+        public static IEnumerable<T[]> Flatten<T>(T[][] wordVariants)
+        {
+            switch (wordVariants.Length)
+            {
+                case 3:
+                    return Flatten3(wordVariants);
+                case 4:
+                    return Flatten4(wordVariants);
+                case 5:
+                    return Flatten5(wordVariants);
+                case 6:
+                    return Flatten6(wordVariants);
+                case 7:
+                    return Flatten7(wordVariants);
+                case 8:
+                    return Flatten8(wordVariants);
+                case 9:
+                    return Flatten9(wordVariants);
+                case 10:
+                    return Flatten10(wordVariants);
+                default:
+                    throw new ArgumentOutOfRangeException(nameof(wordVariants), wordVariants.Length, "Only 3 to 10 word variant lists are supported");
+            }
+        }
+    }
+}
--- a/dotnet/WhiteRabbit/PermutationsGenerator.cs
+++ b/dotnet/WhiteRabbit/PermutationsGenerator.cs
@ -8,7 +8,7 @@
    /// <summary>
    /// Code taken from https://ericlippert.com/2013/04/22/producing-permutations-part-three/
    /// </summary>
-    internal class PermutationsGenerator
+    internal sealed class PermutationsGenerator
    {
        public static IEnumerable<Permutation> HamiltonianPermutations(int n)
        {
@ -34,9 +34,7 @@

            public static Permutation Empty { get; } = new Permutation(new int[] { });

-            private int[] PermutationData { get; }
-
-            public int this[int index] => this.PermutationData[index];
+            public int[] PermutationData { get; }

            public static IEnumerable<Permutation> HamiltonianPermutationsIterator(int n)
            {
--- a/dotnet/WhiteRabbit/PhraseSet.cs
+++ b/dotnet/WhiteRabbit/PhraseSet.cs
@ -0,0 +1,126 @@
+namespace WhiteRabbit
+{
+    using System;
+    using System.Diagnostics;
+    using System.Linq;
+    using System.Numerics;
+    using System.Runtime.CompilerServices;
+    using WhiteRabbitUnmanagedBridge;
+
+    // Anagram representation optimized for MD5
+    internal struct PhraseSet
+    {
+        private uint[] Buffer;
+
+        public void Init()
+        {
+            this.Buffer = new uint[8 * Constants.PhrasesPerSet];
+        }
+
+        public unsafe void FillLength(int numberOfCharacters, int numberOfWords)
+        {
+            fixed (uint* bufferPointer = this.Buffer)
+            {
+                var length = (uint)(numberOfCharacters + numberOfWords - 1);
+                var lengthInBits = length << 3;
+                for (var i = 0; i < Constants.PhrasesPerSet; i++)
+                {
+                    bufferPointer[7 + i * 8] = lengthInBits;
+                    ((byte*)bufferPointer)[length + i * 32] = 128 ^ ' ';
+                }
+            }
+        }
+
+        public unsafe void ProcessPermutations(PhraseSet initialPhraseSet, Word[] allWords, int[] wordIndexes, ulong[] permutations, uint[] expectedHashesVector, Action<byte[], uint> action)
+        {
+            fixed (uint* bufferPointer = this.Buffer, initialBufferPointer = initialPhraseSet.Buffer)
+            {
+                fixed (ulong* permutationsPointer = permutations)
+                {
+                    fixed (int* wordIndexesPointer = wordIndexes)
+                    {
+                        fixed (Word* allWordsPointer = allWords)
+                        {
+                            fixed (uint* expectedHashesPointer = expectedHashesVector)
+                            {
+                                for (var i = 0; i < permutations.Length; i += Constants.PhrasesPerSet)
+                                {
+                                    MD5Unmanaged.FillPhraseSet(
+                                        (ulong*)initialBufferPointer,
+                                        (ulong*)bufferPointer,
+                                        (ulong*)allWordsPointer,
+                                        wordIndexesPointer,
+                                        permutationsPointer + i,
+                                        wordIndexes.Length);
+
+                                    MD5Unmanaged.ComputeMD5(bufferPointer, expectedHashesPointer);
+
+                                    if (bufferPointer[Constants.PhrasesPerSet / 2] != 0xFFFFFFFF)
+                                    {
+                                        for (var j = 0; j < Constants.PhrasesPerSet; j++)
+                                        {
+                                            // 16 matches are packed in 8 32-bit numbers: [0, 1], [8, 9], [2, 3], [10, 11], [4, 5], [12, 13], [6, 7], [14, 15]
+                                            var position = ((j / 2) % 4) * 2 + (j / 8);
+                                            var match = (bufferPointer[position] >> (4 * (j % 2))) & 0xF0F0F0F;
+                                            if (match != 0)
+                                            {
+                                                // The buffer now holds match flags rather than phrase text, so refill it before reading the matched phrase back
+                                                MD5Unmanaged.FillPhraseSet(
+                                                    (ulong*)initialBufferPointer,
+                                                    (ulong*)bufferPointer,
+                                                    (ulong*)allWordsPointer,
+                                                    wordIndexesPointer,
+                                                    permutationsPointer + i,
+                                                    wordIndexes.Length);
+                                                action(this.GetBytes(j), match);
+                                                break;
+                                            }
+                                        }
+                                    }
+                                }
+                            }
+                        }
+                    }
+                }
+            }
+        }
+
+        public unsafe byte[] GetBytes(int number)
+        {
+            Debug.Assert(number < Constants.PhrasesPerSet);
+
+            fixed (uint* bufferPointer = this.Buffer)
+            {
+                var phrasePointer = bufferPointer + 8 * number;
+                var length = 0;
+                for (var i = 27; i >= 0; i--)
+                {
+                    if (((byte*)phrasePointer)[i] == 128)
+                    {
+                        length = i;
+                        break;
+                    }
+                }
+
+                var result = new byte[length];
+                for (var i = 0; i < length; i++)
+                {
+                    result[i] = ((byte*)phrasePointer)[i];
+                }
+
+                return result;
+            }
+        }
+
+        public unsafe string DebugBytes(int number)
+        {
+            Debug.Assert(number < Constants.PhrasesPerSet);
+
+            fixed (uint* bufferPointer = this.Buffer)
+            {
+                var bytes = (byte*)bufferPointer;
+                return string.Concat(Enumerable.Range(32 * number, 32).Select(i => bytes[i].ToString("X2")));
+            }
+        }
+    }
+}
--- a/dotnet/WhiteRabbit/PrecomputedPermutationsGenerator.cs
+++ b/dotnet/WhiteRabbit/PrecomputedPermutationsGenerator.cs
@ -0,0 +1,157 @@
+namespace WhiteRabbit
+{
+    using System;
+    using System.Collections.Generic;
+    using System.Linq;
+
+    internal static class PrecomputedPermutationsGenerator
+    {
+        static PrecomputedPermutationsGenerator()
+        {
+            Permutations = new ulong[Constants.MaxNumberOfWords + 1][][];
+            PermutationsNumbers = new long[Constants.MaxNumberOfWords + 1][];
+            for (var i = 0; i <= Constants.MaxNumberOfWords; i++)
+            {
+                var permutationsInfo = GeneratePermutations(i);
+                Permutations[i] = permutationsInfo.Item1;
+                PermutationsNumbers[i] = permutationsInfo.Item2;
+            }
+        }
+
+        private static ulong[][][] Permutations { get; }
+
+        private static long[][] PermutationsNumbers { get; }
+
+        public static ulong[] HamiltonianPermutations(int n, uint filter) => Permutations[n][filter];
+
+        public static long GetPermutationsNumber(int n, uint filter) => PermutationsNumbers[n][filter];
+
+        private static Tuple<ulong[][], long[]> GeneratePermutations(int n)
+        {
+            if (n == 0)
+            {
+                return Tuple.Create(new ulong[0][], new long[0]);
+            }
+
+            var allPermutations = PermutationsGenerator.HamiltonianPermutations(n)
+                .Select(FormatPermutation)
+                .ToArray();
+
+            var statesCount = (uint)1 << (n - 1);
+            var resultUnpadded = new PermutationInfo[statesCount][];
+
+            resultUnpadded[0] = allPermutations;
+            for (uint i = 1; i < statesCount; i++)
+            {
+                var mask = i;
+                mask |= mask >> 1;
+                mask |= mask >> 2;
+                mask |= mask >> 4;
+                mask |= mask >> 8;
+                mask |= mask >> 16;
+                mask = mask >> 1;
+                var existing = i & mask;
+                var seniorBit = i ^ existing;
+                var position = 0;
+                while (seniorBit != 0)
+                {
+                    seniorBit = seniorBit >> 1;
+                    position++;
+                }
+
+                resultUnpadded[i] = resultUnpadded[existing]
+                    .Where(info => ((info.PermutationInverse >> (4 * (position - 1))) % 16 < (info.PermutationInverse >> (4 * position)) % 16))
+                    .ToArray();
+            }
+
+            var result = new ulong[statesCount][];
+            var numbers = new long[statesCount];
+            for (uint i = 0; i < statesCount; i++)
+            {
+                result[i] = PadToWholeChunks(resultUnpadded[i], Constants.PhrasesPerSet);
+                numbers[i] = resultUnpadded[i].LongLength;
+            }
+
+            return Tuple.Create(result, numbers);
+        }
+
+        public static bool IsOrderPreserved(ulong permutation, uint position)
+        {
+            var currentPermutation = permutation;
+
+            while (currentPermutation != 0)
+            {
+                if ((currentPermutation & 15) == position)
+                {
+                    return true;
+                }
+
+                if ((currentPermutation & 15) == (position + 1))
+                {
+                    return false;
+                }
+
+                currentPermutation = currentPermutation >> 4;
+            }
+
+            throw new ApplicationException("Malformed permutation " + permutation + " for position " + position);
+        }
+
+        private static ulong[] PadToWholeChunks(PermutationInfo[] original, int chunkSize)
+        {
+            ulong[] result;
+            if (original.Length % chunkSize == 0)
+            {
+                result = new ulong[original.Length];
+            }
+            else
+            {
+                result = new ulong[original.Length + chunkSize - (original.Length % chunkSize)];
+            }
+
+            for (var i = 0; i < original.Length; i++)
+            {
+                result[i] = original[i].Permutation;
+            }
+
+            return result;
+        }
+
+        private static PermutationInfo FormatPermutation(PermutationsGenerator.Permutation permutation)
+        {
+            System.Diagnostics.Debug.Assert(permutation.PermutationData.Length <= 16);
+
+            ulong result = 0;
+            ulong resultInverse = 0;
+            for (var i = 0; i < permutation.PermutationData.Length; i++)
+            {
+                var source = i;
+                var target = permutation.PermutationData[i];
+                result |= (ulong)(target) << (4 * source);
+                resultInverse |= (ulong)(source) << (4 * target);
+            }
+
+            return new PermutationInfo { Permutation = result, PermutationInverse = resultInverse };
+        }
+
+        private static IEnumerable<long> GeneratePermutationsNumbers()
+        {
+            long result = 1;
+            yield return result;
+
+            var i = 1;
+            while (true)
+            {
+                result *= i;
+                yield return result;
+                i++;
+            }
+        }
+
+        private struct PermutationInfo
+        {
+            public ulong Permutation;
+            public ulong PermutationInverse;
+        }
+    }
+}
--- a/dotnet/WhiteRabbit/Program.cs
+++ b/dotnet/WhiteRabbit/Program.cs
@ -0,0 +1,122 @@
+namespace WhiteRabbit
+{
+    using System;
+    using System.Collections.Generic;
+    using System.Configuration;
+    using System.Diagnostics;
+    using System.IO;
+    using System.Linq;
+    using System.Numerics;
+    using System.Security.Cryptography;
+    using System.Text;
+
+    /// <summary>
+    /// Main class
+    /// </summary>
+    public static class Program
+    {
+        /// <summary>
+        /// Main entry point
+        /// </summary>
+        public static void Main()
+        {
+            var stopwatch = new Stopwatch();
+            stopwatch.Start();
+
+            var sourcePhrase = ConfigurationManager.AppSettings["SourcePhrase"];
+
+            var maxWordsInPhrase = int.Parse(ConfigurationManager.AppSettings["MaxWordsInPhrase"]);
+
+            if (sourcePhrase.Count(ch => ch != ' ') + maxWordsInPhrase > 28)
+            {
+                Console.WriteLine("Only anagrams of up to 27 characters (including whitespace) are allowed");
+                return;
+            }
+
+            if (maxWordsInPhrase > Constants.MaxNumberOfWords)
+            {
+                Console.WriteLine($"Only anagrams of up to {Constants.MaxNumberOfWords} words are allowed");
+                return;
+            }
+
+            if (!BitConverter.IsLittleEndian)
+            {
+                Console.WriteLine("Only little-endian systems are supported due to MD5Digest optimizations");
+                return;
+            }
+
+            if (IntPtr.Size != 8)
+            {
+                Console.WriteLine("Only 64-bit systems are supported due to MD5Digest optimizations");
+                return;
+            }
+
+            var expectedHashesFirstComponentsArray = new uint[8];
+            {
+                int i = 0;
+                foreach (var expectedHash in ConfigurationManager.AppSettings["ExpectedHashes"].Split(','))
+                {
+                    var firstComponent = HexadecimalStringToUnsignedIntArray(expectedHash)[0];
+                    expectedHashesFirstComponentsArray[i] = firstComponent;
+                    expectedHashesFirstComponentsArray[i + 1] = firstComponent;
+                    i += 2;
+                }
+            }
+
+            var processor = new StringsProcessor(
+                Encoding.ASCII.GetBytes(sourcePhrase),
+                maxWordsInPhrase,
+                ReadInput());
+
+            Console.WriteLine($"Initialization complete; time from start: {stopwatch.Elapsed}");
+
+#if DEBUG
+            var fastPhrasesCount = processor.GetPhrasesCount();
+            Console.WriteLine($"Number of phrases: {fastPhrasesCount}; time from start: {stopwatch.Elapsed}");
+#endif
+
+            stopwatch.Restart();
+
+            processor.CheckPhrases(expectedHashesFirstComponentsArray, (phraseBytes, hashFirstComponent) =>
+            {
+                var phrase = Encoding.ASCII.GetString(phraseBytes);
+                var hash = ComputeFullMD5(phraseBytes);
+                Console.WriteLine($"Found phrase for {hash} ({hashFirstComponent:x8}): {phrase}; time from start is {stopwatch.Elapsed}");
+            });
+
+            Console.WriteLine($"Done; time from start: {stopwatch.Elapsed}");
+        }
+
+        // Code taken from http://stackoverflow.com/a/321404/831314
+        private static uint[] HexadecimalStringToUnsignedIntArray(string hex)
+        {
+            return Enumerable.Range(0, hex.Length)
+                             .Where(x => x % 8 == 0)
+                             .Select(x => ChangeEndianness(hex.Substring(x, 8)))
+                             .Select(hexLe => Convert.ToUInt32(hexLe, 16))
+                             .ToArray();
+        }
+
+        // We can afford to spend some time here; this code will only run for matched phrases (and for one in several billion non-matched)
+        private static string ComputeFullMD5(byte[] phraseBytes)
+        {
+            using (var hashAlgorithm = new MD5CryptoServiceProvider())
+            {
+                var resultBytes = hashAlgorithm.ComputeHash(phraseBytes);
+                return string.Concat(resultBytes.Select(b => b.ToString("x2")));
+            }
+        }
+
+        private static string ChangeEndianness(string hex)
+        {
+            return hex.Substring(6, 2) + hex.Substring(4, 2) + hex.Substring(2, 2) + hex.Substring(0, 2);
+        }
+
+        private static IEnumerable<byte[]> ReadInput()
+        {
+            string line;
+            while ((line = Console.ReadLine()) != null)
+            {
+                yield return Encoding.ASCII.GetBytes(line);
+            }
+        }
+    }
+}
--- a/dotnet/WhiteRabbit/Properties/AssemblyInfo.cs
+++ b/dotnet/WhiteRabbit/Properties/AssemblyInfo.cs
--- a/dotnet/WhiteRabbit/StringsProcessor.cs
+++ b/dotnet/WhiteRabbit/StringsProcessor.cs
@ -0,0 +1,143 @@
+namespace WhiteRabbit
+{
+    using System;
+    using System.Collections.Generic;
+    using System.Linq;
+    using System.Numerics;
+    using System.Threading.Tasks;
+
+    internal sealed class StringsProcessor
+    {
+        private const byte SPACE = 32;
+
+        // Ensure that permutations are precomputed prior to main run, so that processing times will be correct
+        static StringsProcessor()
+        {
+            PrecomputedPermutationsGenerator.HamiltonianPermutations(1, 0);
+        }
+
+        public StringsProcessor(byte[] sourceString, int maxWordsCount, IEnumerable<byte[]> words)
+        {
+            var filteredSource = sourceString.Where(ch => ch != SPACE).ToArray();
+            this.NumberOfCharacters = filteredSource.Length;
+            this.VectorsConverter = new VectorsConverter(filteredSource);
+
+            var allWordsAndVectors = words
+                .Where(word => word != null && word.Length > 0)
+                .Select(word => new { word, vector = this.VectorsConverter.GetVector(word) })
+                .Where(tuple => tuple.vector != null)
+                .Select(tuple => tuple.word)
+                .Distinct(new ByteArrayEqualityComparer())
+                .ToArray();
+
+            // Dictionary of vectors to array of words represented by this vector
+            var vectorsToWords = allWordsAndVectors
+                .Select((word, index) => new { word, index, vector = this.VectorsConverter.GetVector(word).Value })
+                .GroupBy(tuple => tuple.vector)
+                .Select(group => new { vector = group.Key, words = group.Select(tuple => tuple.index).ToArray() })
+                .ToList();
+
+            this.WordsDictionary = vectorsToWords.Select(tuple => tuple.words).ToArray();
+
+            this.AllWords = allWordsAndVectors.Select(word => new Word(word)).ToArray();
+
+            this.VectorsProcessor = new VectorsProcessor(
+                this.VectorsConverter.GetVector(filteredSource).Value,
+                maxWordsCount,
+                vectorsToWords.Select(tuple => tuple.vector).ToArray());
+        }
+
+        private VectorsConverter VectorsConverter { get; }
+
+        private Word[] AllWords { get; }
+
+        /// <summary>
+        /// WordsDictionary[vectorIndex] = [word1index, word2index, ...]
+        /// </summary>
+        private int[][] WordsDictionary { get; }
+
+        private VectorsProcessor VectorsProcessor { get; }
+
+        private int NumberOfCharacters { get; }
+
+        public void CheckPhrases(uint[] expectedHashesVector, Action<byte[], uint> action)
+        {
+            // Finding anagrams reduces to finding sequences of dictionary vectors that sum to the target vector
+            var sums = this.VectorsProcessor.GenerateSequences();
+
+            // Convert each sequence of vectors to sequences of words and check their hashes in parallel
+            Parallel.ForEach(sums, new ParallelOptions { MaxDegreeOfParallelism = Constants.NumberOfThreads }, sum => ProcessSum(sum, expectedHashesVector, action));
+        }
+
+        public long GetPhrasesCount()
+        {
+            var sums = this.VectorsProcessor.GenerateSequences();
+            return (from sum in sums
+                    let filter = ComputeFilter(sum)
+                    let wordsVariantsNumber = this.ConvertVectorsToWordsNumber(sum)
+                    let permutationsNumber = PrecomputedPermutationsGenerator.GetPermutationsNumber(sum.Length, filter)
+                    let total = wordsVariantsNumber * permutationsNumber
+                    select total)
+                    .Sum();
+        }
+
+        private static uint ComputeFilter(int[] vectors)
+        {
+            uint result = 0;
+            for (var i = 1; i < vectors.Length; i++)
+            {
+                if (vectors[i] == vectors[i - 1])
+                {
+                    result |= (uint)1 << (i - 1);
+                }
+            }
+
+            return result;
+        }
+
+        private int[][] ConvertVectorsToWordIndexes(int[] vectors)
+        {
+            var length = vectors.Length;
+            var words = new int[length][];
+            for (var i = 0; i < length; i++)
+            {
+                words[i] = this.WordsDictionary[vectors[i]];
+            }
+
+            return words;
+        }
+
+        private long ConvertVectorsToWordsNumber(int[] vectors)
+        {
+            long result = 1;
+            for (var i = 0; i < vectors.Length; i++)
+            {
+                result *= this.WordsDictionary[vectors[i]].Length;
+            }
+
+            return result;
+        }
+
+        private void ProcessSum(int[] sum, uint[] expectedHashesVector, Action<byte[], uint> action)
+        {
+            var initialPhraseSet = new PhraseSet();
+            initialPhraseSet.Init();
+            initialPhraseSet.FillLength(this.NumberOfCharacters, sum.Length);
+            var phraseSet = new PhraseSet();
+            phraseSet.Init();
+            var permutationsFilter = ComputeFilter(sum);
+            var wordsVariants = this.ConvertVectorsToWordIndexes(sum);
+            foreach (var wordsArray in Flattener.Flatten(wordsVariants))
+            {
+                phraseSet.ProcessPermutations(
+                    initialPhraseSet,
+                    this.AllWords,
+                    wordsArray,
+                    PrecomputedPermutationsGenerator.HamiltonianPermutations(wordsArray.Length, permutationsFilter),
+                    expectedHashesVector,
+                    action);
+            }
+        }
+    }
+}
--- a/dotnet/WhiteRabbit/VectorsConverter.cs
+++ b/dotnet/WhiteRabbit/VectorsConverter.cs
@ -1,5 +1,6 @@
 namespace WhiteRabbit
 {
+    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Numerics;
@ -8,27 +9,33 @@
    /// Converts strings to vectors containing chars count, based on a source string.
    /// E.g. for source string "abc", string "a" is converted to [1, 0, 0], while string "bcb" is converted to [0, 2, 1].
    /// </summary>
-    internal class VectorsConverter
+    internal sealed class VectorsConverter
    {
-        public VectorsConverter(string sourceString)
+        public VectorsConverter(byte[] sourceString)
        {
            var rawNumberOfOccurrences = sourceString.GroupBy(ch => ch).ToDictionary(group => group.Key, group => group.Count());
            this.IntToChar = rawNumberOfOccurrences.OrderBy(kvp => kvp.Key).Select(kvp => kvp.Key).ToArray();
+
+            if (this.IntToChar.Length > Vector<byte>.Count)
+            {
+                throw new ArgumentException($"String should not contain more than {Vector<byte>.Count} different characters", nameof(sourceString));
+            }
+
            this.CharToInt = Enumerable.Range(0, this.IntToChar.Length).ToDictionary(i => this.IntToChar[i], i => i);
        }

-        private Dictionary<char, int> CharToInt { get; }
+        private Dictionary<byte, int> CharToInt { get; }

-        private char[] IntToChar { get; }
+        private byte[] IntToChar { get; }

-        public Vector<byte>? GetVector(string word)
+        public Vector<byte>? GetVector(byte[] word)
        {
            if (word.Any(ch => !this.CharToInt.ContainsKey(ch)))
            {
                return null;
            }

-            var arr = new byte[16];
+            var arr = new byte[Vector<byte>.Count];
            foreach (var ch in word)
            {
                arr[this.CharToInt[ch]]++;
@ -36,10 +43,5 @@

            return new Vector<byte>(arr);
        }
-
-        public string GetString(Vector<byte> vector)
-        {
-            return new string(Enumerable.Range(0, this.IntToChar.Length).SelectMany(i => Enumerable.Repeat(this.IntToChar[i], (int)vector[i])).ToArray());
-        }
    }
 }
--- a/dotnet/WhiteRabbit/VectorsProcessor.cs
+++ b/dotnet/WhiteRabbit/VectorsProcessor.cs
@ -0,0 +1,152 @@
+namespace WhiteRabbit
+{
+    using System;
+    using System.Collections.Generic;
+    using System.Collections.Immutable;
+    using System.Linq;
+    using System.Numerics;
+
+    internal sealed class VectorsProcessor
+    {
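+        private const byte MaxComponentValue = 8;
+
+        // 840 = lcm(1, 2, ..., 8): divisible by every allowed component value, so the division in GetVectorNorm is exact
+        private const int LeastCommonMultiple = 840;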
+
+        public VectorsProcessor(Vector<byte> target, int maxVectorsCount, Vector<byte>[] dictionary)
+        {
+            if (Enumerable.Range(0, Vector<byte>.Count).Any(i => target[i] > MaxComponentValue))
+            {
+                throw new ArgumentException($"Every value should be at most {MaxComponentValue} (at most {MaxComponentValue} same characters allowed in the source string)", nameof(target));
+            }
+
+            this.Target = target;
+
+            this.MaxVectorsCount = maxVectorsCount;
+            this.Dictionary = ImmutableArray.Create(FilterVectors(dictionary, target).ToArray());
+
+            var normsIndex = new int[GetVectorNorm(target, target) + 1];
+            var offset = 0;
+            for (var i = normsIndex.Length - 1; i >= 0; i--)
+            {
+                while (offset < this.Dictionary.Length && this.Dictionary[offset].Norm > i)
+                {
+                    offset++;
+                }
+
+                normsIndex[i] = offset;
+            }
+
+            this.NormsIndex = ImmutableArray.Create(normsIndex);
+        }
+
+        private Vector<byte> Target { get; }
+
+        private int MaxVectorsCount { get; }
+
+        private ImmutableArray<VectorInfo> Dictionary { get; }
+
+        // NormsIndex[n] stores the index of the first vector in Dictionary whose norm is less than or equal to n
+        private ImmutableArray<int> NormsIndex { get; }
+
+        // Produces all sets of vectors with the target sum
+        public IEnumerable<int[]> GenerateSequences()
+        {
+            return this.GenerateUnorderedSequences(this.Target, GetVectorNorm(this.Target, this.Target), this.MaxVectorsCount, 0)
+                .Select(Enumerable.ToArray);
+        }
+
+        // We want words with more letters (and among these, words with more "rare" letters) to appear first, to reduce the searching time somewhat.
+        // Applying such a sort, we reduce the total number of triplets to check for anagrams from ~62M to ~29M.
+        // Total number of quadruplets is reduced from 1468M to a mere 311M.
+        // And total number of quintuplets becomes a reasonable 1412M.
+        // Also, it produces the intended results faster (as these are more likely to contain longer words - e.g. "poultry outwits ants" is more likely than "p o u l t r y o u t w i t s a n t s").
+        // This method basically gives us the 1-norm of the vector in the space rescaled so that the target is [1, 1, ..., 1].
+        private static int GetVectorNorm(Vector<byte> vector, Vector<byte> target)
+        {
+            var norm = 0;
+            for (var i = 0; i < Vector<byte>.Count && target[i] != 0; i++)
+            {
+                norm += (LeastCommonMultiple * vector[i]) / target[i];
+            }
+
+            return norm;
+        }
+
+        private static VectorInfo[] FilterVectors(Vector<byte>[] vectors, Vector<byte> target)
+        {
+            return Enumerable.Range(0, vectors.Length)
+                .Where(i => Vector.GreaterThanOrEqualAll(target, vectors[i]))
+                .Select(i => new VectorInfo(vectors[i], GetVectorNorm(vectors[i], target), i))
+                .OrderByDescending(vectorInfo => vectorInfo.Norm)
+                .ToArray();
+        }
+
+        // This method takes most of the time, so everything related to it must be optimized.
+        // In every sequence, the next vector never comes before the previous one in the dictionary (the same vector may repeat).
+        // E.g. if the dictionary is [x, y, z], then the sequence [x, y] can be generated, but [y, x] never will be.
+        // That way, the search cost goes down by a factor of up to MaxVectorsCount! (factorial): if [x, y] does not add up to the target, there is no point in checking [y, x].
+        private IEnumerable<ImmutableStack<int>> GenerateUnorderedSequences(Vector<byte> remainder, int remainderNorm, int allowedRemainingWords, int currentDictionaryPosition)
+        {
+            if (allowedRemainingWords > 1)
+            {
+                var newAllowedRemainingWords = allowedRemainingWords - 1;
+
+                // E.g. if remainder norm is 7, 8 or 9, and allowedRemainingWords is 3,
+                // we need the largest remaining word to have a norm of at least 3
+                var requiredRemainderPerWord = (remainderNorm + allowedRemainingWords - 1) / allowedRemainingWords;
+
+                for (var i = Math.Max(this.NormsIndex[remainderNorm], currentDictionaryPosition); i < this.Dictionary.Length; i++)
+                {
+                    var currentVectorInfo = this.Dictionary[i];
+                    if (currentVectorInfo.Vector == remainder)
+                    {
+                        yield return ImmutableStack.Create(currentVectorInfo.Index);
+                    }
+                    else if (currentVectorInfo.Norm < requiredRemainderPerWord)
+                    {
+                        break;
+                    }
+                    else if (Vector.LessThanOrEqualAll(currentVectorInfo.Vector, remainder))
+                    {
+                        var newRemainder = remainder - currentVectorInfo.Vector;
+                        var newRemainderNorm = remainderNorm - currentVectorInfo.Norm;
+                        foreach (var result in this.GenerateUnorderedSequences(newRemainder, newRemainderNorm, newAllowedRemainingWords, i))
+                        {
+                            yield return result.Push(currentVectorInfo.Index);
+                        }
+                    }
+                }
+            }
+            else
+            {
+                for (var i = Math.Max(this.NormsIndex[remainderNorm], currentDictionaryPosition); i < this.Dictionary.Length; i++)
+                {
+                    var currentVectorInfo = this.Dictionary[i];
+                    if (currentVectorInfo.Vector == remainder)
+                    {
+                        yield return ImmutableStack.Create(currentVectorInfo.Index);
+                    }
+                    else if (currentVectorInfo.Norm < remainderNorm)
+                    {
+                        break;
+                    }
+                }
+            }
+        }
+
+        private struct VectorInfo
+        {
+            public VectorInfo(Vector<byte> vector, int norm, int index)
+            {
+                this.Vector = vector;
+                this.Norm = norm;
+                this.Index = index;
+            }
+
+            public Vector<byte> Vector { get; }
+
+            public int Norm { get; }
+
+            public int Index { get; }
+        }
+    }
+}
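
Note on `VectorsProcessor.cs`: the norm and the "never go backwards in the dictionary" rule can be tried on a toy alphabet. The sketch below is a simplified illustration under stated assumptions, not the code from the diff. It uses plain `byte[]` arrays instead of `Vector<byte>`, it leaves out the `NormsIndex` and `requiredRemainderPerWord` pruning, and `NormSearchSketch`, `Norm`, `Fits` and `Search` are hypothetical names. The constant 840 is the least common multiple of 1..8, so `840 * v / t` is an exact integer for every target count `t` up to `MaxComponentValue`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

internal static class NormSearchSketch
{
    // lcm(1..8) = 840, so the division below never truncates for target counts 1..8.
    private const int LeastCommonMultiple = 840;

    // 1-norm in the space rescaled so that the target becomes [1, 1, ..., 1].
    public static int Norm(byte[] vector, byte[] target)
    {
        var norm = 0;
        for (var i = 0; i < target.Length && target[i] != 0; i++)
        {
            norm += (LeastCommonMultiple * vector[i]) / target[i];
        }

        return norm;
    }

    private static bool Fits(byte[] word, byte[] remainder)
    {
        return word.Zip(remainder, (w, r) => w <= r).All(ok => ok);
    }

    // Yields lists of dictionary positions whose vectors add up to the remainder.
    // Positions never decrease, so each multiset of words is produced once.
    public static IEnumerable<List<int>> Search(byte[][] dictionary, byte[] remainder, int maxWords, int start)
    {
        for (var i = start; i < dictionary.Length; i++)
        {
            var word = dictionary[i];
            if (!Fits(word, remainder))
            {
                continue;
            }

            var rest = remainder.Zip(word, (r, w) => (byte)(r - w)).ToArray();
            if (rest.All(x => x == 0))
            {
                yield return new List<int> { i };
            }
            else if (maxWords > 1)
            {
                foreach (var tail in Search(dictionary, rest, maxWords - 1, i))
                {
                    tail.Insert(0, i);
                    yield return tail;
                }
            }
        }
    }

    public static void Main()
    {
        // Toy alphabet (a, b, c); the target holds the letter counts of "aabc".
        var target = new byte[] { 2, 1, 1 };
        var words = new[]
        {
            new byte[] { 1, 0, 0 }, // "a"
            new byte[] { 0, 1, 0 }, // "b"
            new byte[] { 0, 0, 1 }, // "c"
            new byte[] { 1, 1, 0 }, // "ab"
            new byte[] { 1, 0, 1 }, // "ac"
            new byte[] { 1, 1, 1 }, // "abc"
        };

        // Larger norms first, as FilterVectors does in the diff.
        var dictionary = words.OrderByDescending(w => Norm(w, target)).ToArray();
        foreach (var indexes in Search(dictionary, target, 3, 0))
        {
            Console.WriteLine(string.Join(" ", indexes));
        }
    }
}
```

Because the recursive call restarts at `i` rather than at 0, a combination such as "ab" + "ac" is reported once and its reordering "ac" + "ab" is never visited; word order is restored later, when permutations of each set are generated.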
--- a/dotnet/WhiteRabbit/WhiteRabbit.csproj
+++ b/dotnet/WhiteRabbit/WhiteRabbit.csproj
@ -9,29 +9,33 @@
    <AppDesignerFolder>Properties</AppDesignerFolder>
    <RootNamespace>WhiteRabbit</RootNamespace>
    <AssemblyName>WhiteRabbit</AssemblyName>
-    <TargetFrameworkVersion>v4.6</TargetFrameworkVersion>
+    <TargetFrameworkVersion>v4.7</TargetFrameworkVersion>
    <FileAlignment>512</FileAlignment>
    <AutoGenerateBindingRedirects>true</AutoGenerateBindingRedirects>
+    <AllowUnsafeBlocks>true</AllowUnsafeBlocks>
+    <TargetFrameworkProfile />
  </PropertyGroup>
  <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Debug|AnyCPU' ">
-    <PlatformTarget>AnyCPU</PlatformTarget>
+    <PlatformTarget>x64</PlatformTarget>
    <DebugSymbols>true</DebugSymbols>
    <DebugType>full</DebugType>
    <Optimize>false</Optimize>
    <OutputPath>bin\Debug\</OutputPath>
-    <DefineConstants>DEBUG;TRACE</DefineConstants>
+    <DefineConstants>TRACE;DEBUG</DefineConstants>
    <ErrorReport>prompt</ErrorReport>
    <WarningLevel>4</WarningLevel>
    <DocumentationFile>bin\Debug\WhiteRabbit.XML</DocumentationFile>
+    <Prefer32Bit>false</Prefer32Bit>
  </PropertyGroup>
  <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Release|AnyCPU' ">
-    <PlatformTarget>AnyCPU</PlatformTarget>
+    <PlatformTarget>x64</PlatformTarget>
    <DebugType>pdbonly</DebugType>
    <Optimize>true</Optimize>
    <OutputPath>bin\Release\</OutputPath>
    <DefineConstants>TRACE</DefineConstants>
    <ErrorReport>prompt</ErrorReport>
    <WarningLevel>4</WarningLevel>
+    <Prefer32Bit>false</Prefer32Bit>
  </PropertyGroup>
  <ItemGroup>
    <Reference Include="System" />
@ -39,6 +43,7 @@
      <HintPath>..\packages\System.Collections.Immutable.1.3.1\lib\portable-net45+win8+wp8+wpa81\System.Collections.Immutable.dll</HintPath>
      <Private>True</Private>
    </Reference>
+    <Reference Include="System.Configuration" />
    <Reference Include="System.Core" />
    <Reference Include="System.Numerics" />
    <Reference Include="System.Numerics.Vectors, Version=4.1.2.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a, processorArchitecture=MSIL">
@ -53,6 +58,10 @@
    <Reference Include="System.Xml" />
  </ItemGroup>
  <ItemGroup>
+    <Compile Include="ByteArrayEqualityComparer.cs" />
+    <Compile Include="Constants.cs" />
+    <Compile Include="Flattener.cs" />
+    <Compile Include="PhraseSet.cs" />
    <Compile Include="PrecomputedPermutationsGenerator.cs" />
    <Compile Include="PermutationsGenerator.cs" />
    <Compile Include="StringsProcessor.cs" />
@ -60,11 +69,18 @@
    <Compile Include="Properties\AssemblyInfo.cs" />
    <Compile Include="VectorsProcessor.cs" />
    <Compile Include="VectorsConverter.cs" />
+    <Compile Include="Word.cs" />
  </ItemGroup>
  <ItemGroup>
    <None Include="App.config" />
    <None Include="packages.config" />
  </ItemGroup>
+  <ItemGroup>
+    <ProjectReference Include="..\WhiteRabbit.UnmanagedBridge\WhiteRabbit.UnmanagedBridge.vcxproj">
+      <Project>{039f03a0-7e8f-415d-8180-969d24479b44}</Project>
+      <Name>WhiteRabbit.UnmanagedBridge</Name>
+    </ProjectReference>
+  </ItemGroup>
  <Import Project="$(MSBuildToolsPath)\Microsoft.CSharp.targets" />
  <!-- To modify your build process, add your task inside one of the targets below and uncomment it. 
       Other similar extension points exist, see Microsoft.Common.targets.
--- a/dotnet/WhiteRabbit/Word.cs
+++ b/dotnet/WhiteRabbit/Word.cs
@ -0,0 +1,53 @@
+namespace WhiteRabbit
+{
+    internal unsafe struct Word
+    {
+        public fixed long Buffers[128];
+
+        public unsafe Word(byte[] word)
+        {
+            var tmpWord = new byte[word.Length + 1];
+            tmpWord[word.Length] = (byte)' ';
+            for (var i = 0; i < word.Length; i++)
+            {
+                tmpWord[i] = word[i];
+            }
+
+            fixed (long* buffersPointer = this.Buffers)
+            {
+                for (var i = 0; i < 32; i++)
+                {
+                    var bytePointer = (byte*)(buffersPointer + 4 * i);
+                    var endPointer = bytePointer + 32;
+                    var currentPointer = bytePointer + i;
+                    for (var j = 0; j < tmpWord.Length && currentPointer < endPointer; j++, currentPointer++)
+                    {
+                        *currentPointer = tmpWord[j];
+                    }
+                }
+
+                buffersPointer[127] = tmpWord.Length * 4;
+            }
+        }
+
+        public unsafe byte[] Original
+        {
+            get
+            {
+                fixed (long* buffersPointer = this.Buffers)
+                {
+                    var length = buffersPointer[127] / 4;
+                    var result = new byte[length];
+                    for (var i = 0; i < length; i++)
+                    {
+                        result[i] = ((byte*)buffersPointer)[i];
+                    }
+
+                    return result;
+                }
+            }
+        }
+
+        private static Word Empty { get; } = new Word();
+    }
+}
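
Note on `Word.cs`: the constructor fills 32 slots of 32 bytes each, and slot `i` holds the word plus a trailing space starting at byte offset `i` inside that slot (truncated at the slot boundary). The sketch below reproduces that layout with a managed `byte[]` so it can be inspected without `unsafe` code. It is an illustration only: `WordLayoutSketch` and `BuildSlots` are hypothetical names, the sketch does not store the `length * 4` value that the original writes into the last `long`, and the reading that pre-shifted copies exist so a phrase can be assembled without per-byte shifting is an assumption, not something this diff states.

```csharp
using System;
using System.Text;

internal static class WordLayoutSketch
{
    private const int SlotSize = 32;  // bytes per slot (4 longs in the original)
    private const int SlotCount = 32; // one slot per possible starting offset

    public static byte[] BuildSlots(byte[] word)
    {
        // The word is stored with a trailing space, as in the Word constructor.
        var padded = new byte[word.Length + 1];
        Array.Copy(word, padded, word.Length);
        padded[word.Length] = (byte)' ';

        var slots = new byte[SlotSize * SlotCount];
        for (var shift = 0; shift < SlotCount; shift++)
        {
            var slotStart = shift * SlotSize;

            // Copy the padded word at offset `shift`, stopping at the end of the slot.
            for (var j = 0; j < padded.Length && shift + j < SlotSize; j++)
            {
                slots[slotStart + shift + j] = padded[j];
            }
        }

        return slots;
    }

    public static void Main()
    {
        var slots = BuildSlots(Encoding.ASCII.GetBytes("ants"));

        // Slot 3 contains "ants " starting at byte 3 of that slot.
        Console.WriteLine("[" + Encoding.ASCII.GetString(slots, 3 * SlotSize + 3, 5) + "]");
    }
}
```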
--- a/dotnet/WhiteRabbit/packages.config
+++ b/dotnet/WhiteRabbit/packages.config
Author	SHA1	Message	Date
Inga 🏳‍🌈	e54450c4b6	Code cleanup	8 years ago
Inga 🏳‍🌈	b04d5688ed	Code cleanup	8 years ago
Inga 🏳‍🌈	f2015b3d01	Refactored macros to templates	8 years ago
Inga 🏳‍🌈	a41a57b0e4	Microoptimization	8 years ago
Inga 🏳‍🌈	bbc7761333	Hash checking optimization	8 years ago
Inga 🏳‍🌈	b6afbe9528	Performance measurements updated. Previous value of 81min for 7-word anagrams was a mistype; actual value was 101min.	8 years ago
Inga 🏳‍🌈	f3dbd85b2f	Microoptimization	8 years ago
Inga 🏳‍🌈	332188d3e9	Optimization: reduced number of pinning operations	8 years ago
Inga 🏳‍🌈	5a0026ff80	Refactoring	8 years ago
Inga 🏳‍🌈	5ffaa1090a	Microoptimization	8 years ago
Inga 🏳‍🌈	b667aa8830	Optimized MD5 computation (loop unrolling)	8 years ago
Inga 🏳‍🌈	c919172ac7	microoptimization	8 years ago
Inga 🏳‍🌈	7f6aeb21bf	Updated performance measurements	8 years ago
Inga 🏳‍🌈	759abca0d0	Updated performance measurements	8 years ago
Inga 🏳‍🌈	77d7071a18	Refactoring	8 years ago
Inga 🏳‍🌈	9423f1e34f	Significantly reduced number of allocations	8 years ago
Inga 🏳‍🌈	16bc5f2c98	Optimized memory allocations (MD5 is stored inside a PhraseSet)	8 years ago
Inga 🏳‍🌈	05040b030f	PLINQ optimizations	8 years ago
Inga 🏳‍🌈	efd160cb97	Refactoring	8 years ago
Inga 🏳‍🌈	d13b94c3b6	Optimization	8 years ago
Inga 🏳‍🌈	705baf969c	Optimized initialization, support for 10-word phrases, updated performance measurements	8 years ago
Inga 🏳‍🌈	7aa6469c72	PhraseSet size set back to 16	8 years ago
Inga 🏳‍🌈	ec79a3f41b	Forgotten files	8 years ago
Inga 🏳‍🌈	fd752f88fc	More FillPhraseSet optimizations	8 years ago
Inga 🏳‍🌈	e8544bbd71	AVX2 optimizations, loop unrolling	8 years ago
Inga 🏳‍🌈	7e4c23d467	PhraseSet.FillPhraseSet moved out to unmanaged code	8 years ago
Inga 🏳‍🌈	27a5b13e58	static	8 years ago
Inga 🏳‍🌈	2d1dcc132c	FillPhraseSet optimizations	8 years ago
Inga 🏳‍🌈	bb22805cbc	PhraseSet.FillPhraseSet moved out to unmanaged code	8 years ago
Inga 🏳‍🌈	9866d8ef7f	PhraseSet.FillPhraseSet moved out to unmanaged code	8 years ago
Inga 🏳‍🌈	c5e129ffd9	PhraseSet.FillPhraseSet rewritten to use pointers only	8 years ago
Inga 🏳‍🌈	a154b211a5	PhraseSet filling moved out to separate method	8 years ago
Inga 🏳‍🌈	cbb7ccb59b	Refactored vector-to-words conversion to lower-level code	8 years ago
Inga 🏳‍🌈	0090bce443	NumberOfPhrases moved out to UnmanagedBridge	8 years ago
Inga 🏳‍🌈	54c32d07da	Permutation filters implemented (to avoid duplicate phrases)	8 years ago
Inga 🏳‍🌈	4bd1b36d94	AVX2 fixes	8 years ago
Inga 🏳‍🌈	bb6275672f	Compatibility fix for AVX2 CPUs	8 years ago
Inga 🏳‍🌈	8552a17b21	Microoptimization	8 years ago
Inga 🏳‍🌈	d8ef0310df	Memory usage optimized	8 years ago
Inga 🏳‍🌈	8a3ceaf34c	Retargeted to W10/toolset 141/.NET 4.7; updated performance for dual-core CPU	8 years ago
Inga 🏳‍🌈	fec5b2ebac	8-word anagrams performance	8 years ago
Inga 🏳‍🌈	35c12f649d	PhraseSet initialization optimization	8 years ago
Inga 🏳‍🌈	041983d168	Updated README	8 years ago
Inga 🏳‍🌈	ee98e2e87f	Fixed a mistype	8 years ago
Inga 🏳‍🌈	f642e25bb3	Microoptimization	8 years ago
Inga 🏳‍🌈	55d721ffae	Optimized anagrams count computation	8 years ago
Inga 🏳‍🌈	5584ea843d	Performance fixes	8 years ago
Inga 🏳‍🌈	4179000127	MD5 SIMD optimizations	8 years ago
Inga 🏳‍🌈	836361a66c	Refactoring + SIMD/AVX support	8 years ago
Inga 🏳‍🌈	c60d4cbcaf	md5.cpp refactored	8 years ago
Inga 🏳‍🌈	db2a783501	microoptimization	8 years ago
Inga 🏳‍🌈	bcd6a1d053	Microoptimization: one part of MD5 is enough for search	8 years ago
Inga 🏳‍🌈	4702fba26b	Phrases sent to unmanagedbridge in batches of 8	8 years ago
Inga 🏳‍🌈	c79a41732d	md5.cpp refactored	8 years ago
Inga 🏳‍🌈	fba2d3e10e	Refactored to use phrasesets	8 years ago
Inga 🏳‍🌈	15e2687f31	Some optimization	8 years ago
Inga 🏳‍🌈	d43578de1c	MD5 computation moved out to VC++ project	8 years ago
Inga 🏳‍🌈	f26d9abbbe	Additional performance info	8 years ago
Inga 🏳‍🌈	5c777d49db	Microoptimization: reduced number of allocations	8 years ago
Inga 🏳‍🌈	6b8c2f56b6	Code cleanup	8 years ago
Inga 🏳‍🌈	581572fa4e	Further MD5 optimizations	8 years ago
Inga 🏳‍🌈	268f5ef1ef	Sources moved to dotnet folder	8 years ago
Inga 🏳‍🌈	25779d3e0c	Cosmetic fixes	8 years ago
Inga 🏳‍🌈	9a158edc8b	Optimization; GeneratePermutations is called after flattening	8 years ago
Inga 🏳‍🌈	97d73e54af	Microoptimization + code cleanup	8 years ago
Inga 🏳‍🌈	e021ebbe27	Safety checks	8 years ago
Inga 🏳‍🌈	3429ad83cf	Further unsafe optimizations	8 years ago
Inga 🏳‍🌈	e5c1e743bc	Further MD5 optimizations	8 years ago
Inga 🏳‍🌈	d9c2cad4b6	Optimized MD5 hash computation	8 years ago
Inga 🏳‍🌈	8cefd666fe	More hashes!	8 years ago
Inga 🏳‍🌈	4bc3e45b8d	Code cleanup / fixes	8 years ago
Inga 🏳‍🌈	e2f109d1b9	Challenge parameters moved out to config	8 years ago
Inga 🏳‍🌈	1327814fd1	Implemented all anagrams output in debug mode	8 years ago
Inga 🏳‍🌈	325ae0b314	Another 2x speedup by hardcoding flattening for fixed arrays	8 years ago
Inga 🏳‍🌈	b570a06f2b	Improved debugging	8 years ago
Inga 🏳‍🌈	a3a426f023	Improved performance (dictionary => array)	8 years ago
Inga 🏳‍🌈	1a45eece0f	Code cleanup; implementation notes added	8 years ago
Inga 🏳‍🌈	760c1b5b13	Further optimizations	8 years ago
Inga 🏳‍🌈	91f543aa84	Microoptimization: performance-critical methods are made static	8 years ago
Inga 🏳‍🌈	fc5164fde2	Minor code cleanup + microoptimization + readme update	8 years ago
Inga 🏳‍🌈	b092c19989	As used vector norm is linear, dot product is not needed	8 years ago
Inga 🏳‍🌈	3116f22082	Binary search optimization; memory usage optimization	8 years ago
Inga 🏳‍🌈	c94b6b3eaa	Further optimization	8 years ago
Inga 🏳‍🌈	7296d71187	Microoptimizations	8 years ago
Inga 🏳‍🌈	ccf6f216c3	Additional debug info	8 years ago
Inga 🏳‍🌈	5d2cd465d4	New optimization: there is no point in checking too small vectors	8 years ago
Inga 🏳‍🌈	8210dd27b3	Disabled Prefer32Bit, which prevented SIMD vector optimizations	8 years ago
Inga 🏳‍🌈	937ce45af2	Code cleanup; additional information	8 years ago
Inga 🏳‍🌈	c66ab408ff	Code cleanup	8 years ago
Inga 🏳‍🌈	ae4a3332ce	Added information on 5-word anagrams	8 years ago
Inga 🏳‍🌈	f9151c329d	Switched to Parallel LINQ	8 years ago
Inga 🏳‍🌈	4964fb7673	Code cleanup	8 years ago
Inga 🏳‍🌈	92d995ac79	Switched to BouncyCastle for MD5	8 years ago
Inga 🏳‍🌈	2bb80c719a	Words are now byte arrays instead of strings	8 years ago