Memory usage is minimal (for that kind of task), less than 10MB (25MB for MaxNumberOfWords = 8).
It is also somewhat optimized for likely intended phrases, as anagrams consisting of longer words are generated first.
That's why the given hashes are solved much sooner than the time it takes to check all anagrams.
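To make that ordering concrete, here is a tiny self-contained sketch (the phrases and code are made up for illustration and are not taken from the project): among anagrams of the same source phrase, the ones built from fewer, and therefore longer, words come out first.

```csharp
using System;
using System.Linq;

internal static class OrderingDemo
{
    private static void Main()
    {
        // Made-up anagrams of the same letters ("aabbc"), split into words.
        var anagrams = new[] { "a b c ab", "ab ab c", "ab abc", "a b a b c" };

        // Emitting phrases with fewer (hence longer) words first means the
        // most "human-looking" candidates are checked early in the search.
        foreach (var phrase in anagrams.OrderBy(p => p.Split(' ').Length))
        {
            Console.WriteLine(phrase); // "ab abc" comes first, "a b a b c" last
        }
    }
}
```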
Anagram generation is not parallelized: single-threaded generation is fast enough for 4-word anagrams, and with 5-word (or longer) anagrams there are so many of them that most of the time is spent computing hashes anyway, which keeps the CPU fully loaded.
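Here is a minimal sketch of that split (stand-in generator, made-up phrases; none of this is the project's actual code): candidate phrases come from a single-threaded, lazy generator, while the expensive MD5 hashing is spread across all cores via ParallelEnumerable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

internal static class PipelineSketch
{
    private static void Main()
    {
        // Made-up target; in the real task this is one of the given MD5 hashes.
        var targetHash = Md5Hex("ba c ab");

        var match = GeneratePhrases()   // single-threaded, lazy generation
            .AsParallel()               // hashing is distributed across cores
            .FirstOrDefault(phrase => Md5Hex(phrase) == targetHash);

        Console.WriteLine(match ?? "no match");
    }

    // Stand-in for the real anagram generator.
    private static IEnumerable<string> GeneratePhrases()
    {
        yield return "ab ba c";
        yield return "ba c ab";
        yield return "c ab ba";
    }

    private static string Md5Hex(string phrase)
    {
        using (var md5 = MD5.Create())
        {
            var hash = md5.ComputeHash(Encoding.ASCII.GetBytes(phrase));
            return string.Concat(hash.Select(b => b.ToString("x2")));
        }
    }
}
```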
Multi-threaded performance with RyuJIT (.NET 4.6, 64-bit system) on an i5-6500 is as follows (excluding initialization time of 0.2 seconds), for different values of the maximum number of words allowed in an anagram:
Number of words|Time to check all anagrams with at most that many words|Time to solve "easy" hash|Time to solve "more difficult" hash|Time to solve "hard" hash|Number of anagrams with at most that many words (see note below)
Note that all measurements were done on a Release build; the Debug build is significantly slower.
For comparison, certain other solutions available on GitHub seem to require 3 hours to find all 3-word anagrams. This solution is faster by 6-7 orders of magnitude: it finds and checks all 4-word anagrams in roughly 1/10,000th of the time the other solution needs just to find all 3-word anagrams (without even computing MD5 hashes).
Also, note that anagram counts are inflated for the sake of code simplicity.
E.g. for the phrase "aabbc" and the dictionary [ab, ba, c] there are four possible sets of words adding up to the source phrase: [ab, ab, c], [ab, ba, c], [ba, ab, c], [ba, ba, c].
My implementation treats these as sets of different words and applies all possible permutations to every set, even when that produces the same anagram.
For the example above, my application would produce 24 anagrams (six permutations for each of the four sets), although there are actually only 12 distinct anagrams.
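The arithmetic can be reproduced with a short self-contained snippet (assumed demo code, not part of the project): four word sets times 3! = 6 orderings each gives 24 generated anagrams, of which only 12 strings are distinct.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

internal static class CountingDemo
{
    private static void Main()
    {
        // The four word sets from the "aabbc" example above.
        var wordSets = new[]
        {
            new[] { "ab", "ab", "c" },
            new[] { "ab", "ba", "c" },
            new[] { "ba", "ab", "c" },
            new[] { "ba", "ba", "c" },
        };

        var generated = wordSets.SelectMany(Permutations)
            .Select(words => string.Join(" ", words))
            .ToList();

        Console.WriteLine(generated.Count);              // 24
        Console.WriteLine(generated.Distinct().Count()); // 12
    }

    // Yields every ordering of the given words, including orderings
    // that produce identical phrases (mirroring the inflated counting).
    private static IEnumerable<string[]> Permutations(string[] words)
    {
        if (words.Length <= 1)
        {
            yield return words;
            yield break;
        }

        for (var i = 0; i < words.Length; i++)
        {
            var rest = words.Take(i).Concat(words.Skip(i + 1)).ToArray();
            foreach (var tail in Permutations(rest))
            {
                yield return new[] { words[i] }.Concat(tail).ToArray();
            }
        }
    }
}
```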
Conditional compilation symbols
===============================
* Define `SINGLE_THREADED` to use standard enumerables instead of ParallelEnumerable (useful for profiling); see the sketch after this list for how both symbols might be used.
* Define `DEBUG`, or build in debug mode, to get the total number of anagrams (not optimized, memory-hogging).
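A minimal sketch of how these symbols might be wired up (assumed demo code, not the project's actual source): `SINGLE_THREADED` swaps the ParallelEnumerable pipeline for plain sequential LINQ, and `DEBUG` enables an extra count of the generated anagrams.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

internal static class SymbolsDemo
{
    private static void Main()
    {
        // Made-up candidates; the real ones come from the anagram generator.
        var anagrams = new List<string> { "ab ba c", "ba ab c", "c ab ba" };

#if SINGLE_THREADED
        // Plain sequential LINQ: easier to profile, no thread interleaving.
        var matches = anagrams.Where(IsMatch).ToList();
#else
        // ParallelEnumerable (PLINQ): the matching work is spread across cores.
        var matches = anagrams.AsParallel().Where(IsMatch).ToList();
#endif

#if DEBUG
        // Extra bookkeeping that only debug builds pay for.
        Console.WriteLine("Total anagrams considered: " + anagrams.Count);
#endif

        Console.WriteLine("Matches: " + matches.Count);
    }

    // Stand-in predicate; the real code compares MD5 hashes against the targets.
    private static bool IsMatch(string phrase)
    {
        return phrase.StartsWith("ab");
    }
}
```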