Compare commits

1 commit

  1. README.md (43)
  2. dotnet/TrustPilotChallenge.sln (4)
  3. dotnet/WhiteRabbit.UnmanagedBridge/WhiteRabbit.UnmanagedBridge.cpp (31)
  4. dotnet/WhiteRabbit.UnmanagedBridge/WhiteRabbit.UnmanagedBridge.h (6)
  5. dotnet/WhiteRabbit.UnmanagedBridge/WhiteRabbit.UnmanagedBridge.vcxproj (15)
  6. dotnet/WhiteRabbit.UnmanagedBridge/WhiteRabbit.UnmanagedBridge.vcxproj.filters (9)
  7. dotnet/WhiteRabbit.UnmanagedBridge/constants.h (3)
  8. dotnet/WhiteRabbit.UnmanagedBridge/md5.cpp (443)
  9. dotnet/WhiteRabbit.UnmanagedBridge/md5.h (2)
  10. dotnet/WhiteRabbit.UnmanagedBridge/phraseset.cpp (86)
  11. dotnet/WhiteRabbit.UnmanagedBridge/phraseset.h (3)
  12. dotnet/WhiteRabbit/App.config (4)
  13. dotnet/WhiteRabbit/Constants.cs (6)
  14. dotnet/WhiteRabbit/Flattener.cs (146)
  15. dotnet/WhiteRabbit/MD5Digest.cs (21)
  16. dotnet/WhiteRabbit/PhraseSet.cs (117)
  17. dotnet/WhiteRabbit/PrecomputedPermutationsGenerator.cs (132)
  18. dotnet/WhiteRabbit/Program.cs (73)
  19. dotnet/WhiteRabbit/StringsProcessor.cs (93)
  20. dotnet/WhiteRabbit/VectorsProcessor.cs (7)
  21. dotnet/WhiteRabbit/WhiteRabbit.csproj (5)
  22. dotnet/WhiteRabbit/Word.cs (53)

@ -34,34 +34,49 @@ WhiteRabbit.exe < wordlist
Performance Performance
=========== ===========
Memory usage is minimal (for that kind of task), less than 10MB (25MB for MaxNumberOfWords = 8). Memory usage is minimal (for that kind of task), less than 10MB.
It is also somewhat optimized for likely intended phrases, as anagrams consisting of longer words are generated first. It is also somewhat optimized for likely intended phrases, as anagrams consisting of longer words are generated first.
That's why the given hashes are solved much sooner than it takes to check all anagrams. That's why the given hashes are solved much sooner than it takes to check all anagrams.
Anagrams generation is not parallelized, as even single-threaded performance for 4-word anagrams is high enough; and 5-word (or larger) anagrams are frequent enough for most of the time being spent on computing hashes, with full CPU load. Anagrams generation is not parallelized, as even single-threaded performance for 4-word anagrams is high enough; and 5-word (or larger) anagrams are frequent enough for most of the time being spent on computing hashes, with full CPU load.
Multi-threaded performance with RyuJIT (.NET 4.6, 64-bit system) on i5-6500 is as follows (excluding initialization time of 0.2 seconds), for different maximum allowed words in an anagram: Multi-threaded performance with RyuJIT (.NET 4.6, 64-bit system) on quad-core Sandy Bridge @2.8GHz (without AVX2 support) is as follows (excluding initialization time of 0.2 seconds), for different maximum allowed words in an anagram:
Number of words|Time to check all anagrams no longer than that|Time to solve "easy" hash|Time to solve "more difficult" hash|Time to solve "hard" hash|Number of unique anagrams no longer than that Number of words|Time to check all anagrams no longer than that|Time to solve "easy" hash|Time to solve "more difficult" hash|Time to solve "hard" hash|Number of anagrams no longer than that (see note below)
---------------|----------------------------------------------|-------------------------|-----------------------------------|-------------------------|--------------------------------------------- ---------------|----------------------------------------------|-------------------------|-----------------------------------|-------------------------|-------------------------------------------------------
3|0.04s||||4560 3|Fractions of a second||||4560
4|0.45s|||0.08s|7,431,984 4|0.6s|||0.1s|7,433,016
5|9.6s|0.15s|0.06s|0.27s|1,347,437,484 5|60s|||1.5s|1,348,876,896
6|4.5 minutes|0.85s|0.17s|2.05s|58,405,904,844 6|45 minutes|||21s|58,837,302,096
7|83 minutes|4.7s|0.6s|13.3s|1,070,307,744,114 7|10 hours (?)|1.5 minutes|8s|4.5 minutes|1,108,328,708,976
8|14 hours|17.6s|1.8s|55s|10,893,594,396,594 8|||||12,089,249,231,856
9||45s|4s|2.5 minutes|70,596,864,409,954 9|||||88,977,349,731,696
10||80s|5.8s|4.8 minutes|314,972,701,475,754 10|||||482,627,715,786,096
11|||||2,030,917,440,675,696
12|||||6,813,402,098,518,896
13|||||18,437,325,782,691,696
14|||||40,367,286,468,925,296
15|||||71,561,858,517,565,296
16|||||103,280,807,987,773,296
17|||||123,910,678,817,341,296
18|||||130,313,052,523,069,296
Note that all measurements were done on a Release build; Debug build is significantly slower. Note that all measurements were done on a Release build; Debug build is significantly slower.
For comparison, certain other solutions available on GitHub seem to require 3 hours to find all 3-word anagrams. This solution is faster by 6-7 orders of magnitude (it finds and checks all 4-word anagrams in 1/10000th fraction of time required for other solution just to find all 3-word anagrams, with no MD5 calculations). For comparison, certain other solutions available on GitHub seem to require 3 hours to find all 3-word anagrams. This solution is faster by 6-7 orders of magnitude (it finds and checks all 4-word anagrams in 1/10000th fraction of time required for other solution just to find all 3-word anagrams, with no MD5 calculations).
Also, note that anagram counts are inflated for the sake of code simplicity.
E.g. for the phrase "aabbc" and the dictionary [ab, ba, c] there are four possible sets of words adding up to the source phrase: [ab, ab, c], [ab, ba, c], [ba, ab, c], [ba, ba, c].
My implementation regards these sets as sets of different words, and applies all possible permutations to every set, even when a permutation yields an anagram that has already been produced.
For the example above, my application would produce 24 anagrams (six permutations for each of the four sets), although there are actually only 12 different anagrams.
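The arithmetic in this note can be checked with a small standalone sketch. It is not code from WhiteRabbit: `AnagramCounts`, `countAnagrams` and `readmeExample` are hypothetical names, and the sketch assumes every word set has exactly three words, as in the example.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

using WordSet = std::array<std::string, 3>;

// Hypothetical helper type, not part of WhiteRabbit.
struct AnagramCounts {
    std::size_t generated;  // every permutation of every word set
    std::size_t distinct;   // unique word sequences among them
};

// Permutes every word set blindly and counts both totals.
AnagramCounts countAnagrams(const std::vector<WordSet>& wordSets) {
    assert(!wordSets.empty());
    AnagramCounts counts{0, 0};
    std::set<std::string> seen;
    for (const WordSet& words : wordSets) {
        // Permute positions rather than words, so equal words still get swapped.
        std::array<int, 3> order = {0, 1, 2};
        do {
            ++counts.generated;
            seen.insert(words[order[0]] + " " + words[order[1]] + " " + words[order[2]]);
        } while (std::next_permutation(order.begin(), order.end()));
    }
    counts.distinct = seen.size();
    return counts;
}

// The four word sets listed in the note for the phrase "aabbc".
std::vector<WordSet> readmeExample() {
    return {WordSet{{"ab", "ab", "c"}}, WordSet{{"ab", "ba", "c"}},
            WordSet{{"ba", "ab", "c"}}, WordSet{{"ba", "ba", "c"}}};
}
```

Calling `countAnagrams(readmeExample())` gives `generated == 24` and `distinct == 12`, matching the figures in the note.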
Conditional compilation symbols Conditional compilation symbols
=============================== ===============================
* Define `DEBUG`, or build in debug mode, to get the total number of anagrams (not optimized). * Define `SINGLE_THREADED` to use standard enumerables instead of ParallelEnumerable (useful for profiling).
* Define `DEBUG`, or build in debug mode, to get the total number of anagrams (not optimized, memory-hogging).
Implementation notes Implementation notes
==================== ====================

@ -1,7 +1,7 @@
 
Microsoft Visual Studio Solution File, Format Version 12.00 Microsoft Visual Studio Solution File, Format Version 12.00
# Visual Studio 15 # Visual Studio 14
VisualStudioVersion = 15.0.26403.3 VisualStudioVersion = 14.0.25420.1
MinimumVisualStudioVersion = 10.0.40219.1 MinimumVisualStudioVersion = 10.0.40219.1
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "WhiteRabbit", "WhiteRabbit\WhiteRabbit.csproj", "{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}" Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "WhiteRabbit", "WhiteRabbit\WhiteRabbit.csproj", "{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}"
EndProject EndProject

@ -4,29 +4,22 @@
#include "WhiteRabbit.UnmanagedBridge.h" #include "WhiteRabbit.UnmanagedBridge.h"
#include "md5.h" #include "md5.h"
#include "phraseset.h"
void WhiteRabbitUnmanagedBridge::MD5Unmanaged::ComputeMD5(unsigned __int32 * input, unsigned __int32 * expected) void WhiteRabbitUnmanagedBridge::MD5Unmanaged::ComputeMD5(unsigned __int32 * input, unsigned __int32 * output)
{ {
#if AVX2 #if AVX2
md5(input + 0 * 8 * 8, expected); md5(input + 0 * 8 * 8, output + 0 * 8);
#elif SIMD #elif SIMD
md5(input + 0 * 8 * 4); md5(input + 0 * 8 * 4, output + 0 * 4);
md5(input + 1 * 8 * 4); md5(input + 1 * 8 * 4, output + 1 * 4);
if (input[2 * 8 * 4] != 0)
{
md5(input + 2 * 8 * 4);
md5(input + 3 * 8 * 4);
}
#else #else
for (int i = 0; i < 16; i++) md5(input + 0 * 8, output + 0);
{ md5(input + 1 * 8, output + 1);
md5(input + i * 8); md5(input + 2 * 8, output + 2);
} md5(input + 3 * 8, output + 3);
md5(input + 4 * 8, output + 4);
md5(input + 5 * 8, output + 5);
md5(input + 6 * 8, output + 6);
md5(input + 7 * 8, output + 7);
#endif #endif
} }
void WhiteRabbitUnmanagedBridge::MD5Unmanaged::FillPhraseSet(unsigned __int64* initialBufferPointer, unsigned __int64* bufferPointer, unsigned __int64* allWordsPointer, __int32* wordIndexes, unsigned __int64* permutationsPointer, int numberOfWords)
{
fillPhraseSet(initialBufferPointer, bufferPointer, allWordsPointer, wordIndexes, permutationsPointer, numberOfWords);
}
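
Note on the pointer arithmetic in `ComputeMD5`: the `* 8` factor suggests that each phrase occupies 8 consecutive 32-bit words, and that each `md5` call consumes a batch of 1, 4 or 8 phrases depending on the build. A scalar sketch of that indexing follows, under that assumption. `kWordsPerPhrase`, `batchStart` and `gatherWord` are hypothetical names, not identifiers from this repository.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption read off the diff: 8 uint32 words per phrase.
constexpr std::size_t kWordsPerPhrase = 8;

// Index of the first word of batch `batch` when `lanes` phrases are hashed
// per call (mirrors `input + batch * 8 * lanes`).
std::size_t batchStart(std::size_t batch, std::size_t lanes) {
    return batch * kWordsPerPhrase * lanes;
}

// Scalar stand-in for what CREATE_VECTOR_FROM_INPUT appears to do: collect
// word number `offset` from each of `lanes` consecutive phrases.
std::vector<std::uint32_t> gatherWord(const std::vector<std::uint32_t>& input,
                                      std::size_t start,
                                      std::size_t offset,
                                      std::size_t lanes) {
    assert(offset < kWordsPerPhrase);
    assert(start + lanes * kWordsPerPhrase <= input.size());
    std::vector<std::uint32_t> lane(lanes);
    for (std::size_t i = 0; i < lanes; ++i) {
        lane[i] = input[start + offset + i * kWordsPerPhrase];
    }
    return lane;
}
```

With 4 lanes, the second batch starts at index 32, so `gatherWord(input, batchStart(1, 4), 3, 4)` reads word 3 of phrases 4 through 7.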

@ -2,8 +2,6 @@
#pragma once #pragma once
#include "constants.h"
using namespace System; using namespace System;
namespace WhiteRabbitUnmanagedBridge { namespace WhiteRabbitUnmanagedBridge {
@ -11,8 +9,6 @@ namespace WhiteRabbitUnmanagedBridge {
public ref class MD5Unmanaged public ref class MD5Unmanaged
{ {
public: public:
literal int PhrasesPerSet = PHRASES_PER_SET; static void ComputeMD5(unsigned int* input, unsigned int* output);
static void ComputeMD5(unsigned int* input, unsigned __int32 * expected);
static void FillPhraseSet(unsigned __int64* initialBufferPointer, unsigned __int64* bufferPointer, unsigned __int64* allWordsPointer, __int32* wordIndexes, unsigned __int64* permutationsPointer, int numberOfWords);
}; };
} }

@ -20,37 +20,37 @@
</ItemGroup> </ItemGroup>
<PropertyGroup Label="Globals"> <PropertyGroup Label="Globals">
<ProjectGuid>{039F03A0-7E8F-415D-8180-969D24479B44}</ProjectGuid> <ProjectGuid>{039F03A0-7E8F-415D-8180-969D24479B44}</ProjectGuid>
<TargetFrameworkVersion>v4.7</TargetFrameworkVersion> <TargetFrameworkVersion>v4.5</TargetFrameworkVersion>
<Keyword>ManagedCProj</Keyword> <Keyword>ManagedCProj</Keyword>
<RootNamespace>WhiteRabbitUnmanagedBridge</RootNamespace> <RootNamespace>WhiteRabbitUnmanagedBridge</RootNamespace>
<WindowsTargetPlatformVersion>10.0.10586.0</WindowsTargetPlatformVersion> <WindowsTargetPlatformVersion>8.1</WindowsTargetPlatformVersion>
</PropertyGroup> </PropertyGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.Default.props" /> <Import Project="$(VCTargetsPath)\Microsoft.Cpp.Default.props" />
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'" Label="Configuration"> <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'" Label="Configuration">
<ConfigurationType>DynamicLibrary</ConfigurationType> <ConfigurationType>DynamicLibrary</ConfigurationType>
<UseDebugLibraries>true</UseDebugLibraries> <UseDebugLibraries>true</UseDebugLibraries>
<PlatformToolset>v141</PlatformToolset> <PlatformToolset>v140</PlatformToolset>
<CLRSupport>true</CLRSupport> <CLRSupport>true</CLRSupport>
<CharacterSet>Unicode</CharacterSet> <CharacterSet>Unicode</CharacterSet>
</PropertyGroup> </PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'" Label="Configuration"> <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'" Label="Configuration">
<ConfigurationType>DynamicLibrary</ConfigurationType> <ConfigurationType>DynamicLibrary</ConfigurationType>
<UseDebugLibraries>false</UseDebugLibraries> <UseDebugLibraries>false</UseDebugLibraries>
<PlatformToolset>v141</PlatformToolset> <PlatformToolset>v140</PlatformToolset>
<CLRSupport>true</CLRSupport> <CLRSupport>true</CLRSupport>
<CharacterSet>Unicode</CharacterSet> <CharacterSet>Unicode</CharacterSet>
</PropertyGroup> </PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'" Label="Configuration"> <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'" Label="Configuration">
<ConfigurationType>DynamicLibrary</ConfigurationType> <ConfigurationType>DynamicLibrary</ConfigurationType>
<UseDebugLibraries>true</UseDebugLibraries> <UseDebugLibraries>true</UseDebugLibraries>
<PlatformToolset>v141</PlatformToolset> <PlatformToolset>v140</PlatformToolset>
<CLRSupport>true</CLRSupport> <CLRSupport>true</CLRSupport>
<CharacterSet>Unicode</CharacterSet> <CharacterSet>Unicode</CharacterSet>
</PropertyGroup> </PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'" Label="Configuration"> <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'" Label="Configuration">
<ConfigurationType>DynamicLibrary</ConfigurationType> <ConfigurationType>DynamicLibrary</ConfigurationType>
<UseDebugLibraries>false</UseDebugLibraries> <UseDebugLibraries>false</UseDebugLibraries>
<PlatformToolset>v141</PlatformToolset> <PlatformToolset>v140</PlatformToolset>
<CLRSupport>true</CLRSupport> <CLRSupport>true</CLRSupport>
<CharacterSet>Unicode</CharacterSet> <CharacterSet>Unicode</CharacterSet>
</PropertyGroup> </PropertyGroup>
@ -133,9 +133,7 @@
</Link> </Link>
</ItemDefinitionGroup> </ItemDefinitionGroup>
<ItemGroup> <ItemGroup>
<ClInclude Include="constants.h" />
<ClInclude Include="md5.h" /> <ClInclude Include="md5.h" />
<ClInclude Include="phraseset.h" />
<ClInclude Include="resource.h" /> <ClInclude Include="resource.h" />
<ClInclude Include="Stdafx.h" /> <ClInclude Include="Stdafx.h" />
<ClInclude Include="WhiteRabbit.UnmanagedBridge.h" /> <ClInclude Include="WhiteRabbit.UnmanagedBridge.h" />
@ -143,7 +141,6 @@
<ItemGroup> <ItemGroup>
<ClCompile Include="AssemblyInfo.cpp" /> <ClCompile Include="AssemblyInfo.cpp" />
<ClCompile Include="md5.cpp" /> <ClCompile Include="md5.cpp" />
<ClCompile Include="phraseset.cpp" />
<ClCompile Include="Stdafx.cpp"> <ClCompile Include="Stdafx.cpp">
<PrecompiledHeader Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">Create</PrecompiledHeader> <PrecompiledHeader Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">Create</PrecompiledHeader>
<PrecompiledHeader Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">Create</PrecompiledHeader> <PrecompiledHeader Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">Create</PrecompiledHeader>

@ -23,12 +23,6 @@
<ClInclude Include="md5.h"> <ClInclude Include="md5.h">
<Filter>Header Files</Filter> <Filter>Header Files</Filter>
</ClInclude> </ClInclude>
<ClInclude Include="constants.h">
<Filter>Header Files</Filter>
</ClInclude>
<ClInclude Include="phraseset.h">
<Filter>Header Files</Filter>
</ClInclude>
</ItemGroup> </ItemGroup>
<ItemGroup> <ItemGroup>
<ClCompile Include="AssemblyInfo.cpp"> <ClCompile Include="AssemblyInfo.cpp">
@ -43,8 +37,5 @@
<ClCompile Include="md5.cpp"> <ClCompile Include="md5.cpp">
<Filter>Source Files</Filter> <Filter>Source Files</Filter>
</ClCompile> </ClCompile>
<ClCompile Include="phraseset.cpp">
<Filter>Source Files</Filter>
</ClCompile>
</ItemGroup> </ItemGroup>
</Project> </Project>

@ -1,3 +0,0 @@
#pragma once
#define PHRASES_PER_SET 16

@ -5,155 +5,186 @@
#pragma unmanaged #pragma unmanaged
struct MD5Vector #if AVX2
typedef __m256i MD5Vector;
#define OP_XOR(a, b) _mm256_xor_si256(a, b)
#define OP_AND(a, b) _mm256_and_si256(a, b)
#define OP_ANDNOT(a, b) _mm256_andnot_si256(a, b)
#define OP_OR(a, b) _mm256_or_si256(a, b)
#define OP_ADD(a, b) _mm256_add_epi32(a, b)
#define OP_ROT(a, r) OP_OR(_mm256_slli_epi32(a, r), _mm256_srli_epi32(a, 32 - (r)))
#define OP_BLEND(a, b, x) OP_OR(OP_AND(x, b), OP_ANDNOT(x, a))
#define CREATE_VECTOR(a) _mm256_set1_epi32(a)
#define CREATE_VECTOR_FROM_INPUT(input, offset) _mm256_set_epi32( \
input[offset + 0 * 8], \
input[offset + 1 * 8], \
input[offset + 2 * 8], \
input[offset + 3 * 8], \
input[offset + 4 * 8], \
input[offset + 5 * 8], \
input[offset + 6 * 8], \
input[offset + 7 * 8])
#define WRITE_TO_OUTPUT(a, output) \
((unsigned __int64*)output)[0] = a.m256i_u64[0]; \
((unsigned __int64*)output)[1] = a.m256i_u64[1]; \
((unsigned __int64*)output)[2] = a.m256i_u64[2]; \
((unsigned __int64*)output)[3] = a.m256i_u64[3];
#elif SIMD
typedef __m128i MD5Vector;
#define OP_XOR(a, b) _mm_xor_si128(a, b)
#define OP_AND(a, b) _mm_and_si128(a, b)
#define OP_ANDNOT(a, b) _mm_andnot_si128(a, b)
#define OP_OR(a, b) _mm_or_si128(a, b)
#define OP_ADD(a, b) _mm_add_epi32(a, b)
#define OP_ROT(a, r) OP_OR(_mm_slli_epi32(a, r), _mm_srli_epi32(a, 32 - (r)))
#define OP_BLEND(a, b, x) OP_OR(OP_AND(x, b), OP_ANDNOT(x, a))
//#define OP_BLEND(a, b, x) OP_XOR(a, OP_AND(x, OP_XOR(b, a)))
#define CREATE_VECTOR(a) _mm_set1_epi32(a)
#define CREATE_VECTOR_FROM_INPUT(input, offset) _mm_set_epi32( \
input[offset + 3 * 8], \
input[offset + 2 * 8], \
input[offset + 1 * 8], \
input[offset + 0 * 8])
#define WRITE_TO_OUTPUT(a, output) \
((unsigned __int64*)output)[0] = a.m128i_u64[0]; \
((unsigned __int64*)output)[1] = a.m128i_u64[1];
#else
typedef unsigned int MD5Vector;
#define OP_XOR(a, b) (a) ^ (b)
#define OP_AND(a, b) (a) & (b)
#define OP_ANDNOT(a, b) ~(a) & (b)
#define OP_OR(a, b) (a) | (b)
#define OP_ADD(a, b) (a) + (b)
#define OP_ROT(a, r) _rotl(a, r)
#define OP_BLEND(a, b, x) ((x) & (b)) | (~(x) & (a))
#define CREATE_VECTOR(a) a
#define CREATE_VECTOR_FROM_INPUT(input, offset) (input[offset])
#define WRITE_TO_OUTPUT(a, output) \
output[0] = a;
#endif
#define OP_NEG(a) OP_ANDNOT(a, CREATE_VECTOR(0xffffffff))
typedef struct {
unsigned int K[64];
unsigned int Init[4];
} MD5Parameters;
static const MD5Parameters Parameters = {
{ {
__m256i m_V0; 0xd76aa478,
__m256i m_V1; 0xe8c7b756,
__forceinline MD5Vector() {} 0x242070db,
__forceinline MD5Vector(__m256i C0, __m256i C1) :m_V0(C0), m_V1(C1) {} 0xc1bdceee,
0xf57c0faf,
__forceinline MD5Vector MXor(MD5Vector R) const 0x4787c62a,
{ 0xa8304613,
return MD5Vector(_mm256_xor_si256(m_V0, R.m_V0), _mm256_xor_si256(m_V1, R.m_V1)); 0xfd469501,
} 0x698098d8,
0x8b44f7af,
__forceinline MD5Vector MAnd(MD5Vector R) const 0xffff5bb1,
{ 0x895cd7be,
return MD5Vector(_mm256_and_si256(m_V0, R.m_V0), _mm256_and_si256(m_V1, R.m_V1)); 0x6b901122,
} 0xfd987193,
0xa679438e,
__forceinline MD5Vector MAndNot(MD5Vector R) const 0x49b40821,
{ 0xf61e2562,
return MD5Vector(_mm256_andnot_si256(m_V0, R.m_V0), _mm256_andnot_si256(m_V1, R.m_V1)); 0xc040b340,
} 0x265e5a51,
0xe9b6c7aa,
__forceinline const MD5Vector MOr(const MD5Vector R) const 0xd62f105d,
{ 0x02441453,
return MD5Vector(_mm256_or_si256(m_V0, R.m_V0), _mm256_or_si256(m_V1, R.m_V1)); 0xd8a1e681,
} 0xe7d3fbc8,
0x21e1cde6,
__forceinline const MD5Vector MAdd(const MD5Vector R) const 0xc33707d6,
{ 0xf4d50d87,
return MD5Vector(_mm256_add_epi32(m_V0, R.m_V0), _mm256_add_epi32(m_V1, R.m_V1)); 0x455a14ed,
} 0xa9e3e905,
0xfcefa3f8,
__forceinline const MD5Vector MShiftLeft(const int shift) const 0x676f02d9,
{ 0x8d2a4c8a,
return MD5Vector(_mm256_slli_epi32(m_V0, shift), _mm256_slli_epi32(m_V1, shift)); 0xfffa3942,
} 0x8771f681,
0x6d9d6122,
__forceinline const MD5Vector MShiftRight(const int shift) const 0xfde5380c,
{ 0xa4beea44,
return MD5Vector(_mm256_srli_epi32(m_V0, shift), _mm256_srli_epi32(m_V1, shift)); 0x4bdecfa9,
} 0xf6bb4b60,
0xbebfbc70,
template<int imm8> 0x289b7ec6,
__forceinline const MD5Vector Permute() const 0xeaa127fa,
{ 0xd4ef3085,
return MD5Vector(_mm256_permute4x64_epi64(m_V0, imm8), _mm256_permute4x64_epi64(m_V1, imm8)); 0x04881d05,
} 0xd9d4d039,
0xe6db99e5,
__forceinline const MD5Vector CompareEquality32(const __m256i other) const 0x1fa27cf8,
0xc4ac5665,
0xf4292244,
0x432aff97,
0xab9423a7,
0xfc93a039,
0x655b59c3,
0x8f0ccc92,
0xffeff47d,
0x85845dd1,
0x6fa87e4f,
0xfe2ce6e0,
0xa3014314,
0x4e0811a1,
0xf7537e82,
0xbd3af235,
0x2ad7d2bb,
0xeb86d391,
},
{ {
return MD5Vector(_mm256_cmpeq_epi32(m_V0, other), _mm256_cmpeq_epi32(m_V1, other)); 0x67452301,
} 0xefcdab89,
0x98badcfe,
__forceinline void WriteMoveMask8(__int32 * output) const 0x10325476,
{ },
output[0] = _mm256_movemask_epi8(m_V0);
output[1] = _mm256_movemask_epi8(m_V1);
}
}; };
__forceinline const MD5Vector OP_XOR(const MD5Vector a, const MD5Vector b) { return a.MXor(b); } #define Blend(a, b, x) OP_BLEND(a, b, x)
__forceinline const MD5Vector OP_AND(const MD5Vector a, const MD5Vector b) { return a.MAnd(b); } #define Xor(a, b, c) OP_XOR(a, OP_XOR(b, c))
__forceinline const MD5Vector OP_ANDNOT(const MD5Vector a, const MD5Vector b) { return a.MAndNot(b); } #define I(a, b, c) OP_XOR(a, OP_OR(b, OP_NEG(c)))
__forceinline const MD5Vector OP_OR(const MD5Vector a, const MD5Vector b) { return a.MOr(b); }
__forceinline const MD5Vector OP_ADD(const MD5Vector a, const MD5Vector b) { return a.MAdd(b); }
template<int r>
__forceinline const MD5Vector OP_ROT(const MD5Vector a) { return OP_OR(a.MShiftLeft(r), a.MShiftRight(32 - (r))); }
__forceinline const MD5Vector OP_BLEND(const MD5Vector a, const MD5Vector b, const MD5Vector x) { return OP_OR(OP_AND(x, b), OP_ANDNOT(x, a)); }
__forceinline const MD5Vector CREATE_VECTOR(const int a) { return MD5Vector(_mm256_set1_epi32(a), _mm256_set1_epi32(a)); }
__forceinline const MD5Vector CREATE_VECTOR_FROM_INPUT(const unsigned __int32* input, const size_t offset)
{
return MD5Vector(
_mm256_i32gather_epi32((int*)(input + offset), _mm256_set_epi32(7 * 8, 6 * 8, 5 * 8, 4 * 8, 3 * 8, 2 * 8, 1 * 8, 0 * 8), 4),
_mm256_i32gather_epi32((int*)(input + offset), _mm256_set_epi32(15 * 8, 14 * 8, 13 * 8, 12 * 8, 11 * 8, 10 * 8, 9 * 8, 8 * 8), 4));
}
#define WRITE_TO_OUTPUT(a, output, expected) \
a.Permute<0 * 0x55>().CompareEquality32(*expected).WriteMoveMask8(output); \
a.Permute<1 * 0x55>().CompareEquality32(*expected).WriteMoveMask8(output + 2); \
a.Permute<2 * 0x55>().CompareEquality32(*expected).WriteMoveMask8(output + 4); \
a.Permute<3 * 0x55>().CompareEquality32(*expected).WriteMoveMask8(output + 6); \
output[8] = _mm256_movemask_epi8(_mm256_cmpeq_epi8(*((__m256i*)output), _mm256_setzero_si256()));
__forceinline void WriteToOutput(const MD5Vector a, __int32 * output, __m256i * expected)
{
a.Permute<0 * 0x55>().CompareEquality32(*expected).WriteMoveMask8(output);
a.Permute<1 * 0x55>().CompareEquality32(*expected).WriteMoveMask8(output);
a.Permute<2 * 0x55>().CompareEquality32(*expected).WriteMoveMask8(output);
a.Permute<3 * 0x55>().CompareEquality32(*expected).WriteMoveMask8(output);
output[8] = _mm256_movemask_epi8(_mm256_cmpeq_epi8(*((__m256i*)output), _mm256_setzero_si256()));
}
const MD5Vector Ones = CREATE_VECTOR(0xffffffff);
__forceinline const MD5Vector OP_NEG(const MD5Vector a) { return OP_ANDNOT(a, Ones); }
__forceinline const MD5Vector Blend(const MD5Vector a, const MD5Vector b, const MD5Vector x) { return OP_BLEND(a, b, x); }
__forceinline const MD5Vector Xor(const MD5Vector a, const MD5Vector b, const MD5Vector c) { return OP_XOR(a, OP_XOR(b, c)); }
__forceinline const MD5Vector I(const MD5Vector a, const MD5Vector b, const MD5Vector c) { return OP_XOR(a, OP_OR(b, OP_NEG(c))); }
template<int r> #define StepOuter(r, a, b, x) \
__forceinline const MD5Vector StepOuter(const MD5Vector a, const MD5Vector b, const MD5Vector x) { return OP_ADD(b, OP_ROT<r>(x)); } a = x; \
a = OP_ADD(b, OP_ROT(a, r));
template<int r, unsigned __int32 k> #define Step1(r, a, b, c, d, k, w) StepOuter(r, a, b, OP_ADD(Blend(d, c, b), OP_ADD(CREATE_VECTOR(k), OP_ADD(a, w))))
__forceinline const MD5Vector Step1(const MD5Vector a, const MD5Vector b, const MD5Vector c, const MD5Vector d, const MD5Vector w) { #define Step1E(r, a, b, c, d, k) StepOuter(r, a, b, OP_ADD(Blend(d, c, b), OP_ADD(CREATE_VECTOR(k), a)))
return StepOuter<r>(a, b, OP_ADD(Blend(d, c, b), OP_ADD(CREATE_VECTOR(k), OP_ADD(a, w))));
}
template<int r, unsigned __int32 k>
__forceinline const MD5Vector Step1(const MD5Vector a, const MD5Vector b, const MD5Vector c, const MD5Vector d) {
return StepOuter<r>(a, b, OP_ADD(Blend(d, c, b), OP_ADD(CREATE_VECTOR(k), a)));
}
template<int r, unsigned __int32 k>
__forceinline const MD5Vector Step2(const MD5Vector a, const MD5Vector b, const MD5Vector c, const MD5Vector d, const MD5Vector w) {
return StepOuter<r>(a, c, OP_ADD(Blend(d, c, b), OP_ADD(CREATE_VECTOR(k), OP_ADD(a, w))));
}
template<int r, unsigned __int32 k>
__forceinline const MD5Vector Step2(const MD5Vector a, const MD5Vector b, const MD5Vector c, const MD5Vector d) {
return StepOuter<r>(a, c, OP_ADD(Blend(d, c, b), OP_ADD(CREATE_VECTOR(k), a)));
}
template<int r, unsigned __int32 k>
__forceinline const MD5Vector Step3(const MD5Vector a, const MD5Vector b, const MD5Vector c, const MD5Vector d, const MD5Vector w) {
return StepOuter<r>(a, b, OP_ADD(Xor(b, c, d), OP_ADD(CREATE_VECTOR(k), OP_ADD(a, w))));
}
template<int r, unsigned __int32 k> #define Step2(r, a, b, c, d, k, w) StepOuter(r, a, c, OP_ADD(Blend(d, c, b), OP_ADD(CREATE_VECTOR(k), OP_ADD(a, w))))
__forceinline const MD5Vector Step3(const MD5Vector a, const MD5Vector b, const MD5Vector c, const MD5Vector d) { #define Step2E(r, a, b, c, d, k) StepOuter(r, a, c, OP_ADD(Blend(d, c, b), OP_ADD(CREATE_VECTOR(k), a)))
return StepOuter<r>(a, b, OP_ADD(Xor(b, c, d), OP_ADD(CREATE_VECTOR(k), a)));
}
template<int r, unsigned __int32 k> #define Step3(r, a, b, c, d, k, w) StepOuter(r, a, b, OP_ADD(Xor(b, c, d), OP_ADD(CREATE_VECTOR(k), OP_ADD(a, w))))
__forceinline const MD5Vector Step4(const MD5Vector a, const MD5Vector b, const MD5Vector c, const MD5Vector d, const MD5Vector w) { #define Step3E(r, a, b, c, d, k) StepOuter(r, a, b, OP_ADD(Xor(b, c, d), OP_ADD(CREATE_VECTOR(k), a)))
return StepOuter<r>(a, b, OP_ADD(I(c, b, d), OP_ADD(CREATE_VECTOR(k), OP_ADD(a, w))));
}
template<int r, unsigned __int32 k> #define Step4(r, a, b, c, d, k, w) StepOuter(r, a, b, OP_ADD(I(c, b, d), OP_ADD(CREATE_VECTOR(k), OP_ADD(a, w))))
__forceinline const MD5Vector Step4(const MD5Vector a, const MD5Vector b, const MD5Vector c, const MD5Vector d) { #define Step4E(r, a, b, c, d, k) StepOuter(r, a, b, OP_ADD(I(c, b, d), OP_ADD(CREATE_VECTOR(k), a)))
return StepOuter<r>(a, b, OP_ADD(I(c, b, d), OP_ADD(CREATE_VECTOR(k), a)));
}
void md5(unsigned __int32 * input, unsigned __int32 * expected) void md5(unsigned __int32 * input, unsigned __int32 * output)
{ {
MD5Vector a = CREATE_VECTOR(0x67452301); MD5Vector a = CREATE_VECTOR(Parameters.Init[0]);
MD5Vector b = CREATE_VECTOR(0xefcdab89); MD5Vector b = CREATE_VECTOR(Parameters.Init[1]);
MD5Vector c = CREATE_VECTOR(0x98badcfe); MD5Vector c = CREATE_VECTOR(Parameters.Init[2]);
MD5Vector d = CREATE_VECTOR(0x10325476); MD5Vector d = CREATE_VECTOR(Parameters.Init[3]);
MD5Vector inputVector0 = CREATE_VECTOR_FROM_INPUT(input, 0); MD5Vector inputVector0 = CREATE_VECTOR_FROM_INPUT(input, 0);
MD5Vector inputVector1 = CREATE_VECTOR_FROM_INPUT(input, 1); MD5Vector inputVector1 = CREATE_VECTOR_FROM_INPUT(input, 1);
@ -164,73 +195,73 @@ void md5(unsigned __int32 * input, unsigned __int32 * expected)
MD5Vector inputVector6 = CREATE_VECTOR_FROM_INPUT(input, 6); MD5Vector inputVector6 = CREATE_VECTOR_FROM_INPUT(input, 6);
MD5Vector inputVector7 = CREATE_VECTOR_FROM_INPUT(input, 7); MD5Vector inputVector7 = CREATE_VECTOR_FROM_INPUT(input, 7);
a = Step1< 7, 0xd76aa478>(a, b, c, d, inputVector0); a = Step1 ( 7, a, b, c, d, Parameters.K[ 0], inputVector0);
d = Step1<12, 0xe8c7b756>(d, a, b, c, inputVector1); d = Step1 (12, d, a, b, c, Parameters.K[ 1], inputVector1);
c = Step1<17, 0x242070db>(c, d, a, b, inputVector2); c = Step1 (17, c, d, a, b, Parameters.K[ 2], inputVector2);
b = Step1<22, 0xc1bdceee>(b, c, d, a, inputVector3); b = Step1 (22, b, c, d, a, Parameters.K[ 3], inputVector3);
a = Step1< 7, 0xf57c0faf>(a, b, c, d, inputVector4); a = Step1 ( 7, a, b, c, d, Parameters.K[ 4], inputVector4);
d = Step1<12, 0x4787c62a>(d, a, b, c, inputVector5); d = Step1 (12, d, a, b, c, Parameters.K[ 5], inputVector5);
c = Step1<17, 0xa8304613>(c, d, a, b, inputVector6); c = Step1 (17, c, d, a, b, Parameters.K[ 6], inputVector6);
b = Step1<22, 0xfd469501>(b, c, d, a); b = Step1E(22, b, c, d, a, Parameters.K[ 7]);
a = Step1< 7, 0x698098d8>(a, b, c, d); a = Step1E( 7, a, b, c, d, Parameters.K[ 8]);
d = Step1<12, 0x8b44f7af>(d, a, b, c); d = Step1E(12, d, a, b, c, Parameters.K[ 9]);
c = Step1<17, 0xffff5bb1>(c, d, a, b); c = Step1E(17, c, d, a, b, Parameters.K[10]);
b = Step1<22, 0x895cd7be>(b, c, d, a); b = Step1E(22, b, c, d, a, Parameters.K[11]);
a = Step1< 7, 0x6b901122>(a, b, c, d); a = Step1E( 7, a, b, c, d, Parameters.K[12]);
d = Step1<12, 0xfd987193>(d, a, b, c); d = Step1E(12, d, a, b, c, Parameters.K[13]);
c = Step1<17, 0xa679438e>(c, d, a, b, inputVector7); c = Step1 (17, c, d, a, b, Parameters.K[14], inputVector7);
b = Step1<22, 0x49b40821>(b, c, d, a); b = Step1E(22, b, c, d, a, Parameters.K[15]);
a = Step2< 5, 0xf61e2562>(a, d, b, c, inputVector1); a = Step2 ( 5, a, d, b, c, Parameters.K[16], inputVector1);
d = Step2< 9, 0xc040b340>(d, c, a, b, inputVector6); d = Step2 ( 9, d, c, a, b, Parameters.K[17], inputVector6);
c = Step2<14, 0x265e5a51>(c, b, d, a); c = Step2E(14, c, b, d, a, Parameters.K[18]);
b = Step2<20, 0xe9b6c7aa>(b, a, c, d, inputVector0); b = Step2 (20, b, a, c, d, Parameters.K[19], inputVector0);
a = Step2< 5, 0xd62f105d>(a, d, b, c, inputVector5); a = Step2 ( 5, a, d, b, c, Parameters.K[20], inputVector5);
d = Step2< 9, 0x02441453>(d, c, a, b); d = Step2E( 9, d, c, a, b, Parameters.K[21]);
c = Step2<14, 0xd8a1e681>(c, b, d, a); c = Step2E(14, c, b, d, a, Parameters.K[22]);
b = Step2<20, 0xe7d3fbc8>(b, a, c, d, inputVector4); b = Step2 (20, b, a, c, d, Parameters.K[23], inputVector4);
a = Step2< 5, 0x21e1cde6>(a, d, b, c); a = Step2E( 5, a, d, b, c, Parameters.K[24]);
d = Step2< 9, 0xc33707d6>(d, c, a, b, inputVector7); d = Step2 ( 9, d, c, a, b, Parameters.K[25], inputVector7);
c = Step2<14, 0xf4d50d87>(c, b, d, a, inputVector3); c = Step2 (14, c, b, d, a, Parameters.K[26], inputVector3);
b = Step2<20, 0x455a14ed>(b, a, c, d); b = Step2E(20, b, a, c, d, Parameters.K[27]);
a = Step2< 5, 0xa9e3e905>(a, d, b, c); a = Step2E( 5, a, d, b, c, Parameters.K[28]);
d = Step2< 9, 0xfcefa3f8>(d, c, a, b, inputVector2); d = Step2 ( 9, d, c, a, b, Parameters.K[29], inputVector2);
c = Step2<14, 0x676f02d9>(c, b, d, a); c = Step2E(14, c, b, d, a, Parameters.K[30]);
b = Step2<20, 0x8d2a4c8a>(b, a, c, d); b = Step2E(20, b, a, c, d, Parameters.K[31]);
a = Step3< 4, 0xfffa3942>(a, b, c, d, inputVector5); a = Step3 ( 4, a, b, c, d, Parameters.K[32], inputVector5);
d = Step3<11, 0x8771f681>(d, a, b, c); d = Step3E(11, d, a, b, c, Parameters.K[33]);
c = Step3<16, 0x6d9d6122>(c, d, a, b); c = Step3E(16, c, d, a, b, Parameters.K[34]);
b = Step3<23, 0xfde5380c>(b, c, d, a, inputVector7); b = Step3 (23, b, c, d, a, Parameters.K[35], inputVector7);
a = Step3< 4, 0xa4beea44>(a, b, c, d, inputVector1); a = Step3 ( 4, a, b, c, d, Parameters.K[36], inputVector1);
d = Step3<11, 0x4bdecfa9>(d, a, b, c, inputVector4); d = Step3 (11, d, a, b, c, Parameters.K[37], inputVector4);
c = Step3<16, 0xf6bb4b60>(c, d, a, b); c = Step3E(16, c, d, a, b, Parameters.K[38]);
b = Step3<23, 0xbebfbc70>(b, c, d, a); b = Step3E(23, b, c, d, a, Parameters.K[39]);
a = Step3< 4, 0x289b7ec6>(a, b, c, d); a = Step3E( 4, a, b, c, d, Parameters.K[40]);
d = Step3<11, 0xeaa127fa>(d, a, b, c, inputVector0); d = Step3 (11, d, a, b, c, Parameters.K[41], inputVector0);
c = Step3<16, 0xd4ef3085>(c, d, a, b, inputVector3); c = Step3 (16, c, d, a, b, Parameters.K[42], inputVector3);
b = Step3<23, 0x04881d05>(b, c, d, a, inputVector6); b = Step3 (23, b, c, d, a, Parameters.K[43], inputVector6);
a = Step3< 4, 0xd9d4d039>(a, b, c, d); a = Step3E( 4, a, b, c, d, Parameters.K[44]);
d = Step3<11, 0xe6db99e5>(d, a, b, c); d = Step3E(11, d, a, b, c, Parameters.K[45]);
c = Step3<16, 0x1fa27cf8>(c, d, a, b); c = Step3E(16, c, d, a, b, Parameters.K[46]);
b = Step3<23, 0xc4ac5665>(b, c, d, a, inputVector2); b = Step3 (23, b, c, d, a, Parameters.K[47], inputVector2);
a = Step4< 6, 0xf4292244>(a, b, c, d, inputVector0); a = Step4 ( 6, a, b, c, d, Parameters.K[48], inputVector0);
d = Step4<10, 0x432aff97>(d, a, b, c); d = Step4E(10, d, a, b, c, Parameters.K[49]);
c = Step4<15, 0xab9423a7>(c, d, a, b, inputVector7); c = Step4 (15, c, d, a, b, Parameters.K[50], inputVector7);
b = Step4<21, 0xfc93a039>(b, c, d, a, inputVector5); b = Step4 (21, b, c, d, a, Parameters.K[51], inputVector5);
a = Step4< 6, 0x655b59c3>(a, b, c, d); a = Step4E( 6, a, b, c, d, Parameters.K[52]);
d = Step4<10, 0x8f0ccc92>(d, a, b, c, inputVector3); d = Step4 (10, d, a, b, c, Parameters.K[53], inputVector3);
c = Step4<15, 0xffeff47d>(c, d, a, b); c = Step4E(15, c, d, a, b, Parameters.K[54]);
b = Step4<21, 0x85845dd1>(b, c, d, a, inputVector1); b = Step4 (21, b, c, d, a, Parameters.K[55], inputVector1);
a = Step4< 6, 0x6fa87e4f>(a, b, c, d); a = Step4E( 6, a, b, c, d, Parameters.K[56]);
d = Step4<10, 0xfe2ce6e0>(d, a, b, c); d = Step4E(10, d, a, b, c, Parameters.K[57]);
c = Step4<15, 0xa3014314>(c, d, a, b, inputVector6); c = Step4 (15, c, d, a, b, Parameters.K[58], inputVector6);
b = Step4<21, 0x4e0811a1>(b, c, d, a); b = Step4E(21, b, c, d, a, Parameters.K[59]);
a = Step4< 6, 0xf7537e82>(a, b, c, d, inputVector4); a = Step4 ( 6, a, b, c, d, Parameters.K[60], inputVector4);
a = OP_ADD(CREATE_VECTOR(0x67452301), a); a = OP_ADD(CREATE_VECTOR(Parameters.Init[0]), a);
WRITE_TO_OUTPUT(a, ((__int32*)input), ((__m256i*)expected)); WRITE_TO_OUTPUT(a, output);
} }
#pragma managed #pragma managed

@ -1,3 +1,3 @@
#pragma once #pragma once
void md5(unsigned int* input, unsigned __int32 * expected); void md5(unsigned int* input, unsigned int* output);

@ -1,86 +0,0 @@
#include "stdafx.h"
#include "phraseset.h"
#include "constants.h"
#include "intrin.h"
#pragma unmanaged
template<int numberOfWords>
class Processor
{
public:
template<int wordNumber>
static __forceinline const __m256i ProcessWord(const __m256i phrase, const unsigned __int64 cumulativeWordOffset, const unsigned __int64 permutation, unsigned __int64* allWordsPointer, __int32* wordIndexes)
{
auto currentWord = allWordsPointer + wordIndexes[_bextr_u64(permutation, 4 * wordNumber, 4)] * 128;
return ProcessWord<wordNumber + 1>(
_mm256_xor_si256(phrase, *(__m256i*)(currentWord + cumulativeWordOffset)),
cumulativeWordOffset + currentWord[127],
permutation,
allWordsPointer,
wordIndexes);
}
template<>
static __forceinline const __m256i ProcessWord<numberOfWords>(const __m256i phrase, const unsigned __int64 cumulativeWordOffset, const unsigned __int64 permutation, unsigned __int64* allWordsPointer, __int32* wordIndexes)
{
return phrase;
}
template<int phraseNumber>
static __forceinline void ProcessWordsForPhrase(__m256i* avx2initialBuffer, __m256i* avx2buffer, unsigned __int64* allWordsPointer, __int32* wordIndexes, unsigned __int64* permutationsPointer)
{
avx2buffer[phraseNumber] = ProcessWord<0>(*avx2initialBuffer, 0, permutationsPointer[phraseNumber], allWordsPointer, wordIndexes);
ProcessWordsForPhrase<phraseNumber + 1>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
}
template<>
static __forceinline void ProcessWordsForPhrase<PHRASES_PER_SET>(__m256i* avx2initialBuffer, __m256i* avx2buffer, unsigned __int64* allWordsPointer, __int32* wordIndexes, unsigned __int64* permutationsPointer)
{
return;
}
};
void fillPhraseSet(unsigned __int64* initialBufferPointer, unsigned __int64* bufferPointer, unsigned __int64* allWordsPointer, __int32* wordIndexes, unsigned __int64* permutationsPointer, int numberOfWords)
{
auto avx2initialBuffer = (__m256i*)initialBufferPointer;
auto avx2buffer = (__m256i*)bufferPointer;
switch (numberOfWords)
{
case 1:
Processor<1>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
break;
case 2:
Processor<2>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
break;
case 3:
Processor<3>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
break;
case 4:
Processor<4>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
break;
case 5:
Processor<5>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
break;
case 6:
Processor<6>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
break;
case 7:
Processor<7>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
break;
case 8:
Processor<8>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
break;
case 9:
Processor<9>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
break;
case 10:
Processor<10>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
break;
}
}
#pragma managed
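
Review note on the removed `phraseset.cpp`: `ProcessWord` reads the word index for each position with `_bextr_u64(permutation, 4 * wordNumber, 4)`, which implies that a permutation is stored as one 4-bit nibble per position inside a 64-bit integer (the same layout that `FormatPermutation` builds on the C# side). A minimal portable sketch of that packing follows; the names `packPermutation` and `extractNibble` are hypothetical and not part of the repository, and a plain shift and mask is used in place of the BMI1 intrinsic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: packs a permutation of up to 16 elements,
// storing element i in bits [4*i, 4*i + 3] of the result.
std::uint64_t packPermutation(const std::vector<int>& permutation)
{
    assert(permutation.size() <= 16);
    std::uint64_t packed = 0;
    for (std::size_t i = 0; i < permutation.size(); ++i)
    {
        packed |= static_cast<std::uint64_t>(permutation[i] & 0xF) << (4 * i);
    }
    return packed;
}

// Hypothetical helper: portable equivalent of
// _bextr_u64(packed, 4 * position, 4) for position in 0..15.
int extractNibble(std::uint64_t packed, int position)
{
    return static_cast<int>((packed >> (4 * position)) & 0xF);
}
```

With this layout, the permutation `{2, 0, 1}` packs to `0x102`, and `extractNibble(0x102, 2)` yields `1`.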

@ -1,3 +0,0 @@
#pragma once
void fillPhraseSet(unsigned __int64* initialBufferPointer, unsigned __int64* bufferPointer, unsigned __int64* allWordsPointer, __int32* wordIndexes, unsigned __int64* permutationsPointer, int numberOfWords);

@ -1,7 +1,7 @@
<?xml version="1.0" encoding="utf-8"?> <?xml version="1.0" encoding="utf-8" ?>
<configuration> <configuration>
<startup> <startup>
<supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.7"/> <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.6" />
</startup> </startup>
<appSettings> <appSettings>
<add key="SourcePhrase" value="poultry outwits ants" /> <add key="SourcePhrase" value="poultry outwits ants" />

@ -2,10 +2,6 @@
{ {
internal class Constants internal class Constants
{ {
public const int PhrasesPerSet = WhiteRabbitUnmanagedBridge.MD5Unmanaged.PhrasesPerSet; public const int PhrasesPerSet = 8;
public const int MaxNumberOfWords = 8;
public const int NumberOfThreads = 4;
} }
} }

@ -1,6 +1,5 @@
namespace WhiteRabbit namespace WhiteRabbit
{ {
using System;
using System.Collections.Generic; using System.Collections.Generic;
using System.Collections.Immutable; using System.Collections.Immutable;
using System.Linq; using System.Linq;
@ -10,11 +9,28 @@
/// </summary> /// </summary>
internal static class Flattener internal static class Flattener
{ {
// Slow universal implementation
private static IEnumerable<ImmutableStack<T>> FlattenAny<T>(ImmutableStack<T[]> phrase)
{
if (phrase.IsEmpty)
{
return new[] { ImmutableStack.Create<T>() };
}
T[] wordVariants;
var newStack = phrase.Pop(out wordVariants);
return FlattenAny(newStack).SelectMany(remainder => wordVariants.Select(word => remainder.Push(word)));
}
// Fast hard-coded implementation for 3 words
private static IEnumerable<T[]> Flatten3<T>(T[][] phrase) private static IEnumerable<T[]> Flatten3<T>(T[][] phrase)
{ {
foreach (var item0 in phrase[0]) foreach (var item0 in phrase[0])
{
foreach (var item1 in phrase[1]) foreach (var item1 in phrase[1])
{
foreach (var item2 in phrase[2]) foreach (var item2 in phrase[2])
{
yield return new T[] yield return new T[]
{ {
item0, item0,
@ -22,13 +38,21 @@
item2, item2,
}; };
} }
}
}
}
// Fast hard-coded implementation for 4 words
private static IEnumerable<T[]> Flatten4<T>(T[][] phrase) private static IEnumerable<T[]> Flatten4<T>(T[][] phrase)
{ {
foreach (var item0 in phrase[0]) foreach (var item0 in phrase[0])
{
foreach (var item1 in phrase[1]) foreach (var item1 in phrase[1])
{
foreach (var item2 in phrase[2]) foreach (var item2 in phrase[2])
{
foreach (var item3 in phrase[3]) foreach (var item3 in phrase[3])
{
yield return new T[] yield return new T[]
{ {
item0, item0,
@ -37,14 +61,24 @@
item3, item3,
}; };
} }
}
}
}
}
// Fast hard-coded implementation for 5 words
private static IEnumerable<T[]> Flatten5<T>(T[][] phrase) private static IEnumerable<T[]> Flatten5<T>(T[][] phrase)
{ {
foreach (var item0 in phrase[0]) foreach (var item0 in phrase[0])
{
foreach (var item1 in phrase[1]) foreach (var item1 in phrase[1])
{
foreach (var item2 in phrase[2]) foreach (var item2 in phrase[2])
{
foreach (var item3 in phrase[3]) foreach (var item3 in phrase[3])
{
foreach (var item4 in phrase[4]) foreach (var item4 in phrase[4])
{
yield return new T[] yield return new T[]
{ {
item0, item0,
@ -54,15 +88,27 @@
item4, item4,
}; };
} }
}
}
}
}
}
// Fast hard-coded implementation for 6 words
private static IEnumerable<T[]> Flatten6<T>(T[][] phrase) private static IEnumerable<T[]> Flatten6<T>(T[][] phrase)
{ {
foreach (var item0 in phrase[0]) foreach (var item0 in phrase[0])
{
foreach (var item1 in phrase[1]) foreach (var item1 in phrase[1])
{
foreach (var item2 in phrase[2]) foreach (var item2 in phrase[2])
{
foreach (var item3 in phrase[3]) foreach (var item3 in phrase[3])
{
foreach (var item4 in phrase[4]) foreach (var item4 in phrase[4])
{
foreach (var item5 in phrase[5]) foreach (var item5 in phrase[5])
{
yield return new T[] yield return new T[]
{ {
item0, item0,
@ -73,88 +119,30 @@
item5, item5,
}; };
} }
}
}
}
}
}
}
// Fast hard-coded implementation for 7 words
private static IEnumerable<T[]> Flatten7<T>(T[][] phrase) private static IEnumerable<T[]> Flatten7<T>(T[][] phrase)
{ {
foreach (var item0 in phrase[0]) foreach (var item0 in phrase[0])
foreach (var item1 in phrase[1])
foreach (var item2 in phrase[2])
foreach (var item3 in phrase[3])
foreach (var item4 in phrase[4])
foreach (var item5 in phrase[5])
foreach (var item6 in phrase[6])
yield return new T[]
{
item0,
item1,
item2,
item3,
item4,
item5,
item6,
};
}
private static IEnumerable<T[]> Flatten8<T>(T[][] phrase)
{ {
foreach (var item0 in phrase[0])
foreach (var item1 in phrase[1]) foreach (var item1 in phrase[1])
foreach (var item2 in phrase[2])
foreach (var item3 in phrase[3])
foreach (var item4 in phrase[4])
foreach (var item5 in phrase[5])
foreach (var item6 in phrase[6])
foreach (var item7 in phrase[7])
yield return new T[]
{
item0,
item1,
item2,
item3,
item4,
item5,
item6,
item7,
};
}
private static IEnumerable<T[]> Flatten9<T>(T[][] phrase)
{ {
foreach (var item0 in phrase[0])
foreach (var item1 in phrase[1])
foreach (var item2 in phrase[2]) foreach (var item2 in phrase[2])
foreach (var item3 in phrase[3])
foreach (var item4 in phrase[4])
foreach (var item5 in phrase[5])
foreach (var item6 in phrase[6])
foreach (var item7 in phrase[7])
foreach (var item8 in phrase[8])
yield return new T[]
{
item0,
item1,
item2,
item3,
item4,
item5,
item6,
item7,
item8,
};
}
private static IEnumerable<T[]> Flatten10<T>(T[][] phrase)
{ {
foreach (var item0 in phrase[0])
foreach (var item1 in phrase[1])
foreach (var item2 in phrase[2])
foreach (var item3 in phrase[3]) foreach (var item3 in phrase[3])
{
foreach (var item4 in phrase[4]) foreach (var item4 in phrase[4])
{
foreach (var item5 in phrase[5]) foreach (var item5 in phrase[5])
{
foreach (var item6 in phrase[6]) foreach (var item6 in phrase[6])
foreach (var item7 in phrase[7]) {
foreach (var item8 in phrase[8])
foreach (var item9 in phrase[9])
yield return new T[] yield return new T[]
{ {
item0, item0,
@ -164,11 +152,15 @@
item4, item4,
item5, item5,
item6, item6,
item7,
item8,
item9,
}; };
} }
}
}
}
}
}
}
}
public static IEnumerable<T[]> Flatten<T>(T[][] wordVariants) public static IEnumerable<T[]> Flatten<T>(T[][] wordVariants)
{ {
@ -184,14 +176,8 @@
return Flatten6(wordVariants); return Flatten6(wordVariants);
case 7: case 7:
return Flatten7(wordVariants); return Flatten7(wordVariants);
case 8:
return Flatten8(wordVariants);
case 9:
return Flatten9(wordVariants);
case 10:
return Flatten10(wordVariants);
default: default:
throw new ArgumentOutOfRangeException(nameof(wordVariants)); return FlattenAny(ImmutableStack.Create(wordVariants)).Select(words => words.ToArray());
} }
} }
} }
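
Review note on `Flattener.cs`: every `FlattenN` method enumerates the Cartesian product of the word variants with the last slot varying fastest, and `FlattenAny` is the generic fallback on one side of this diff. The same enumeration can be written once for any number of slots with an odometer-style index vector. The sketch below is an illustration only: the name `flatten` and the eager `std::vector` result are assumptions, whereas the C# code yields results lazily.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical generic Cartesian product: picks one item from every slot,
// with the last slot changing fastest (same order as the nested foreach loops).
template <typename T>
std::vector<std::vector<T>> flatten(const std::vector<std::vector<T>>& variants)
{
    std::vector<std::vector<T>> result;
    for (const auto& slot : variants)
    {
        if (slot.empty())
        {
            return result; // an empty slot means there are no combinations
        }
    }

    std::vector<std::size_t> indexes(variants.size(), 0);
    while (true)
    {
        std::vector<T> combination;
        combination.reserve(variants.size());
        for (std::size_t i = 0; i < variants.size(); ++i)
        {
            combination.push_back(variants[i][indexes[i]]);
        }
        result.push_back(std::move(combination));

        // Advance the odometer, carrying from the last slot towards the first.
        std::size_t slot = variants.size();
        while (slot > 0)
        {
            --slot;
            if (++indexes[slot] < variants[slot].size())
            {
                break;
            }
            indexes[slot] = 0;
            if (slot == 0)
            {
                return result; // the first slot wrapped around: done
            }
        }
        if (variants.empty())
        {
            return result; // zero slots produce exactly one empty combination
        }
    }
}
```

The hard-coded variants in the diff avoid the per-item index bookkeeping, which is presumably why they are kept for the common word counts.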

@ -0,0 +1,21 @@
namespace WhiteRabbit
{
using System.Runtime.CompilerServices;
using WhiteRabbitUnmanagedBridge;
internal static class MD5Digest
{
// It only returns first component of MD5 hash
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static unsafe uint[] Compute(PhraseSet input)
{
var result = new uint[Constants.PhrasesPerSet];
fixed (uint* resultPointer = result)
{
MD5Unmanaged.ComputeMD5(input.Buffer, resultPointer);
}
return result;
}
}
}

@ -1,107 +1,53 @@
namespace WhiteRabbit namespace WhiteRabbit
{ {
using System;
using System.Diagnostics;
using System.Linq;
using System.Numerics;
using System.Runtime.CompilerServices;
using WhiteRabbitUnmanagedBridge;
// Anagram representation optimized for MD5 // Anagram representation optimized for MD5
internal struct PhraseSet internal unsafe struct PhraseSet
{ {
private uint[] Buffer; public fixed uint Buffer[8 * Constants.PhrasesPerSet];
public void Init() public PhraseSet(byte[][] words, int[][] permutations, int offset, int numberOfCharacters)
{
this.Buffer = new uint[8 * Constants.PhrasesPerSet];
}
public unsafe void FillLength(int numberOfCharacters, int numberOfWords)
{ {
fixed (uint* bufferPointer = this.Buffer) fixed (uint* bufferPointer = this.Buffer)
{ {
var length = (uint)(numberOfCharacters + numberOfWords - 1); var length = numberOfCharacters + words.Length - 1;
var lengthInBits = (uint)(length << 3);
for (var i = 0; i < Constants.PhrasesPerSet; i++) for (var i = 0; i < Constants.PhrasesPerSet; i++)
{ {
bufferPointer[7 + i * 8] = lengthInBits; var permutation = permutations[offset + i];
((byte*)bufferPointer)[length + i * 32] = 128 ^ ' '; var startPointer = bufferPointer + i * 8;
} byte[] currentWord;
} var j = 0;
} var wordIndex = 0;
var currentPointer = (byte*)startPointer;
byte* lastPointer = currentPointer + length;
public unsafe void ProcessPermutations(PhraseSet initialPhraseSet, Word[] allWords, int[] wordIndexes, ulong[] permutations, uint[] expectedHashesVector, Action<byte[], uint> action) for (; currentPointer < lastPointer; currentPointer++)
{
fixed (uint* bufferPointer = this.Buffer, initialBufferPointer = initialPhraseSet.Buffer)
{ {
fixed (ulong* permutationsPointer = permutations) currentWord = words[permutation[wordIndex]];
{ *currentPointer = currentWord[j];
fixed (int* wordIndexesPointer = wordIndexes)
{
fixed (Word* allWordsPointer = allWords)
{
fixed (uint* expectedHashesPointer = expectedHashesVector)
{
for (var i = 0; i < permutations.Length; i += Constants.PhrasesPerSet)
{
MD5Unmanaged.FillPhraseSet(
(ulong*)initialBufferPointer,
(ulong*)bufferPointer,
(ulong*)allWordsPointer,
wordIndexesPointer,
permutationsPointer + i,
wordIndexes.Length);
MD5Unmanaged.ComputeMD5(bufferPointer, expectedHashesPointer); j++;
if (bufferPointer[Constants.PhrasesPerSet / 2] != 0xFFFFFFFF) // 0xffffffff if greater than or equal, 0 if less than
{ var comparisonResult = unchecked((int)((((uint)j - (uint)currentWord.Length) >> 31) - 1));
for (var j = 0; j < Constants.PhrasesPerSet; j++) j = (comparisonResult & 0) | (~comparisonResult & j);
{ wordIndex = (comparisonResult & (wordIndex + 1)) | (~comparisonResult & wordIndex);
// 16 matches are packed in 8 32-bit numbers: [0,1], [8,9], [2,3], [10,11], [4, 5], [12, 13], [6, 7], [14, 15]
var position = ((j / 2) % 4) * 2 + (j / 8);
var match = (bufferPointer[position] >> (4 * (j % 2))) & 0xF0F0F0F;
if (match != 0)
{
var bufferInfo = ((ulong)bufferPointer[Constants.PhrasesPerSet] << 32) | bufferPointer[j];
MD5Unmanaged.FillPhraseSet(
(ulong*)initialBufferPointer,
(ulong*)bufferPointer,
(ulong*)allWordsPointer,
wordIndexesPointer,
permutationsPointer + i,
wordIndexes.Length);
action(this.GetBytes(j), match);
break;
}
}
}
}
}
}
} }
*currentPointer = 128;
startPointer[7] = (uint)(length << 3);
} }
} }
} }
public unsafe byte[] GetBytes(int number) public byte[] GetBytes(int number)
{ {
Debug.Assert(number < Constants.PhrasesPerSet); System.Diagnostics.Debug.Assert(number < Constants.PhrasesPerSet);
fixed(uint* bufferPointer = this.Buffer) fixed(uint* bufferPointer = this.Buffer)
{ {
var phrasePointer = bufferPointer + 8 * number; var phrasePointer = bufferPointer + 8 * number;
var length = 0; var length = phrasePointer[7] >> 3;
for (var i = 27; i >= 0; i--)
{
if (((byte*)phrasePointer)[i] == 128)
{
length = i;
break;
}
}
var result = new byte[length]; var result = new byte[length];
for (var i = 0; i < length; i++) for (var i = 0; i < length; i++)
{ {
@ -111,16 +57,5 @@
return result; return result;
} }
} }
public unsafe string DebugBytes(int number)
{
Debug.Assert(number < Constants.PhrasesPerSet);
fixed (uint* bufferPointer = this.Buffer)
{
var bytes = (byte*)bufferPointer;
return string.Concat(Enumerable.Range(32 * number, 32).Select(i => bytes[i].ToString("X2")));
}
}
} }
} }
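
Review note on `PhraseSet.cs`: the constructor on one side of this diff advances `j` and `wordIndex` without branching. It builds a mask that is `0xffffffff` when `j >= currentWord.Length` and `0` otherwise, then blends the two candidate values with it. A small sketch of the same trick is shown below; `greaterOrEqualMask`, `blend`, `Cursor` and `advance` are hypothetical names, and both operands are assumed to stay below 2^31 so that the sign bit of the difference is meaningful.

```cpp
#include <cassert>
#include <cstdint>

// All ones when value >= limit, zero otherwise.
// Assumes value and limit are both below 2^31.
std::uint32_t greaterOrEqualMask(std::uint32_t value, std::uint32_t limit)
{
    // (value - limit) wraps around when value < limit, which sets bit 31.
    return ((value - limit) >> 31) - 1u;
}

// Returns ifSet where mask bits are 1 and ifClear where they are 0.
std::uint32_t blend(std::uint32_t mask, std::uint32_t ifSet, std::uint32_t ifClear)
{
    return (mask & ifSet) | (~mask & ifClear);
}

struct Cursor
{
    std::uint32_t wordIndex; // which word of the phrase is being copied
    std::uint32_t charIndex; // position inside that word
};

// Moves one character forward; switches to the next word at the word's end.
Cursor advance(Cursor cursor, std::uint32_t wordLength)
{
    const std::uint32_t next = cursor.charIndex + 1;
    const std::uint32_t mask = greaterOrEqualMask(next, wordLength);
    return { blend(mask, cursor.wordIndex + 1, cursor.wordIndex),
             blend(mask, 0u, next) };
}
```

For a three-byte word, `advance({0, 1}, 3)` stays in word 0 at position 2, while `advance({0, 2}, 3)` moves to word 1 at position 0.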

@ -1,137 +1,29 @@
namespace WhiteRabbit namespace WhiteRabbit
{ {
using System;
using System.Collections.Generic; using System.Collections.Generic;
using System.Linq; using System.Linq;
internal static class PrecomputedPermutationsGenerator internal static class PrecomputedPermutationsGenerator
{ {
static PrecomputedPermutationsGenerator() private static int[][][] Permutations { get; } = Enumerable.Range(0, 8).Select(GeneratePermutations).ToArray();
{
Permutations = new ulong[Constants.MaxNumberOfWords + 1][][];
PermutationsNumbers = new long[Constants.MaxNumberOfWords + 1][];
for (var i = 0; i <= Constants.MaxNumberOfWords; i++)
{
var permutationsInfo = GeneratePermutations(i);
Permutations[i] = permutationsInfo.Item1;
PermutationsNumbers[i] = permutationsInfo.Item2;
}
}
private static ulong[][][] Permutations { get; } private static long[] PermutationsNumbers { get; } = GeneratePermutationsNumbers().Take(19).ToArray();
private static long[][] PermutationsNumbers { get; } public static int[][] HamiltonianPermutations(int n) => Permutations[n];
public static ulong[] HamiltonianPermutations(int n, uint filter) => Permutations[n][filter]; public static long GetPermutationsNumber(int n) => PermutationsNumbers[n];
public static long GetPermutationsNumber(int n, uint filter) => PermutationsNumbers[n][filter]; private static int[][] GeneratePermutations(int n)
private static Tuple<ulong[][], long[]> GeneratePermutations(int n)
{
if (n == 0)
{
return Tuple.Create(new ulong[0][], new long[0]);
}
var allPermutations = PermutationsGenerator.HamiltonianPermutations(n)
.Select(FormatPermutation)
.ToArray();
var statesCount = (uint)1 << (n - 1);
var resultUnpadded = new PermutationInfo[statesCount][];
resultUnpadded[0] = allPermutations;
for (uint i = 1; i < statesCount; i++)
{
var mask = i;
mask |= mask >> 1;
mask |= mask >> 2;
mask |= mask >> 4;
mask |= mask >> 8;
mask |= mask >> 16;
mask = mask >> 1;
var existing = i & mask;
var seniorBit = i ^ existing;
var position = 0;
while (seniorBit != 0)
{ {
seniorBit = seniorBit >> 1; var result = PermutationsGenerator.HamiltonianPermutations(n)
position++; .Select(permutation => permutation.PermutationData)
}
resultUnpadded[i] = resultUnpadded[existing]
.Where(info => ((info.PermutationInverse >> (4 * (position - 1))) % 16 < (info.PermutationInverse >> (4 * position)) % 16))
.ToArray(); .ToArray();
} if (result.Length % Constants.PhrasesPerSet == 0)
var result = new ulong[statesCount][];
var numbers = new long[statesCount];
for (uint i = 0; i < statesCount; i++)
{ {
result[i] = PadToWholeChunks(resultUnpadded[i], Constants.PhrasesPerSet);
numbers[i] = resultUnpadded[i].LongLength;
}
return Tuple.Create(result, numbers);
}
public static bool IsOrderPreserved(ulong permutation, uint position)
{
var currentPermutation = permutation;
while (currentPermutation != 0)
{
if ((currentPermutation & 15) == position)
{
return true;
}
if ((currentPermutation & 15) == (position + 1))
{
return false;
}
currentPermutation = currentPermutation >> 4;
}
throw new ApplicationException("Malformed permutation " + permutation + " for position " + position);
}
private static ulong[] PadToWholeChunks(PermutationInfo[] original, int chunkSize)
{
ulong[] result;
if (original.Length % chunkSize == 0)
{
result = new ulong[original.Length];
}
else
{
result = new ulong[original.Length + chunkSize - (original.Length % chunkSize)];
}
for (var i = 0; i < original.Length; i++)
{
result[i] = original[i].Permutation;
}
return result; return result;
} }
private static PermutationInfo FormatPermutation(PermutationsGenerator.Permutation permutation) return result.Concat(Enumerable.Repeat(result[0], Constants.PhrasesPerSet - (result.Length % Constants.PhrasesPerSet))).ToArray();
{
System.Diagnostics.Debug.Assert(permutation.PermutationData.Length <= 16);
ulong result = 0;
ulong resultInverse = 0;
for (var i = 0; i < permutation.PermutationData.Length; i++)
{
var source = i;
var target = permutation.PermutationData[i];
result |= (ulong)(target) << (4 * source);
resultInverse |= (ulong)(source) << (4 * target);
}
return new PermutationInfo { Permutation = result, PermutationInverse = resultInverse };
} }
private static IEnumerable<long> GeneratePermutationsNumbers() private static IEnumerable<long> GeneratePermutationsNumbers()
@ -147,11 +39,5 @@
i++; i++;
} }
} }
private struct PermutationInfo
{
public ulong Permutation;
public ulong PermutationInverse;
}
} }
} }
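
Review note on `PrecomputedPermutationsGenerator.cs`: both sides of the diff pad the permutation list to a whole number of `PhrasesPerSet`-sized chunks so that the MD5 kernel always receives full sets. One side pads with zero entries (`PadToWholeChunks`), the other repeats `result[0]`. A generic sketch of that padding step follows; the name `padToWholeChunks` and the explicit `filler` parameter are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: extends items with copies of filler until the size
// is a multiple of chunkSize. Lists that already fit are returned unchanged.
template <typename T>
std::vector<T> padToWholeChunks(std::vector<T> items, std::size_t chunkSize, const T& filler)
{
    assert(chunkSize > 0);
    const std::size_t remainder = items.size() % chunkSize;
    if (remainder != 0)
    {
        items.resize(items.size() + (chunkSize - remainder), filler);
    }
    return items;
}
```

For example, the 6 permutations of 3 words grow to 8 entries when `chunkSize` is 8. Padding with a repeated real permutation produces duplicate phrases in the final set, whereas zero padding produces phrases built from index 0 only; either way, counts should be taken from the unpadded list.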

@ -1,10 +1,10 @@
namespace WhiteRabbit namespace WhiteRabbit
{ {
using System; using System;
using System.Collections.Concurrent;
using System.Collections.Generic; using System.Collections.Generic;
using System.Configuration; using System.Configuration;
using System.Diagnostics; using System.Diagnostics;
using System.IO;
using System.Linq; using System.Linq;
using System.Numerics; using System.Numerics;
using System.Security.Cryptography; using System.Security.Cryptography;
@ -24,18 +24,13 @@
stopwatch.Start(); stopwatch.Start();
var sourcePhrase = ConfigurationManager.AppSettings["SourcePhrase"]; var sourcePhrase = ConfigurationManager.AppSettings["SourcePhrase"];
var sourceChars = ToOrderedChars(sourcePhrase);
var maxWordsInPhrase = int.Parse(ConfigurationManager.AppSettings["MaxWordsInPhrase"]); var maxWordsInPhrase = int.Parse(ConfigurationManager.AppSettings["MaxWordsInPhrase"]);
if (sourcePhrase.Where(ch => ch != ' ').Count() + maxWordsInPhrase > 28) if (sourceChars.Length + maxWordsInPhrase > 27)
{ {
Console.WriteLine("Only anagrams of up to 27 characters (including whitespace) are allowed"); Console.WriteLine("Only anagrams of up to 27 characters are allowed");
return;
}
if (maxWordsInPhrase > Constants.MaxNumberOfWords)
{
Console.WriteLine($"Only anagrams of up to {Constants.MaxNumberOfWords} words are allowed");
return; return;
} }
@ -50,16 +45,12 @@
Console.WriteLine("Only 64-bit systems are supported due to MD5Digest optimizations"); Console.WriteLine("Only 64-bit systems are supported due to MD5Digest optimizations");
} }
var expectedHashesFirstComponentsArray = new uint[8]; var expectedHashesAsVectors = ConfigurationManager.AppSettings["ExpectedHashes"]
{ .Split(',')
int i = 0; .Select(hash => new Vector<uint>(HexadecimalStringToUnsignedIntArray(hash)))
foreach (var expectedHash in ConfigurationManager.AppSettings["ExpectedHashes"].Split(',')) .ToArray();
{
expectedHashesFirstComponentsArray[i] = HexadecimalStringToUnsignedIntArray(expectedHash)[0]; var expectedHashesFirstComponents = expectedHashesAsVectors.Select(vector => vector[0]).ToArray();
expectedHashesFirstComponentsArray[i + 1] = HexadecimalStringToUnsignedIntArray(expectedHash)[0];
i += 2;
}
}
var processor = new StringsProcessor( var processor = new StringsProcessor(
Encoding.ASCII.GetBytes(sourcePhrase), Encoding.ASCII.GetBytes(sourcePhrase),
@ -75,14 +66,27 @@
stopwatch.Restart(); stopwatch.Restart();
processor.CheckPhrases(expectedHashesFirstComponentsArray, (phraseBytes, hashFirstComponent) => processor.GeneratePhrases()
.ForAll(phraseSet =>
{
var hashesFirstComponents = MD5Digest.Compute(phraseSet);
for (var i = 0; i < Constants.PhrasesPerSet; i++)
{ {
var phrase = Encoding.ASCII.GetString(phraseBytes); Debug.Assert(
var hash = ComputeFullMD5(phraseBytes); sourceChars == ToOrderedChars(ToString(phraseSet, i)),
Console.WriteLine($"Found phrase for {hash} ({hashFirstComponent:x8}): {phrase}; time from start is {stopwatch.Elapsed}"); $"StringsProcessor produced incorrect anagram: {ToString(phraseSet, i)}");
if (Array.IndexOf(expectedHashesFirstComponents, hashesFirstComponents[i]) >= 0)
{
var phrase = ToString(phraseSet, i);
var hash = ComputeFullMD5(phrase);
Console.WriteLine($"Found phrase for {hash}: {phrase}; time from start is {stopwatch.Elapsed}");
}
}
}); });
Console.WriteLine($"Done; time from start: {stopwatch.Elapsed}"); Console.WriteLine($"Done; time from start: {stopwatch.Elapsed}");
} }
// Code taken from http://stackoverflow.com/a/321404/831314 // Code taken from http://stackoverflow.com/a/321404/831314
@ -96,8 +100,9 @@
} }
// We can afford to spend some time here; this code will only run for matched phrases (and for one in several billion non-matched) // We can afford to spend some time here; this code will only run for matched phrases (and for one in several billion non-matched)
private static string ComputeFullMD5(byte[] phraseBytes) private static string ComputeFullMD5(string phrase)
{ {
var phraseBytes = Encoding.ASCII.GetBytes(phrase);
using (var hashAlgorithm = new MD5CryptoServiceProvider()) using (var hashAlgorithm = new MD5CryptoServiceProvider())
{ {
var resultBytes = hashAlgorithm.ComputeHash(phraseBytes); var resultBytes = hashAlgorithm.ComputeHash(phraseBytes);
@ -110,6 +115,11 @@
return hex.Substring(6, 2) + hex.Substring(4, 2) + hex.Substring(2, 2) + hex.Substring(0, 2); return hex.Substring(6, 2) + hex.Substring(4, 2) + hex.Substring(2, 2) + hex.Substring(0, 2);
} }
private static string ToString(PhraseSet phrase, int offset)
{
return Encoding.ASCII.GetString(phrase.GetBytes(offset));
}
private static IEnumerable<byte[]> ReadInput() private static IEnumerable<byte[]> ReadInput()
{ {
string line; string line;
@ -118,5 +128,20 @@
yield return Encoding.ASCII.GetBytes(line); yield return Encoding.ASCII.GetBytes(line);
} }
} }
private static string ToOrderedChars(string source)
{
return new string(source.Where(ch => ch != ' ').OrderBy(ch => ch).ToArray());
}
#if SINGLE_THREADED
private static void ForAll<T>(this IEnumerable<T> source, Action<T> action)
{
foreach (var entry in source)
{
action(entry);
}
}
#endif
} }
} }
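
Review note on `Program.cs`: the `Debug.Assert` on one side of this diff checks every generated phrase by comparing `ToOrderedChars` of the phrase with `ToOrderedChars` of the source, that is, the characters with spaces removed and sorted. A minimal sketch of that check is given below; `toOrderedChars` mirrors the C# helper, while `isAnagram` is a hypothetical wrapper added for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Drops spaces and sorts the remaining characters, so two strings give the
// same result exactly when they use the same multiset of non-space characters.
std::string toOrderedChars(const std::string& source)
{
    std::string result;
    for (char ch : source)
    {
        if (ch != ' ')
        {
            result.push_back(ch);
        }
    }
    std::sort(result.begin(), result.end());
    return result;
}

// Hypothetical wrapper: true when candidate is an anagram of source,
// ignoring spaces.
bool isAnagram(const std::string& source, const std::string& candidate)
{
    return toOrderedChars(source) == toOrderedChars(candidate);
}
```

Sorting costs O(n log n) per phrase, which is acceptable only because the check is compiled out of release builds.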

@ -3,8 +3,6 @@
using System; using System;
using System.Collections.Generic; using System.Collections.Generic;
using System.Linq; using System.Linq;
using System.Numerics;
using System.Threading.Tasks;
internal sealed class StringsProcessor internal sealed class StringsProcessor
{ {
@ -13,7 +11,7 @@
// Ensure that permutations are precomputed prior to main run, so that processing times will be correct // Ensure that permutations are precomputed prior to main run, so that processing times will be correct
static StringsProcessor() static StringsProcessor()
{ {
PrecomputedPermutationsGenerator.HamiltonianPermutations(1, 0); PrecomputedPermutationsGenerator.HamiltonianPermutations(0);
} }
public StringsProcessor(byte[] sourceString, int maxWordsCount, IEnumerable<byte[]> words) public StringsProcessor(byte[] sourceString, int maxWordsCount, IEnumerable<byte[]> words)
@ -22,26 +20,18 @@
this.NumberOfCharacters = filteredSource.Length; this.NumberOfCharacters = filteredSource.Length;
this.VectorsConverter = new VectorsConverter(filteredSource); this.VectorsConverter = new VectorsConverter(filteredSource);
var allWordsAndVectors = words // Dictionary of vectors to array of words represented by this vector
var vectorsToWords = words
.Where(word => word != null && word.Length > 0) .Where(word => word != null && word.Length > 0)
.Select(word => new { word, vector = this.VectorsConverter.GetVector(word) }) .Select(word => new { word = word.Concat(new byte[] { SPACE }).ToArray(), vector = this.VectorsConverter.GetVector(word) })
.Where(tuple => tuple.vector != null) .Where(tuple => tuple.vector != null)
.Select(tuple => tuple.word) .Select(tuple => new { tuple.word, vector = tuple.vector.Value })
.Distinct(new ByteArrayEqualityComparer())
.Select(word => word)
.ToArray();
// Dictionary of vectors to array of words represented by this vector
var vectorsToWords = allWordsAndVectors
.Select((word, index) => new { word, index, vector = this.VectorsConverter.GetVector(word).Value })
.GroupBy(tuple => tuple.vector) .GroupBy(tuple => tuple.vector)
.Select(group => new { vector = group.Key, words = group.Select(tuple => tuple.index).ToArray() }) .Select(group => new { vector = group.Key, words = group.Select(tuple => tuple.word).Distinct(new ByteArrayEqualityComparer()).ToArray() })
.ToList(); .ToList();
this.WordsDictionary = vectorsToWords.Select(tuple => tuple.words).ToArray(); this.WordsDictionary = vectorsToWords.Select(tuple => tuple.words).ToArray();
this.AllWords = allWordsAndVectors.Select(word => new Word(word)).ToArray();
this.VectorsProcessor = new VectorsProcessor( this.VectorsProcessor = new VectorsProcessor(
this.VectorsConverter.GetVector(filteredSource).Value, this.VectorsConverter.GetVector(filteredSource).Value,
maxWordsCount, maxWordsCount,
@ -50,56 +40,42 @@
private VectorsConverter VectorsConverter { get; } private VectorsConverter VectorsConverter { get; }
private Word[] AllWords { get; }
/// <summary> /// <summary>
/// WordsDictionary[vectorIndex] = [word1index, word2index, ...] /// WordsDictionary[vectorIndex] = [word1, word2, ...]
/// </summary> /// </summary>
private int[][] WordsDictionary { get; } private byte[][][] WordsDictionary { get; }
private VectorsProcessor VectorsProcessor { get; } private VectorsProcessor VectorsProcessor { get; }
private int NumberOfCharacters { get; } private int NumberOfCharacters { get; }
public void CheckPhrases(uint[] expectedHashesVector, Action<byte[], uint> action) #if SINGLE_THREADED
public IEnumerable<PhraseSet> GeneratePhrases()
#else
public ParallelQuery<PhraseSet> GeneratePhrases()
#endif
{ {
// task of finding anagrams could be reduced to the task of finding sequences of dictionary vectors with the target sum // task of finding anagrams could be reduced to the task of finding sequences of dictionary vectors with the target sum
var sums = this.VectorsProcessor.GenerateSequences(); var sums = this.VectorsProcessor.GenerateSequences();
// converting sequences of vectors to the sequences of words... // converting sequences of vectors to the sequences of words...
Parallel.ForEach(sums, new ParallelOptions { MaxDegreeOfParallelism = Constants.NumberOfThreads }, sum => ProcessSum(sum, expectedHashesVector, action)); return sums
.Select(this.ConvertVectorsToWords)
.SelectMany(Flattener.Flatten)
.SelectMany(this.ConvertWordsToPhrases);
} }
public long GetPhrasesCount() public long GetPhrasesCount()
{ {
var sums = this.VectorsProcessor.GenerateSequences(); return this.VectorsProcessor.GenerateSequences()
return (from sum in sums .Select(this.ConvertVectorsToWordsNumber)
let filter = ComputeFilter(sum) .Sum(tuple => tuple.Item2 * PrecomputedPermutationsGenerator.GetPermutationsNumber(tuple.Item1));
let wordsVariantsNumber = this.ConvertVectorsToWordsNumber(sum)
let permutationsNumber = PrecomputedPermutationsGenerator.GetPermutationsNumber(sum.Length, filter)
let total = wordsVariantsNumber * permutationsNumber
select total)
.Sum();
}
private static uint ComputeFilter(int[] vectors)
{
uint result = 0;
for (var i = 1; i < vectors.Length; i++)
{
if (vectors[i] == vectors[i - 1])
{
result |= (uint)1 << (i - 1);
}
}
return result;
} }
private int[][] ConvertVectorsToWordIndexes(int[] vectors) private byte[][][] ConvertVectorsToWords(int[] vectors)
{ {
var length = vectors.Length; var length = vectors.Length;
var words = new int[length][]; var words = new byte[length][][];
for (var i = 0; i < length; i++) for (var i = 0; i < length; i++)
{ {
words[i] = this.WordsDictionary[vectors[i]]; words[i] = this.WordsDictionary[vectors[i]];
@ -108,7 +84,7 @@
return words; return words;
} }
private long ConvertVectorsToWordsNumber(int[] vectors) private Tuple<int, long> ConvertVectorsToWordsNumber(int[] vectors)
{ {
long result = 1; long result = 1;
for (var i = 0; i < vectors.Length; i++) for (var i = 0; i < vectors.Length; i++)
@ -116,27 +92,16 @@
result *= this.WordsDictionary[vectors[i]].Length; result *= this.WordsDictionary[vectors[i]].Length;
} }
return result; return Tuple.Create(vectors.Length, result);
} }
private void ProcessSum(int[] sum, uint[] expectedHashesVector, Action<byte[], uint> action) private IEnumerable<PhraseSet> ConvertWordsToPhrases(byte[][] words)
{ {
var initialPhraseSet = new PhraseSet(); var permutations = PrecomputedPermutationsGenerator.HamiltonianPermutations(words.Length);
initialPhraseSet.Init(); var permutationsLength = permutations.Length;
initialPhraseSet.FillLength(this.NumberOfCharacters, sum.Length); for (var i = 0; i < permutationsLength; i += Constants.PhrasesPerSet)
var phraseSet = new PhraseSet();
phraseSet.Init();
var permutationsFilter = ComputeFilter(sum);
var wordsVariants = this.ConvertVectorsToWordIndexes(sum);
foreach (var wordsArray in Flattener.Flatten(wordsVariants))
{ {
phraseSet.ProcessPermutations( yield return new PhraseSet(words, permutations, i, this.NumberOfCharacters);
initialPhraseSet,
this.AllWords,
wordsArray,
PrecomputedPermutationsGenerator.HamiltonianPermutations(wordsArray.Length, permutationsFilter),
expectedHashesVector,
action);
} }
} }
} }

@ -48,9 +48,16 @@
private ImmutableArray<int> NormsIndex { get; } private ImmutableArray<int> NormsIndex { get; }
// Produces all sets of vectors with the target sum // Produces all sets of vectors with the target sum
#if SINGLE_THREADED
public IEnumerable<int[]> GenerateSequences() public IEnumerable<int[]> GenerateSequences()
#else
public ParallelQuery<int[]> GenerateSequences()
#endif
{ {
return this.GenerateUnorderedSequences(this.Target, GetVectorNorm(this.Target, this.Target), this.MaxVectorsCount, 0) return this.GenerateUnorderedSequences(this.Target, GetVectorNorm(this.Target, this.Target), this.MaxVectorsCount, 0)
#if !SINGLE_THREADED
.AsParallel()
#endif
.Select(Enumerable.ToArray); .Select(Enumerable.ToArray);
} }

@ -9,11 +9,10 @@
<AppDesignerFolder>Properties</AppDesignerFolder> <AppDesignerFolder>Properties</AppDesignerFolder>
<RootNamespace>WhiteRabbit</RootNamespace> <RootNamespace>WhiteRabbit</RootNamespace>
<AssemblyName>WhiteRabbit</AssemblyName> <AssemblyName>WhiteRabbit</AssemblyName>
<TargetFrameworkVersion>v4.7</TargetFrameworkVersion> <TargetFrameworkVersion>v4.6</TargetFrameworkVersion>
<FileAlignment>512</FileAlignment> <FileAlignment>512</FileAlignment>
<AutoGenerateBindingRedirects>true</AutoGenerateBindingRedirects> <AutoGenerateBindingRedirects>true</AutoGenerateBindingRedirects>
<AllowUnsafeBlocks>true</AllowUnsafeBlocks> <AllowUnsafeBlocks>true</AllowUnsafeBlocks>
<TargetFrameworkProfile />
</PropertyGroup> </PropertyGroup>
<PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Debug|AnyCPU' "> <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Debug|AnyCPU' ">
<PlatformTarget>x64</PlatformTarget> <PlatformTarget>x64</PlatformTarget>
@ -61,6 +60,7 @@
<Compile Include="ByteArrayEqualityComparer.cs" /> <Compile Include="ByteArrayEqualityComparer.cs" />
<Compile Include="Constants.cs" /> <Compile Include="Constants.cs" />
<Compile Include="Flattener.cs" /> <Compile Include="Flattener.cs" />
<Compile Include="MD5Digest.cs" />
<Compile Include="PhraseSet.cs" /> <Compile Include="PhraseSet.cs" />
<Compile Include="PrecomputedPermutationsGenerator.cs" /> <Compile Include="PrecomputedPermutationsGenerator.cs" />
<Compile Include="PermutationsGenerator.cs" /> <Compile Include="PermutationsGenerator.cs" />
@ -69,7 +69,6 @@
<Compile Include="Properties\AssemblyInfo.cs" /> <Compile Include="Properties\AssemblyInfo.cs" />
<Compile Include="VectorsProcessor.cs" /> <Compile Include="VectorsProcessor.cs" />
<Compile Include="VectorsConverter.cs" /> <Compile Include="VectorsConverter.cs" />
<Compile Include="Word.cs" />
</ItemGroup> </ItemGroup>
<ItemGroup> <ItemGroup>
<None Include="App.config" /> <None Include="App.config" />

@ -1,53 +0,0 @@
namespace WhiteRabbit
{
internal unsafe struct Word
{
public fixed long Buffers[128];
public unsafe Word(byte[] word)
{
var tmpWord = new byte[word.Length + 1];
tmpWord[word.Length] = (byte)' ';
for (var i = 0; i < word.Length; i++)
{
tmpWord[i] = word[i];
}
fixed (long* buffersPointer = this.Buffers)
{
for (var i = 0; i < 32; i++)
{
var bytePointer = (byte*)(buffersPointer + 4 * i);
var endPointer = bytePointer + 32;
var currentPointer = bytePointer + i;
for (var j = 0; j < tmpWord.Length && currentPointer < endPointer; j++, currentPointer++)
{
*currentPointer = tmpWord[j];
}
}
buffersPointer[127] = tmpWord.Length * 4;
}
}
public unsafe byte[] Original
{
get
{
fixed (long* buffersPointer = this.Buffers)
{
var length = buffersPointer[127] / 4;
var result = new byte[length];
for (var i = 0; i < length; i++)
{
result[i] = ((byte*)buffersPointer)[i];
}
return result;
}
}
}
private static Word Empty { get; } = new Word();
}
}