Compare commits

1 Commits

Author SHA1 Message Date
Inga 🏳‍🌈 b9499ce4a2 Hi flocal 7 years ago
  1. README.md (71 lines changed)
  2. dotnet/TrustPilotChallenge.sln (29 lines changed)
  3. dotnet/WhiteRabbit.UnmanagedBridge/AssemblyInfo.cpp (38 lines changed)
  4. dotnet/WhiteRabbit.UnmanagedBridge/Stdafx.cpp (5 lines changed)
  5. dotnet/WhiteRabbit.UnmanagedBridge/Stdafx.h (7 lines changed)
  6. dotnet/WhiteRabbit.UnmanagedBridge/WhiteRabbit.UnmanagedBridge.cpp (32 lines changed)
  7. dotnet/WhiteRabbit.UnmanagedBridge/WhiteRabbit.UnmanagedBridge.h (18 lines changed)
  8. dotnet/WhiteRabbit.UnmanagedBridge/WhiteRabbit.UnmanagedBridge.vcxproj (158 lines changed)
  9. dotnet/WhiteRabbit.UnmanagedBridge/WhiteRabbit.UnmanagedBridge.vcxproj.filters (50 lines changed)
  10. dotnet/WhiteRabbit.UnmanagedBridge/constants.h (3 lines changed)
  11. dotnet/WhiteRabbit.UnmanagedBridge/md5.cpp (236 lines changed)
  12. dotnet/WhiteRabbit.UnmanagedBridge/md5.h (3 lines changed)
  13. dotnet/WhiteRabbit.UnmanagedBridge/phraseset.cpp (86 lines changed)
  14. dotnet/WhiteRabbit.UnmanagedBridge/phraseset.h (3 lines changed)
  15. dotnet/WhiteRabbit.UnmanagedBridge/resource.h (3 lines changed)
  16. dotnet/WhiteRabbit/App.config (8 lines changed)
  17. dotnet/WhiteRabbit/Constants.cs (11 lines changed)
  18. dotnet/WhiteRabbit/Flattener.cs (262 lines changed)
  19. dotnet/WhiteRabbit/MD5Digest.cs (123 lines changed)
  20. dotnet/WhiteRabbit/PermutationsGenerator.cs (4 lines changed)
  21. dotnet/WhiteRabbit/Phrase.cs (52 lines changed)
  22. dotnet/WhiteRabbit/PhraseSet.cs (126 lines changed)
  23. dotnet/WhiteRabbit/PrecomputedPermutationsGenerator.cs (157 lines changed)
  24. dotnet/WhiteRabbit/Program.cs (114 lines changed)
  25. dotnet/WhiteRabbit/StringsProcessor.cs (99 lines changed)
  26. dotnet/WhiteRabbit/VectorsProcessor.cs (73 lines changed)
  27. dotnet/WhiteRabbit/WhiteRabbit.csproj (18 lines changed)
  28. dotnet/WhiteRabbit/Word.cs (53 lines changed)

@ -34,34 +34,33 @@ WhiteRabbit.exe < wordlist
Performance
===========
Memory usage is minimal (for that kind of task), less than 10MB (25MB for MaxNumberOfWords = 8).
Memory usage is minimal (for that kind of task), less than 10MB.
It is also somewhat optimized for likely intended phrases, as anagrams consisting of longer words are generated first.
That's why the given hashes are solved much sooner than it takes to check all anagrams.
Anagrams generation is not parallelized, as even single-threaded performance for 4-word anagrams is high enough; and 5-word (or larger) anagrams are frequent enough for most of the time being spent on computing hashes, with full CPU load.
Multi-threaded performance with RyuJIT (.NET 4.6, 64-bit system) on i5-6500 is as follows (excluding initialization time of 0.2 seconds), for different maximum allowed words in an anagram:
Multi-threaded performance with RyuJIT (.NET 4.6, 64-bit system) on quad-core Sandy Bridge @2.8GHz is as follows (excluding initialization time of 0.2 seconds):
Number of words|Time to check all anagrams no longer than that|Time to solve "easy" hash|Time to solve "more difficult" hash|Time to solve "hard" hash|Number of unique anagrams no longer than that
---------------|----------------------------------------------|-------------------------|-----------------------------------|-------------------------|---------------------------------------------
3|0.04s||||4560
4|0.45s|||0.08s|7,431,984
5|9.6s|0.15s|0.06s|0.27s|1,347,437,484
6|4.5 minutes|0.85s|0.17s|2.05s|58,405,904,844
7|83 minutes|4.7s|0.6s|13.3s|1,070,307,744,114
8|14 hours|17.6s|1.8s|55s|10,893,594,396,594
9||45s|4s|2.5 minutes|70,596,864,409,954
10||80s|5.8s|4.8 minutes|314,972,701,475,754
* If only phrases of at most 4 words are allowed, then it takes **1.1 seconds** to find and check all 7433016 anagrams; **all hashes are solved in first 0.2 seconds**.
* If phrases of 5 words are allowed as well, then it takes 2:45 minutes to find and check all 1348876896 anagrams; all hashes are solved in first 4 seconds.
* If phrases of 6 words are allowed as well, then "more difficult" hash is solved in 3.5 seconds, "easiest" in 21 seconds, and "hard" in 54 seconds.
* If phrases of 7 words are allowed as well, then "more difficult" hash is solved in 20 seconds, "easiest" in less than 2.5 minutes, and "hard" in 6:45 minutes.
Note that all measurements were done on a Release build; Debug build is significantly slower.
For comparison, certain other solutions available on GitHub seem to require 3 hours to find all 3-word anagrams. This solution is faster by 6-7 orders of magnitude (it finds and checks all 4-word anagrams in 1/10000th fraction of time required for other solution just to find all 3-word anagrams, with no MD5 calculations).
For comparison, certain other solutions available on GitHub seem to require 3 hours to find all 3-word anagrams. This solution is faster by 5-7 orders of magnitude (it finds and checks all 4-word anagrams in 1/2000th fraction of time required for other solution just to find all 3-word anagrams, with no MD5 calculations).
Conditional compilation symbols
===============================
* Define `DEBUG`, or build in debug mode, to get the total number of anagrams (not optimized).
* Define `SINGLE_THREADED` to use standard enumerables instead of ParallelEnumerable (useful for profiling).
* Define `DEBUG`, or build in debug mode, to get the total number of anagrams (not optimized, memory-hogging).
Implementation notes
====================
@ -112,46 +111,4 @@ There is no need in processing all the words that are too large to be useful at
11. Filtering the original dictionary (e.g. throwing away all single-letter words) does not really improve the performance, thanks to the optimizations mentioned in notes 7-9.
This solution finds all anagrams, including those with single-letter words.
12. Computing the entire MD5, and then comparing it to the target MD5s, makes little sense. Each MD5 component is a `uint`, which means that the chance of the first components of two different hashes matching is one in 4 billion.
It's more efficient to compute only the first component (which is 5% faster since we don't need to perform rounds 62-64 of MD5), and use only the first component for a lookup (which makes the lookup 4x faster).
To prevent false positives, we could compute the entire MD5 again if there is a match.
As that will only happen once in 4 billion hashes, the efficiency of this computation does not matter at all.
Right now, this additional checking is not implemented, which means that once in a minute (if there are 3 target hashes) the program will produce a false positive, which allows one to monitor progress.
13. MD5 computation is further optimized by leveraging CPU extensions.
For example, one could compute MD5 more effectively by using `rotl` instruction to rotate numbers (which is currently done with two bitshifts and one `or` / `xor`).
What's more important, one could compute 4 hashes at once (on a single core) using SSE, 8 hashes at once using AVX2, or 16 hashes at once using AVX512 (AVX lacks enough instructions to make computing hashes feasible).
.NET/RyuJit does not support some of the required intrinsics (`rotl` for plain MD5 implementation, `psrld` and `pslld` for SSE, and similar intrinsics for AVX2).
Although `rotl` support is expected in next release of RyuJIT (see https://github.com/dotnet/coreclr/pull/1830), no support for bitshift SIMD/AVX2 instructions is currently expected (see https://github.com/dotnet/coreclr/issues/3226).
However, one can move MD5 computations to the unmanaged C++ code, where all the intrinsics are available.
To make this work efficiently, I had to store anagrams in chunks of 8 anagrams (so that unmanaged code will receive the chunk and produce 8 hashes).
And to make this efficient, I had to make every permutation count divisible by 8 by adding extra copies of some permutations.
It slows down processing anagrams of 1, 2, and 3 words (as for every set of words, the number of anagrams is increased to 8 from 1, 2, and 6, respectively); however, these are relatively rare for a given phrase and dictionary.
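The padding described in note 13 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this repository: `paddedPermutations` and `chunkSize` are hypothetical names, and the sketch permutes plain word indices instead of the packed word buffers the real implementation works with.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the repository): returns all permutations of
// the indices 0..n-1, then repeats permutations from the start of the list until
// the total count is a multiple of chunkSize.
std::vector<std::vector<int>> paddedPermutations(int n, std::size_t chunkSize = 8) {
    std::vector<int> current(n);
    std::iota(current.begin(), current.end(), 0);  // 0, 1, ..., n-1 (sorted order)

    std::vector<std::vector<int>> result;
    do {
        result.push_back(current);
    } while (std::next_permutation(current.begin(), current.end()));

    // Pad with copies of already generated permutations.
    const std::size_t realCount = result.size();
    for (std::size_t i = 0; result.size() % chunkSize != 0; ++i) {
        std::vector<int> copy = result[i % realCount];
        result.push_back(std::move(copy));
    }
    return result;
}
```

With a chunk size of 8, the counts for 1, 2, and 3 words grow from 1, 2, and 6 to 8, while 4 words (24 permutations) need no padding. One consequence of this approach is that padded copies are hashed again, so a matching phrase may be reported more than once.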
Implementation details
======================
Given all the above, the implementation is as follows:
1. Words from the dictionary are converted into arrays of bytes with a trailing space.
2. The dictionary is filtered from words that could not be a part of anagram (e.g. "b" or "aa"), and from duplicates.
3. Words are converted into vectors, and grouped by vector.
4. Vectors are ordered by their norm, in a descending order.
5. All sequences of non-decreasing vector indices adding up to a target vector are found.
6. For every sequence, a sequence of word arrays corresponding to these vectors is generated.
7. For every sequence of word arrays, all sequences of word combinations are generated (e.g. for [[ab, ba], [cd, dc]], we generate [ab, cd], [ab, dc], [ba, cd], [ba, dc]).
8. For every sequence of words, all permutations are generated (in chunks of 8).
9. For every 8 permuted sequences of words, `uint[64]` message is generated (8 uints = 28 bytes with a trailing `128` byte, plus a length in bits for every sequence).
10. For every `uint[64]` message, 8 `uint`s corresponding to the first components of MD5 hashes for `uint[8]` messages are generated.
11. Every resulting `uint` is checked against the targets; if a match is found, both the sequence of words and the full MD5 hash are printed to the output.
12. MD5 computation could be further optimized by leveraging CPU extensions; however, it could not be done with current .NET (see readme for https://github.com/penartur/TrustPilotChallenge/tree/simd-md5)
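Step 7 of the implementation details above (expanding a sequence of word arrays into all word combinations) can be illustrated with a short sketch. The names `WordGroup` and `wordCombinations` are hypothetical, and the sketch uses `std::string` for readability; the actual project is written in C# and stores words as byte arrays.

```cpp
#include <string>
#include <utility>
#include <vector>

using WordGroup = std::vector<std::string>;  // words sharing the same letter vector

// Hypothetical helper: picks one word from each group, in every possible way,
// keeping the groups in their original order.
std::vector<std::vector<std::string>> wordCombinations(const std::vector<WordGroup>& groups) {
    // Start with a single empty combination and extend it group by group.
    std::vector<std::vector<std::string>> result(1);
    for (const WordGroup& group : groups) {
        std::vector<std::vector<std::string>> extended;
        for (const auto& prefix : result) {
            for (const std::string& word : group) {
                std::vector<std::string> combination = prefix;
                combination.push_back(word);
                extended.push_back(std::move(combination));
            }
        }
        result = std::move(extended);
    }
    return result;
}
```

For the groups `[[ab, ba], [cd, dc]]` this yields `[ab, cd]`, `[ab, dc]`, `[ba, cd]`, `[ba, dc]`, matching the example in step 7. Permuting each combination (step 8) would be a separate pass.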

@ -1,45 +1,20 @@

Microsoft Visual Studio Solution File, Format Version 12.00
# Visual Studio 15
VisualStudioVersion = 15.0.26403.3
# Visual Studio 14
VisualStudioVersion = 14.0.24720.0
MinimumVisualStudioVersion = 10.0.40219.1
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "WhiteRabbit", "WhiteRabbit\WhiteRabbit.csproj", "{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}"
EndProject
Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "WhiteRabbit.UnmanagedBridge", "WhiteRabbit.UnmanagedBridge\WhiteRabbit.UnmanagedBridge.vcxproj", "{039F03A0-7E8F-415D-8180-969D24479B44}"
EndProject
Global
GlobalSection(SolutionConfigurationPlatforms) = preSolution
Debug|Any CPU = Debug|Any CPU
Debug|x64 = Debug|x64
Debug|x86 = Debug|x86
Release|Any CPU = Release|Any CPU
Release|x64 = Release|x64
Release|x86 = Release|x86
EndGlobalSection
GlobalSection(ProjectConfigurationPlatforms) = postSolution
{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Debug|Any CPU.Build.0 = Debug|Any CPU
{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Debug|x64.ActiveCfg = Debug|Any CPU
{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Debug|x64.Build.0 = Debug|Any CPU
{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Debug|x86.ActiveCfg = Debug|Any CPU
{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Debug|x86.Build.0 = Debug|Any CPU
{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Release|Any CPU.ActiveCfg = Release|Any CPU
{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Release|Any CPU.Build.0 = Release|Any CPU
{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Release|x64.ActiveCfg = Release|Any CPU
{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Release|x64.Build.0 = Release|Any CPU
{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Release|x86.ActiveCfg = Release|Any CPU
{3A4E69F0-7A8E-4B92-BA02-A231D75CB3E4}.Release|x86.Build.0 = Release|Any CPU
{039F03A0-7E8F-415D-8180-969D24479B44}.Debug|Any CPU.ActiveCfg = Debug|Win32
{039F03A0-7E8F-415D-8180-969D24479B44}.Debug|x64.ActiveCfg = Debug|x64
{039F03A0-7E8F-415D-8180-969D24479B44}.Debug|x64.Build.0 = Debug|x64
{039F03A0-7E8F-415D-8180-969D24479B44}.Debug|x86.ActiveCfg = Debug|Win32
{039F03A0-7E8F-415D-8180-969D24479B44}.Debug|x86.Build.0 = Debug|Win32
{039F03A0-7E8F-415D-8180-969D24479B44}.Release|Any CPU.ActiveCfg = Release|x64
{039F03A0-7E8F-415D-8180-969D24479B44}.Release|Any CPU.Build.0 = Release|x64
{039F03A0-7E8F-415D-8180-969D24479B44}.Release|x64.ActiveCfg = Release|x64
{039F03A0-7E8F-415D-8180-969D24479B44}.Release|x64.Build.0 = Release|x64
{039F03A0-7E8F-415D-8180-969D24479B44}.Release|x86.ActiveCfg = Release|Win32
{039F03A0-7E8F-415D-8180-969D24479B44}.Release|x86.Build.0 = Release|Win32
EndGlobalSection
GlobalSection(SolutionProperties) = preSolution
HideSolutionNode = FALSE

@ -1,38 +0,0 @@
#include "stdafx.h"
using namespace System;
using namespace System::Reflection;
using namespace System::Runtime::CompilerServices;
using namespace System::Runtime::InteropServices;
using namespace System::Security::Permissions;
//
// General Information about an assembly is controlled through the following
// set of attributes. Change these attribute values to modify the information
// associated with an assembly.
//
[assembly:AssemblyTitleAttribute(L"WhiteRabbitUnmanagedBridge")];
[assembly:AssemblyDescriptionAttribute(L"")];
[assembly:AssemblyConfigurationAttribute(L"")];
[assembly:AssemblyCompanyAttribute(L"")];
[assembly:AssemblyProductAttribute(L"WhiteRabbitUnmanagedBridge")];
[assembly:AssemblyCopyrightAttribute(L"Copyright (c) 2017")];
[assembly:AssemblyTrademarkAttribute(L"")];
[assembly:AssemblyCultureAttribute(L"")];
//
// Version information for an assembly consists of the following four values:
//
// Major Version
// Minor Version
// Build Number
// Revision
//
// You can specify all the value or you can default the Revision and Build Numbers
// by using the '*' as shown below:
[assembly:AssemblyVersionAttribute("1.0.*")];
[assembly:ComVisible(false)];
[assembly:CLSCompliantAttribute(true)];

@ -1,5 +0,0 @@
// stdafx.cpp : source file that includes just the standard includes
// WhiteRabbit.Unmanaged.pch will be the pre-compiled header
// stdafx.obj will contain the pre-compiled type information
#include "stdafx.h"

@ -1,7 +0,0 @@
// stdafx.h : include file for standard system include files,
// or project specific include files that are used frequently,
// but are changed infrequently
#pragma once

@ -1,32 +0,0 @@
// This is the main DLL file.
#include "stdafx.h"
#include "WhiteRabbit.UnmanagedBridge.h"
#include "md5.h"
#include "phraseset.h"
void WhiteRabbitUnmanagedBridge::MD5Unmanaged::ComputeMD5(unsigned __int32 * input, unsigned __int32 * expected)
{
#if AVX2
md5(input + 0 * 8 * 8, expected);
#elif SIMD
md5(input + 0 * 8 * 4);
md5(input + 1 * 8 * 4);
if (input[2 * 8 * 4] != 0)
{
md5(input + 2 * 8 * 4);
md5(input + 3 * 8 * 4);
}
#else
for (int i = 0; i < 16; i++)
{
md5(input + i * 8);
}
#endif
}
void WhiteRabbitUnmanagedBridge::MD5Unmanaged::FillPhraseSet(unsigned __int64* initialBufferPointer, unsigned __int64* bufferPointer, unsigned __int64* allWordsPointer, __int32* wordIndexes, unsigned __int64* permutationsPointer, int numberOfWords)
{
fillPhraseSet(initialBufferPointer, bufferPointer, allWordsPointer, wordIndexes, permutationsPointer, numberOfWords);
}

@ -1,18 +0,0 @@
// WhiteRabbit.Unmanaged.h
#pragma once
#include "constants.h"
using namespace System;
namespace WhiteRabbitUnmanagedBridge {
public ref class MD5Unmanaged
{
public:
literal int PhrasesPerSet = PHRASES_PER_SET;
static void ComputeMD5(unsigned int* input, unsigned __int32 * expected);
static void FillPhraseSet(unsigned __int64* initialBufferPointer, unsigned __int64* bufferPointer, unsigned __int64* allWordsPointer, __int32* wordIndexes, unsigned __int64* permutationsPointer, int numberOfWords);
};
}

@ -1,158 +0,0 @@
<?xml version="1.0" encoding="utf-8"?>
<Project DefaultTargets="Build" ToolsVersion="14.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
<ItemGroup Label="ProjectConfigurations">
<ProjectConfiguration Include="Debug|Win32">
<Configuration>Debug</Configuration>
<Platform>Win32</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="Release|Win32">
<Configuration>Release</Configuration>
<Platform>Win32</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="Debug|x64">
<Configuration>Debug</Configuration>
<Platform>x64</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="Release|x64">
<Configuration>Release</Configuration>
<Platform>x64</Platform>
</ProjectConfiguration>
</ItemGroup>
<PropertyGroup Label="Globals">
<ProjectGuid>{039F03A0-7E8F-415D-8180-969D24479B44}</ProjectGuid>
<TargetFrameworkVersion>v4.7</TargetFrameworkVersion>
<Keyword>ManagedCProj</Keyword>
<RootNamespace>WhiteRabbitUnmanagedBridge</RootNamespace>
<WindowsTargetPlatformVersion>10.0.10586.0</WindowsTargetPlatformVersion>
</PropertyGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.Default.props" />
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'" Label="Configuration">
<ConfigurationType>DynamicLibrary</ConfigurationType>
<UseDebugLibraries>true</UseDebugLibraries>
<PlatformToolset>v141</PlatformToolset>
<CLRSupport>true</CLRSupport>
<CharacterSet>Unicode</CharacterSet>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'" Label="Configuration">
<ConfigurationType>DynamicLibrary</ConfigurationType>
<UseDebugLibraries>false</UseDebugLibraries>
<PlatformToolset>v141</PlatformToolset>
<CLRSupport>true</CLRSupport>
<CharacterSet>Unicode</CharacterSet>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'" Label="Configuration">
<ConfigurationType>DynamicLibrary</ConfigurationType>
<UseDebugLibraries>true</UseDebugLibraries>
<PlatformToolset>v141</PlatformToolset>
<CLRSupport>true</CLRSupport>
<CharacterSet>Unicode</CharacterSet>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'" Label="Configuration">
<ConfigurationType>DynamicLibrary</ConfigurationType>
<UseDebugLibraries>false</UseDebugLibraries>
<PlatformToolset>v141</PlatformToolset>
<CLRSupport>true</CLRSupport>
<CharacterSet>Unicode</CharacterSet>
</PropertyGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.props" />
<ImportGroup Label="ExtensionSettings">
</ImportGroup>
<ImportGroup Label="Shared">
</ImportGroup>
<ImportGroup Label="PropertySheets" Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
<Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" />
</ImportGroup>
<ImportGroup Label="PropertySheets" Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
<Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" />
</ImportGroup>
<ImportGroup Label="PropertySheets" Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
<Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" />
</ImportGroup>
<ImportGroup Label="PropertySheets" Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
<Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" />
</ImportGroup>
<PropertyGroup Label="UserMacros" />
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
<LinkIncremental>true</LinkIncremental>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
<LinkIncremental>true</LinkIncremental>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
<LinkIncremental>false</LinkIncremental>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
<LinkIncremental>false</LinkIncremental>
</PropertyGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<Optimization>Disabled</Optimization>
<PreprocessorDefinitions>WIN32;_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<PrecompiledHeader>Use</PrecompiledHeader>
</ClCompile>
<Link>
<AdditionalDependencies />
</Link>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<Optimization>Disabled</Optimization>
<PreprocessorDefinitions>_DEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<PrecompiledHeader>Use</PrecompiledHeader>
</ClCompile>
<Link>
<AdditionalDependencies />
</Link>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<PreprocessorDefinitions>WIN32;NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<PrecompiledHeader>Use</PrecompiledHeader>
</ClCompile>
<Link>
<AdditionalDependencies />
</Link>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<PreprocessorDefinitions>SIMD=true;NDEBUG;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<PrecompiledHeader>Use</PrecompiledHeader>
<Optimization>Full</Optimization>
<InlineFunctionExpansion>AnySuitable</InlineFunctionExpansion>
<IntrinsicFunctions>true</IntrinsicFunctions>
<FavorSizeOrSpeed>Speed</FavorSizeOrSpeed>
<AssemblerOutput>AssemblyAndSourceCode</AssemblerOutput>
<EnableEnhancedInstructionSet>StreamingSIMDExtensions2</EnableEnhancedInstructionSet>
</ClCompile>
<Link>
<AdditionalDependencies />
</Link>
</ItemDefinitionGroup>
<ItemGroup>
<ClInclude Include="constants.h" />
<ClInclude Include="md5.h" />
<ClInclude Include="phraseset.h" />
<ClInclude Include="resource.h" />
<ClInclude Include="Stdafx.h" />
<ClInclude Include="WhiteRabbit.UnmanagedBridge.h" />
</ItemGroup>
<ItemGroup>
<ClCompile Include="AssemblyInfo.cpp" />
<ClCompile Include="md5.cpp" />
<ClCompile Include="phraseset.cpp" />
<ClCompile Include="Stdafx.cpp">
<PrecompiledHeader Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">Create</PrecompiledHeader>
<PrecompiledHeader Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">Create</PrecompiledHeader>
<PrecompiledHeader Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">Create</PrecompiledHeader>
<PrecompiledHeader Condition="'$(Configuration)|$(Platform)'=='Release|x64'">Create</PrecompiledHeader>
</ClCompile>
<ClCompile Include="WhiteRabbit.UnmanagedBridge.cpp" />
</ItemGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.targets" />
<ImportGroup Label="ExtensionTargets">
</ImportGroup>
</Project>

@ -1,50 +0,0 @@
<?xml version="1.0" encoding="utf-8"?>
<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
<ItemGroup>
<Filter Include="Source Files">
<UniqueIdentifier>{4FC737F1-C7A5-4376-A066-2A32D752A2FF}</UniqueIdentifier>
<Extensions>cpp;c;cc;cxx;def;odl;idl;hpj;bat;asm;asmx</Extensions>
</Filter>
<Filter Include="Header Files">
<UniqueIdentifier>{93995380-89BD-4b04-88EB-625FBE52EBFB}</UniqueIdentifier>
<Extensions>h;hh;hpp;hxx;hm;inl;inc;xsd</Extensions>
</Filter>
</ItemGroup>
<ItemGroup>
<ClInclude Include="Stdafx.h">
<Filter>Header Files</Filter>
</ClInclude>
<ClInclude Include="resource.h">
<Filter>Header Files</Filter>
</ClInclude>
<ClInclude Include="WhiteRabbit.UnmanagedBridge.h">
<Filter>Header Files</Filter>
</ClInclude>
<ClInclude Include="md5.h">
<Filter>Header Files</Filter>
</ClInclude>
<ClInclude Include="constants.h">
<Filter>Header Files</Filter>
</ClInclude>
<ClInclude Include="phraseset.h">
<Filter>Header Files</Filter>
</ClInclude>
</ItemGroup>
<ItemGroup>
<ClCompile Include="AssemblyInfo.cpp">
<Filter>Source Files</Filter>
</ClCompile>
<ClCompile Include="Stdafx.cpp">
<Filter>Source Files</Filter>
</ClCompile>
<ClCompile Include="WhiteRabbit.UnmanagedBridge.cpp">
<Filter>Source Files</Filter>
</ClCompile>
<ClCompile Include="md5.cpp">
<Filter>Source Files</Filter>
</ClCompile>
<ClCompile Include="phraseset.cpp">
<Filter>Source Files</Filter>
</ClCompile>
</ItemGroup>
</Project>

@ -1,3 +0,0 @@
#pragma once
#define PHRASES_PER_SET 16

@ -1,236 +0,0 @@
#include "stdafx.h"
#include "md5.h"
#include "intrin.h"
#pragma unmanaged
struct MD5Vector
{
__m256i m_V0;
__m256i m_V1;
__forceinline MD5Vector() {}
__forceinline MD5Vector(__m256i C0, __m256i C1) :m_V0(C0), m_V1(C1) {}
__forceinline MD5Vector MXor(MD5Vector R) const
{
return MD5Vector(_mm256_xor_si256(m_V0, R.m_V0), _mm256_xor_si256(m_V1, R.m_V1));
}
__forceinline MD5Vector MAnd(MD5Vector R) const
{
return MD5Vector(_mm256_and_si256(m_V0, R.m_V0), _mm256_and_si256(m_V1, R.m_V1));
}
__forceinline MD5Vector MAndNot(MD5Vector R) const
{
return MD5Vector(_mm256_andnot_si256(m_V0, R.m_V0), _mm256_andnot_si256(m_V1, R.m_V1));
}
__forceinline const MD5Vector MOr(const MD5Vector R) const
{
return MD5Vector(_mm256_or_si256(m_V0, R.m_V0), _mm256_or_si256(m_V1, R.m_V1));
}
__forceinline const MD5Vector MAdd(const MD5Vector R) const
{
return MD5Vector(_mm256_add_epi32(m_V0, R.m_V0), _mm256_add_epi32(m_V1, R.m_V1));
}
__forceinline const MD5Vector MShiftLeft(const int shift) const
{
return MD5Vector(_mm256_slli_epi32(m_V0, shift), _mm256_slli_epi32(m_V1, shift));
}
__forceinline const MD5Vector MShiftRight(const int shift) const
{
return MD5Vector(_mm256_srli_epi32(m_V0, shift), _mm256_srli_epi32(m_V1, shift));
}
template<int imm8>
__forceinline const MD5Vector Permute() const
{
return MD5Vector(_mm256_permute4x64_epi64(m_V0, imm8), _mm256_permute4x64_epi64(m_V1, imm8));
}
__forceinline const MD5Vector CompareEquality32(const __m256i other) const
{
return MD5Vector(_mm256_cmpeq_epi32(m_V0, other), _mm256_cmpeq_epi32(m_V1, other));
}
__forceinline void WriteMoveMask8(__int32 * output) const
{
output[0] = _mm256_movemask_epi8(m_V0);
output[1] = _mm256_movemask_epi8(m_V1);
}
};
__forceinline const MD5Vector OP_XOR(const MD5Vector a, const MD5Vector b) { return a.MXor(b); }
__forceinline const MD5Vector OP_AND(const MD5Vector a, const MD5Vector b) { return a.MAnd(b); }
__forceinline const MD5Vector OP_ANDNOT(const MD5Vector a, const MD5Vector b) { return a.MAndNot(b); }
__forceinline const MD5Vector OP_OR(const MD5Vector a, const MD5Vector b) { return a.MOr(b); }
__forceinline const MD5Vector OP_ADD(const MD5Vector a, const MD5Vector b) { return a.MAdd(b); }
template<int r>
__forceinline const MD5Vector OP_ROT(const MD5Vector a) { return OP_OR(a.MShiftLeft(r), a.MShiftRight(32 - (r))); }
__forceinline const MD5Vector OP_BLEND(const MD5Vector a, const MD5Vector b, const MD5Vector x) { return OP_OR(OP_AND(x, b), OP_ANDNOT(x, a)); }
__forceinline const MD5Vector CREATE_VECTOR(const int a) { return MD5Vector(_mm256_set1_epi32(a), _mm256_set1_epi32(a)); }
__forceinline const MD5Vector CREATE_VECTOR_FROM_INPUT(const unsigned __int32* input, const size_t offset)
{
return MD5Vector(
_mm256_i32gather_epi32((int*)(input + offset), _mm256_set_epi32(7 * 8, 6 * 8, 5 * 8, 4 * 8, 3 * 8, 2 * 8, 1 * 8, 0 * 8), 4),
_mm256_i32gather_epi32((int*)(input + offset), _mm256_set_epi32(15 * 8, 14 * 8, 13 * 8, 12 * 8, 11 * 8, 10 * 8, 9 * 8, 8 * 8), 4));
}
#define WRITE_TO_OUTPUT(a, output, expected) \
a.Permute<0 * 0x55>().CompareEquality32(*expected).WriteMoveMask8(output); \
a.Permute<1 * 0x55>().CompareEquality32(*expected).WriteMoveMask8(output + 2); \
a.Permute<2 * 0x55>().CompareEquality32(*expected).WriteMoveMask8(output + 4); \
a.Permute<3 * 0x55>().CompareEquality32(*expected).WriteMoveMask8(output + 6); \
output[8] = _mm256_movemask_epi8(_mm256_cmpeq_epi8(*((__m256i*)output), _mm256_setzero_si256()));
__forceinline void WriteToOutput(const MD5Vector a, __int32 * output, __m256i * expected)
{
// Each WriteMoveMask8 call stores two 32-bit masks, so every step advances by 2 ints (mirrors WRITE_TO_OUTPUT above).
a.Permute<0 * 0x55>().CompareEquality32(*expected).WriteMoveMask8(output);
a.Permute<1 * 0x55>().CompareEquality32(*expected).WriteMoveMask8(output + 2);
a.Permute<2 * 0x55>().CompareEquality32(*expected).WriteMoveMask8(output + 4);
a.Permute<3 * 0x55>().CompareEquality32(*expected).WriteMoveMask8(output + 6);
// output[8] summarizes which bytes of the first 8 masks are zero (i.e. no match).
output[8] = _mm256_movemask_epi8(_mm256_cmpeq_epi8(*((__m256i*)output), _mm256_setzero_si256()));
}
const MD5Vector Ones = CREATE_VECTOR(0xffffffff);
__forceinline const MD5Vector OP_NEG(const MD5Vector a) { return OP_ANDNOT(a, Ones); }
__forceinline const MD5Vector Blend(const MD5Vector a, const MD5Vector b, const MD5Vector x) { return OP_BLEND(a, b, x); }
__forceinline const MD5Vector Xor(const MD5Vector a, const MD5Vector b, const MD5Vector c) { return OP_XOR(a, OP_XOR(b, c)); }
__forceinline const MD5Vector I(const MD5Vector a, const MD5Vector b, const MD5Vector c) { return OP_XOR(a, OP_OR(b, OP_NEG(c))); }
template<int r>
__forceinline const MD5Vector StepOuter(const MD5Vector a, const MD5Vector b, const MD5Vector x) { return OP_ADD(b, OP_ROT<r>(x)); }
template<int r, unsigned __int32 k>
__forceinline const MD5Vector Step1(const MD5Vector a, const MD5Vector b, const MD5Vector c, const MD5Vector d, const MD5Vector w) {
return StepOuter<r>(a, b, OP_ADD(Blend(d, c, b), OP_ADD(CREATE_VECTOR(k), OP_ADD(a, w))));
}
template<int r, unsigned __int32 k>
__forceinline const MD5Vector Step1(const MD5Vector a, const MD5Vector b, const MD5Vector c, const MD5Vector d) {
return StepOuter<r>(a, b, OP_ADD(Blend(d, c, b), OP_ADD(CREATE_VECTOR(k), a)));
}
template<int r, unsigned __int32 k>
__forceinline const MD5Vector Step2(const MD5Vector a, const MD5Vector b, const MD5Vector c, const MD5Vector d, const MD5Vector w) {
return StepOuter<r>(a, c, OP_ADD(Blend(d, c, b), OP_ADD(CREATE_VECTOR(k), OP_ADD(a, w))));
}
template<int r, unsigned __int32 k>
__forceinline const MD5Vector Step2(const MD5Vector a, const MD5Vector b, const MD5Vector c, const MD5Vector d) {
return StepOuter<r>(a, c, OP_ADD(Blend(d, c, b), OP_ADD(CREATE_VECTOR(k), a)));
}
template<int r, unsigned __int32 k>
__forceinline const MD5Vector Step3(const MD5Vector a, const MD5Vector b, const MD5Vector c, const MD5Vector d, const MD5Vector w) {
return StepOuter<r>(a, b, OP_ADD(Xor(b, c, d), OP_ADD(CREATE_VECTOR(k), OP_ADD(a, w))));
}
template<int r, unsigned __int32 k>
__forceinline const MD5Vector Step3(const MD5Vector a, const MD5Vector b, const MD5Vector c, const MD5Vector d) {
return StepOuter<r>(a, b, OP_ADD(Xor(b, c, d), OP_ADD(CREATE_VECTOR(k), a)));
}
template<int r, unsigned __int32 k>
__forceinline const MD5Vector Step4(const MD5Vector a, const MD5Vector b, const MD5Vector c, const MD5Vector d, const MD5Vector w) {
return StepOuter<r>(a, b, OP_ADD(I(c, b, d), OP_ADD(CREATE_VECTOR(k), OP_ADD(a, w))));
}
template<int r, unsigned __int32 k>
__forceinline const MD5Vector Step4(const MD5Vector a, const MD5Vector b, const MD5Vector c, const MD5Vector d) {
return StepOuter<r>(a, b, OP_ADD(I(c, b, d), OP_ADD(CREATE_VECTOR(k), a)));
}
void md5(unsigned __int32 * input, unsigned __int32 * expected)
{
MD5Vector a = CREATE_VECTOR(0x67452301);
MD5Vector b = CREATE_VECTOR(0xefcdab89);
MD5Vector c = CREATE_VECTOR(0x98badcfe);
MD5Vector d = CREATE_VECTOR(0x10325476);
MD5Vector inputVector0 = CREATE_VECTOR_FROM_INPUT(input, 0);
MD5Vector inputVector1 = CREATE_VECTOR_FROM_INPUT(input, 1);
MD5Vector inputVector2 = CREATE_VECTOR_FROM_INPUT(input, 2);
MD5Vector inputVector3 = CREATE_VECTOR_FROM_INPUT(input, 3);
MD5Vector inputVector4 = CREATE_VECTOR_FROM_INPUT(input, 4);
MD5Vector inputVector5 = CREATE_VECTOR_FROM_INPUT(input, 5);
MD5Vector inputVector6 = CREATE_VECTOR_FROM_INPUT(input, 6);
MD5Vector inputVector7 = CREATE_VECTOR_FROM_INPUT(input, 7);
a = Step1< 7, 0xd76aa478>(a, b, c, d, inputVector0);
d = Step1<12, 0xe8c7b756>(d, a, b, c, inputVector1);
c = Step1<17, 0x242070db>(c, d, a, b, inputVector2);
b = Step1<22, 0xc1bdceee>(b, c, d, a, inputVector3);
a = Step1< 7, 0xf57c0faf>(a, b, c, d, inputVector4);
d = Step1<12, 0x4787c62a>(d, a, b, c, inputVector5);
c = Step1<17, 0xa8304613>(c, d, a, b, inputVector6);
b = Step1<22, 0xfd469501>(b, c, d, a);
a = Step1< 7, 0x698098d8>(a, b, c, d);
d = Step1<12, 0x8b44f7af>(d, a, b, c);
c = Step1<17, 0xffff5bb1>(c, d, a, b);
b = Step1<22, 0x895cd7be>(b, c, d, a);
a = Step1< 7, 0x6b901122>(a, b, c, d);
d = Step1<12, 0xfd987193>(d, a, b, c);
c = Step1<17, 0xa679438e>(c, d, a, b, inputVector7);
b = Step1<22, 0x49b40821>(b, c, d, a);
a = Step2< 5, 0xf61e2562>(a, d, b, c, inputVector1);
d = Step2< 9, 0xc040b340>(d, c, a, b, inputVector6);
c = Step2<14, 0x265e5a51>(c, b, d, a);
b = Step2<20, 0xe9b6c7aa>(b, a, c, d, inputVector0);
a = Step2< 5, 0xd62f105d>(a, d, b, c, inputVector5);
d = Step2< 9, 0x02441453>(d, c, a, b);
c = Step2<14, 0xd8a1e681>(c, b, d, a);
b = Step2<20, 0xe7d3fbc8>(b, a, c, d, inputVector4);
a = Step2< 5, 0x21e1cde6>(a, d, b, c);
d = Step2< 9, 0xc33707d6>(d, c, a, b, inputVector7);
c = Step2<14, 0xf4d50d87>(c, b, d, a, inputVector3);
b = Step2<20, 0x455a14ed>(b, a, c, d);
a = Step2< 5, 0xa9e3e905>(a, d, b, c);
d = Step2< 9, 0xfcefa3f8>(d, c, a, b, inputVector2);
c = Step2<14, 0x676f02d9>(c, b, d, a);
b = Step2<20, 0x8d2a4c8a>(b, a, c, d);
a = Step3< 4, 0xfffa3942>(a, b, c, d, inputVector5);
d = Step3<11, 0x8771f681>(d, a, b, c);
c = Step3<16, 0x6d9d6122>(c, d, a, b);
b = Step3<23, 0xfde5380c>(b, c, d, a, inputVector7);
a = Step3< 4, 0xa4beea44>(a, b, c, d, inputVector1);
d = Step3<11, 0x4bdecfa9>(d, a, b, c, inputVector4);
c = Step3<16, 0xf6bb4b60>(c, d, a, b);
b = Step3<23, 0xbebfbc70>(b, c, d, a);
a = Step3< 4, 0x289b7ec6>(a, b, c, d);
d = Step3<11, 0xeaa127fa>(d, a, b, c, inputVector0);
c = Step3<16, 0xd4ef3085>(c, d, a, b, inputVector3);
b = Step3<23, 0x04881d05>(b, c, d, a, inputVector6);
a = Step3< 4, 0xd9d4d039>(a, b, c, d);
d = Step3<11, 0xe6db99e5>(d, a, b, c);
c = Step3<16, 0x1fa27cf8>(c, d, a, b);
b = Step3<23, 0xc4ac5665>(b, c, d, a, inputVector2);
a = Step4< 6, 0xf4292244>(a, b, c, d, inputVector0);
d = Step4<10, 0x432aff97>(d, a, b, c);
c = Step4<15, 0xab9423a7>(c, d, a, b, inputVector7);
b = Step4<21, 0xfc93a039>(b, c, d, a, inputVector5);
a = Step4< 6, 0x655b59c3>(a, b, c, d);
d = Step4<10, 0x8f0ccc92>(d, a, b, c, inputVector3);
c = Step4<15, 0xffeff47d>(c, d, a, b);
b = Step4<21, 0x85845dd1>(b, c, d, a, inputVector1);
a = Step4< 6, 0x6fa87e4f>(a, b, c, d);
d = Step4<10, 0xfe2ce6e0>(d, a, b, c);
c = Step4<15, 0xa3014314>(c, d, a, b, inputVector6);
b = Step4<21, 0x4e0811a1>(b, c, d, a);
a = Step4< 6, 0xf7537e82>(a, b, c, d, inputVector4);
a = OP_ADD(CREATE_VECTOR(0x67452301), a);
WRITE_TO_OUTPUT(a, ((__int32*)input), ((__m256i*)expected));
}
#pragma managed
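
The vectorized `md5` above stops after the 61st step and adds the initial constant to `a` only. The remaining three steps of the standard algorithm update `d`, `c` and `b`, so the first 32-bit word of the digest is already final at that point. A minimal scalar sketch of that observation follows. The helper name `md5_first_word` is hypothetical and not part of the repository, and the step constants are derived from the sine function as RFC 1321 defines them rather than copied from the code above.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>

// Rotate a 32-bit value left by s bits (0 < s < 32).
static inline uint32_t rotl32(uint32_t x, int s) {
    return (x << s) | (x >> (32 - s));
}

// Hypothetical helper (not from the repository): first 32-bit word of the MD5
// digest of a message shorter than 56 bytes, i.e. one that fits a single block.
// steps = 64 runs the textbook algorithm, steps = 61 stops early.
uint32_t md5_first_word(const std::string& msg, int steps = 61) {
    if (msg.size() > 55) throw std::invalid_argument("single-block messages only");
    if (steps < 61 || steps > 64) throw std::invalid_argument("steps must be 61..64");

    static const int shifts[16] = {7, 12, 17, 22, 5, 9, 14, 20,
                                   4, 11, 16, 23, 6, 10, 15, 21};

    // Standard padding: 0x80 marker, zeros, 64-bit little-endian bit length.
    unsigned char block[64] = {0};
    std::memcpy(block, msg.data(), msg.size());
    block[msg.size()] = 0x80;
    const uint64_t bits = static_cast<uint64_t>(msg.size()) * 8;
    for (int i = 0; i < 8; i++) {
        block[56 + i] = static_cast<unsigned char>(bits >> (8 * i));
    }

    // Sixteen little-endian message words.
    uint32_t w[16];
    for (int i = 0; i < 16; i++) {
        w[i] = static_cast<uint32_t>(block[4 * i])
             | static_cast<uint32_t>(block[4 * i + 1]) << 8
             | static_cast<uint32_t>(block[4 * i + 2]) << 16
             | static_cast<uint32_t>(block[4 * i + 3]) << 24;
    }

    // v[0..3] hold a, b, c, d in fixed positions; step i updates a, d, c, b in turn.
    uint32_t v[4] = {0x67452301u, 0xefcdab89u, 0x98badcfeu, 0x10325476u};
    for (int i = 0; i < steps; i++) {
        const int t = (4 - i % 4) % 4;
        const uint32_t a = v[t], b = v[(t + 1) % 4], c = v[(t + 2) % 4], d = v[(t + 3) % 4];
        uint32_t f;
        int g;
        if (i < 16)      { f = (b & c) | (~b & d); g = i; }
        else if (i < 32) { f = (b & d) | (c & ~d); g = (5 * i + 1) % 16; }
        else if (i < 48) { f = b ^ c ^ d;          g = (3 * i + 5) % 16; }
        else             { f = c ^ (b | ~d);       g = (7 * i) % 16; }
        // Step constant as defined in RFC 1321: floor(2^32 * |sin(i + 1)|).
        const uint32_t k = static_cast<uint32_t>(std::fabs(std::sin(i + 1.0)) * 4294967296.0);
        v[t] = b + rotl32(a + f + k + w[g], shifts[4 * (i / 16) + i % 4]);
    }
    return 0x67452301u + v[0];
}
```

Comparing only this word against the first word of each expected hash is enough to reject almost every candidate; a full digest is then needed only for the rare matches.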

@ -1,3 +0,0 @@
#pragma once
void md5(unsigned int* input, unsigned __int32 * expected);

@ -1,86 +0,0 @@
#include "stdafx.h"
#include "phraseset.h"
#include "constants.h"
#include "intrin.h"
#pragma unmanaged
template<int numberOfWords>
class Processor
{
public:
template<int wordNumber>
static __forceinline const __m256i ProcessWord(const __m256i phrase, const unsigned __int64 cumulativeWordOffset, const unsigned __int64 permutation, unsigned __int64* allWordsPointer, __int32* wordIndexes)
{
auto currentWord = allWordsPointer + wordIndexes[_bextr_u64(permutation, 4 * wordNumber, 4)] * 128;
return ProcessWord<wordNumber + 1>(
_mm256_xor_si256(phrase, *(__m256i*)(currentWord + cumulativeWordOffset)),
cumulativeWordOffset + currentWord[127],
permutation,
allWordsPointer,
wordIndexes);
}
template<>
static __forceinline const __m256i ProcessWord<numberOfWords>(const __m256i phrase, const unsigned __int64 cumulativeWordOffset, const unsigned __int64 permutation, unsigned __int64* allWordsPointer, __int32* wordIndexes)
{
return phrase;
}
template<int phraseNumber>
static __forceinline void ProcessWordsForPhrase(__m256i* avx2initialBuffer, __m256i* avx2buffer, unsigned __int64* allWordsPointer, __int32* wordIndexes, unsigned __int64* permutationsPointer)
{
avx2buffer[phraseNumber] = ProcessWord<0>(*avx2initialBuffer, 0, permutationsPointer[phraseNumber], allWordsPointer, wordIndexes);
ProcessWordsForPhrase<phraseNumber + 1>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
}
template<>
static __forceinline void ProcessWordsForPhrase<PHRASES_PER_SET>(__m256i* avx2initialBuffer, __m256i* avx2buffer, unsigned __int64* allWordsPointer, __int32* wordIndexes, unsigned __int64* permutationsPointer)
{
return;
}
};
void fillPhraseSet(unsigned __int64* initialBufferPointer, unsigned __int64* bufferPointer, unsigned __int64* allWordsPointer, __int32* wordIndexes, unsigned __int64* permutationsPointer, int numberOfWords)
{
auto avx2initialBuffer = (__m256i*)initialBufferPointer;
auto avx2buffer = (__m256i*)bufferPointer;
switch (numberOfWords)
{
case 1:
Processor<1>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
break;
case 2:
Processor<2>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
break;
case 3:
Processor<3>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
break;
case 4:
Processor<4>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
break;
case 5:
Processor<5>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
break;
case 6:
Processor<6>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
break;
case 7:
Processor<7>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
break;
case 8:
Processor<8>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
break;
case 9:
Processor<9>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
break;
case 10:
Processor<10>::ProcessWordsForPhrase<0>(avx2initialBuffer, avx2buffer, allWordsPointer, wordIndexes, permutationsPointer);
break;
}
}
#pragma managed
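
`ProcessWord` reads the word index for each position from a 64-bit permutation value, four bits per position, using `_bextr_u64(permutation, 4 * wordNumber, 4)`. The following portable sketch shows the same packing with plain shifts and masks instead of the BMI intrinsic. The function names are hypothetical, and the layout (position `i` stored in bits `4*i .. 4*i+3`) is an assumption read off the extraction call above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: packs up to 16 indexes (each 0..15) into one 64-bit
// value, with the index for position i stored in bits 4*i .. 4*i+3.
uint64_t pack_permutation(const std::vector<int>& permutation) {
    uint64_t packed = 0;
    for (std::size_t i = 0; i < permutation.size() && i < 16; i++) {
        packed |= static_cast<uint64_t>(permutation[i] & 0xF) << (4 * i);
    }
    return packed;
}

// Portable equivalent of _bextr_u64(packed, 4 * position, 4).
int index_at(uint64_t packed, int position) {
    return static_cast<int>((packed >> (4 * position)) & 0xF);
}
```

For example, the permutation `{2, 0, 1}` packs to `0x102`, and `index_at(0x102, 0)` yields `2`.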

@ -1,3 +0,0 @@
#pragma once
void fillPhraseSet(unsigned __int64* initialBufferPointer, unsigned __int64* bufferPointer, unsigned __int64* allWordsPointer, __int32* wordIndexes, unsigned __int64* permutationsPointer, int numberOfWords);

@ -1,3 +0,0 @@
//{{NO_DEPENDENCIES}}
// Microsoft Visual C++ generated include file.
// Used by app.rc

@ -1,11 +1,11 @@
<?xml version="1.0" encoding="utf-8"?>
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
<startup>
<supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.7"/>
<supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.6" />
</startup>
<appSettings>
<add key="SourcePhrase" value="poultry outwits ants"/>
<add key="MaxWordsInPhrase" value="5"/>
<add key="SourcePhrase" value="poultry outwits ants" />
<add key="MaxWordsInPhrase" value="5" />
<add key="ExpectedHashes" value="e4820b45d2277f3844eac66c903e84be,23170acc097c24edb98fc5488ab033fe,665e5bcb0c20062fe8abaaf4628bb154,e8a2cbb6206fc937082bb92e4ed9cd3d,74a613b8c64fb216dc22d4f2bd4965f4,ccb5ed231ba04d750c963668391d1e61,d864ae0e66c89cb78345967cb2f3ab6b,2b56477105d91076030e877c94dd9776,732442feac8b5013e16a776486ac5447"/>
</appSettings>
</configuration>

@ -1,11 +0,0 @@
namespace WhiteRabbit
{
internal class Constants
{
public const int PhrasesPerSet = WhiteRabbitUnmanagedBridge.MD5Unmanaged.PhrasesPerSet;
public const int MaxNumberOfWords = 8;
public const int NumberOfThreads = 4;
}
}

@ -1,6 +1,5 @@
namespace WhiteRabbit
{
using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Linq;
@ -10,164 +9,157 @@
/// </summary>
internal static class Flattener
{
// Slow universal implementation
private static IEnumerable<ImmutableStack<T>> FlattenAny<T>(ImmutableStack<T[]> phrase)
{
if (phrase.IsEmpty)
{
return new[] { ImmutableStack.Create<T>() };
}
T[] wordVariants;
var newStack = phrase.Pop(out wordVariants);
return FlattenAny(newStack).SelectMany(remainder => wordVariants.Select(word => remainder.Push(word)));
}
// Fast hard-coded implementation for 3 words
private static IEnumerable<T[]> Flatten3<T>(T[][] phrase)
{
foreach (var item0 in phrase[0])
foreach (var item1 in phrase[1])
foreach (var item2 in phrase[2])
yield return new T[]
{
foreach (var item1 in phrase[1])
{
item0,
item1,
item2,
};
foreach (var item2 in phrase[2])
{
yield return new T[]
{
item0,
item1,
item2,
};
}
}
}
}
// Fast hard-coded implementation for 4 words
private static IEnumerable<T[]> Flatten4<T>(T[][] phrase)
{
foreach (var item0 in phrase[0])
foreach (var item1 in phrase[1])
foreach (var item2 in phrase[2])
foreach (var item3 in phrase[3])
yield return new T[]
{
foreach (var item1 in phrase[1])
{
item0,
item1,
item2,
item3,
};
foreach (var item2 in phrase[2])
{
foreach (var item3 in phrase[3])
{
yield return new T[]
{
item0,
item1,
item2,
item3,
};
}
}
}
}
}
// Fast hard-coded implementation for 5 words
private static IEnumerable<T[]> Flatten5<T>(T[][] phrase)
{
foreach (var item0 in phrase[0])
foreach (var item1 in phrase[1])
foreach (var item2 in phrase[2])
foreach (var item3 in phrase[3])
foreach (var item4 in phrase[4])
yield return new T[]
{
foreach (var item1 in phrase[1])
{
item0,
item1,
item2,
item3,
item4,
};
foreach (var item2 in phrase[2])
{
foreach (var item3 in phrase[3])
{
foreach (var item4 in phrase[4])
{
yield return new T[]
{
item0,
item1,
item2,
item3,
item4,
};
}
}
}
}
}
}
// Fast hard-coded implementation for 6 words
private static IEnumerable<T[]> Flatten6<T>(T[][] phrase)
{
foreach (var item0 in phrase[0])
foreach (var item1 in phrase[1])
foreach (var item2 in phrase[2])
foreach (var item3 in phrase[3])
foreach (var item4 in phrase[4])
foreach (var item5 in phrase[5])
yield return new T[]
{
foreach (var item1 in phrase[1])
{
item0,
item1,
item2,
item3,
item4,
item5,
};
foreach (var item2 in phrase[2])
{
foreach (var item3 in phrase[3])
{
foreach (var item4 in phrase[4])
{
foreach (var item5 in phrase[5])
{
yield return new T[]
{
item0,
item1,
item2,
item3,
item4,
item5,
};
}
}
}
}
}
}
}
// Fast hard-coded implementation for 7 words
private static IEnumerable<T[]> Flatten7<T>(T[][] phrase)
{
foreach (var item0 in phrase[0])
foreach (var item1 in phrase[1])
foreach (var item2 in phrase[2])
foreach (var item3 in phrase[3])
foreach (var item4 in phrase[4])
foreach (var item5 in phrase[5])
foreach (var item6 in phrase[6])
yield return new T[]
{
item0,
item1,
item2,
item3,
item4,
item5,
item6,
};
}
private static IEnumerable<T[]> Flatten8<T>(T[][] phrase)
{
foreach (var item0 in phrase[0])
foreach (var item1 in phrase[1])
foreach (var item2 in phrase[2])
foreach (var item3 in phrase[3])
foreach (var item4 in phrase[4])
foreach (var item5 in phrase[5])
foreach (var item6 in phrase[6])
foreach (var item7 in phrase[7])
yield return new T[]
{
item0,
item1,
item2,
item3,
item4,
item5,
item6,
item7,
};
}
private static IEnumerable<T[]> Flatten9<T>(T[][] phrase)
{
foreach (var item0 in phrase[0])
foreach (var item1 in phrase[1])
foreach (var item2 in phrase[2])
foreach (var item3 in phrase[3])
foreach (var item4 in phrase[4])
foreach (var item5 in phrase[5])
foreach (var item6 in phrase[6])
foreach (var item7 in phrase[7])
foreach (var item8 in phrase[8])
yield return new T[]
{
item0,
item1,
item2,
item3,
item4,
item5,
item6,
item7,
item8,
};
}
private static IEnumerable<T[]> Flatten10<T>(T[][] phrase)
{
foreach (var item0 in phrase[0])
foreach (var item1 in phrase[1])
foreach (var item2 in phrase[2])
foreach (var item3 in phrase[3])
foreach (var item4 in phrase[4])
foreach (var item5 in phrase[5])
foreach (var item6 in phrase[6])
foreach (var item7 in phrase[7])
foreach (var item8 in phrase[8])
foreach (var item9 in phrase[9])
yield return new T[]
{
foreach (var item1 in phrase[1])
{
item0,
item1,
item2,
item3,
item4,
item5,
item6,
item7,
item8,
item9,
};
foreach (var item2 in phrase[2])
{
foreach (var item3 in phrase[3])
{
foreach (var item4 in phrase[4])
{
foreach (var item5 in phrase[5])
{
foreach (var item6 in phrase[6])
{
yield return new T[]
{
item0,
item1,
item2,
item3,
item4,
item5,
item6,
};
}
}
}
}
}
}
}
}
public static IEnumerable<T[]> Flatten<T>(T[][] wordVariants)
@ -184,14 +176,8 @@
return Flatten6(wordVariants);
case 7:
return Flatten7(wordVariants);
case 8:
return Flatten8(wordVariants);
case 9:
return Flatten9(wordVariants);
case 10:
return Flatten10(wordVariants);
default:
throw new ArgumentOutOfRangeException(nameof(wordVariants));
return FlattenAny(ImmutableStack.Create(wordVariants)).Select(words => words.ToArray());
}
}
}
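
The `FlattenN` methods above all compute the same thing: the Cartesian product of the per-position word variants, with the last position varying fastest. As an illustration only (this is not the repository's code, and `flatten` is a hypothetical name), the same enumeration can be sketched for any number of positions with an odometer-style index vector:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch: returns every combination that picks one element from
// each inner vector, in the order produced by nested loops (last index fastest).
template <typename T>
std::vector<std::vector<T>> flatten(const std::vector<std::vector<T>>& variants) {
    std::vector<std::vector<T>> result;
    for (const auto& options : variants) {
        if (options.empty()) return result;  // no combinations exist
    }

    std::vector<std::size_t> index(variants.size(), 0);
    while (true) {
        // Emit the combination selected by the current indexes.
        std::vector<T> combination;
        combination.reserve(variants.size());
        for (std::size_t i = 0; i < variants.size(); i++) {
            combination.push_back(variants[i][index[i]]);
        }
        result.push_back(combination);

        // Advance like an odometer: bump the last index, carry to the left on wrap.
        std::size_t pos = variants.size();
        while (pos > 0 && ++index[pos - 1] == variants[pos - 1].size()) {
            index[pos - 1] = 0;
            --pos;
        }
        if (pos == 0) return result;  // every index wrapped, so enumeration is done
    }
}
```

The hard-coded nested loops in the source presumably trade this generality for speed, as their comments suggest; the sketch is only meant to make the produced order explicit.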

@ -0,0 +1,123 @@
using System.Runtime.CompilerServices;
namespace WhiteRabbit
{
/**
* Code taken from BouncyCastle and optimized for specific constraints (e.g. the input is always larger than 4 bytes and smaller than 52 bytes).
* Further optimization: the input can be assumed to be smaller than 27 bytes (the original phrase contains 18 letters, which allows anagrams of up to 9 words).
* Base implementation of an MD4-family-style digest as outlined in
* "Handbook of Applied Cryptography", pages 344 - 347.
* Implementation of MD5 as outlined in "Handbook of Applied Cryptography", pages 346 - 347.
*/
internal static class MD5Digest
{
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static unsafe uint[] Compute(Phrase input)
{
uint a = 0x67452301;
uint b = 0xefcdab89;
uint c = 0x98badcfe;
uint d = 0x10325476;
a = b + LeftRotate(0xd76aa478 + a + Blend(d, c, b) + input.Buffer[0], 7);
d = a + LeftRotate(0xe8c7b756 + d + Blend(c, b, a) + input.Buffer[1], 12);
c = d + LeftRotate(0x242070db + c + Blend(b, a, d) + input.Buffer[2], 17);
b = c + LeftRotate(0xc1bdceee + b + Blend(a, d, c) + input.Buffer[3], 22);
a = b + LeftRotate(0xf57c0faf + a + Blend(d, c, b) + input.Buffer[4], 7);
d = a + LeftRotate(0x4787c62a + d + Blend(c, b, a) + input.Buffer[5], 12);
c = d + LeftRotate(0xa8304613 + c + Blend(b, a, d) + input.Buffer[6], 17);
b = c + LeftRotate(0xfd469501 + b + Blend(a, d, c), 22);
a = b + LeftRotate(0x698098d8 + a + Blend(d, c, b), 7);
d = a + LeftRotate(0x8b44f7af + d + Blend(c, b, a), 12);
c = d + LeftRotate(0xffff5bb1 + c + Blend(b, a, d), 17);
b = c + LeftRotate(0x895cd7be + b + Blend(a, d, c), 22);
a = b + LeftRotate(0x6b901122 + a + Blend(d, c, b), 7);
d = a + LeftRotate(0xfd987193 + d + Blend(c, b, a), 12);
c = d + LeftRotate(0xa679438e + c + Blend(b, a, d) + input.Buffer[7], 17);
b = c + LeftRotate(0x49b40821 + b + Blend(a, d, c), 22);
a = b + LeftRotate(0xf61e2562 + a + Blend(c, b, d) + input.Buffer[1], 5);
d = a + LeftRotate(0xc040b340 + d + Blend(b, a, c) + input.Buffer[6], 9);
c = d + LeftRotate(0x265e5a51 + c + Blend(a, d, b), 14);
b = c + LeftRotate(0xe9b6c7aa + b + Blend(d, c, a) + input.Buffer[0], 20);
a = b + LeftRotate(0xd62f105d + a + Blend(c, b, d) + input.Buffer[5], 5);
d = a + LeftRotate(0x02441453 + d + Blend(b, a, c), 9);
c = d + LeftRotate(0xd8a1e681 + c + Blend(a, d, b), 14);
b = c + LeftRotate(0xe7d3fbc8 + b + Blend(d, c, a) + input.Buffer[4], 20);
a = b + LeftRotate(0x21e1cde6 + a + Blend(c, b, d), 5);
d = a + LeftRotate(0xc33707d6 + d + Blend(b, a, c) + input.Buffer[7], 9);
c = d + LeftRotate(0xf4d50d87 + c + Blend(a, d, b) + input.Buffer[3], 14);
b = c + LeftRotate(0x455a14ed + b + Blend(d, c, a), 20);
a = b + LeftRotate(0xa9e3e905 + a + Blend(c, b, d), 5);
d = a + LeftRotate(0xfcefa3f8 + d + Blend(b, a, c) + input.Buffer[2], 9);
c = d + LeftRotate(0x676f02d9 + c + Blend(a, d, b), 14);
b = c + LeftRotate(0x8d2a4c8a + b + Blend(d, c, a), 20);
a = b + LeftRotate(0xfffa3942 + a + Xor(b, c, d) + input.Buffer[5], 4);
d = a + LeftRotate(0x8771f681 + d + Xor(a, b, c), 11);
c = d + LeftRotate(0x6d9d6122 + c + Xor(d, a, b), 16);
b = c + LeftRotate(0xfde5380c + b + Xor(c, d, a) + input.Buffer[7], 23);
a = b + LeftRotate(0xa4beea44 + a + Xor(b, c, d) + input.Buffer[1], 4);
d = a + LeftRotate(0x4bdecfa9 + d + Xor(a, b, c) + input.Buffer[4], 11);
c = d + LeftRotate(0xf6bb4b60 + c + Xor(d, a, b), 16);
b = c + LeftRotate(0xbebfbc70 + b + Xor(c, d, a), 23);
a = b + LeftRotate(0x289b7ec6 + a + Xor(b, c, d), 4);
d = a + LeftRotate(0xeaa127fa + d + Xor(a, b, c) + input.Buffer[0], 11);
c = d + LeftRotate(0xd4ef3085 + c + Xor(d, a, b) + input.Buffer[3], 16);
b = c + LeftRotate(0x04881d05 + b + Xor(c, d, a) + input.Buffer[6], 23);
a = b + LeftRotate(0xd9d4d039 + a + Xor(b, c, d), 4);
d = a + LeftRotate(0xe6db99e5 + d + Xor(a, b, c), 11);
c = d + LeftRotate(0x1fa27cf8 + c + Xor(d, a, b), 16);
b = c + LeftRotate(0xc4ac5665 + b + Xor(c, d, a) + input.Buffer[2], 23);
a = b + LeftRotate(0xf4292244 + a + I(c, b, d) + input.Buffer[0], 6);
d = a + LeftRotate(0x432aff97 + d + I(b, a, c), 10);
c = d + LeftRotate(0xab9423a7 + c + I(a, d, b) + input.Buffer[7], 15);
b = c + LeftRotate(0xfc93a039 + b + I(d, c, a) + input.Buffer[5], 21);
a = b + LeftRotate(0x655b59c3 + a + I(c, b, d), 6);
d = a + LeftRotate(0x8f0ccc92 + d + I(b, a, c) + input.Buffer[3], 10);
c = d + LeftRotate(0xffeff47d + c + I(a, d, b), 15);
b = c + LeftRotate(0x85845dd1 + b + I(d, c, a) + input.Buffer[1], 21);
a = b + LeftRotate(0x6fa87e4f + a + I(c, b, d), 6);
d = a + LeftRotate(0xfe2ce6e0 + d + I(b, a, c), 10);
c = d + LeftRotate(0xa3014314 + c + I(a, d, b) + input.Buffer[6], 15);
b = c + LeftRotate(0x4e0811a1 + b + I(d, c, a), 21);
a = b + LeftRotate(0xf7537e82 + a + I(c, b, d) + input.Buffer[4], 6);
d = a + LeftRotate(0xbd3af235 + d + I(b, a, c), 10);
c = d + LeftRotate(0x2ad7d2bb + c + I(a, d, b) + input.Buffer[2], 15);
b = c + LeftRotate(0xeb86d391 + b + I(d, c, a), 21);
return new[]
{
0x67452301 + a,
0xefcdab89 + b,
0x98badcfe + c,
0x10325476 + d,
};
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static uint Blend(uint a, uint b, uint x)
{
return (x & b) | (~x & a);
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static uint Xor(uint a, uint b, uint c)
{
return a ^ b ^ c;
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static uint I(uint a, uint b, uint c)
{
return a ^ (b | ~c);
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static uint LeftRotate(uint x, int left)
{
return (x << left) | (x >> (32 - left));
}
}
}

@ -34,7 +34,9 @@
public static Permutation Empty { get; } = new Permutation(new int[] { });
public int[] PermutationData { get; }
private int[] PermutationData { get; }
public int this[int index] => this.PermutationData[index];
public static IEnumerable<Permutation> HamiltonianPermutationsIterator(int n)
{

@ -0,0 +1,52 @@
namespace WhiteRabbit
{
// Anagram representation optimized for MD5
internal unsafe struct Phrase
{
public fixed uint Buffer[8];
public Phrase(byte[][] words, PermutationsGenerator.Permutation permutation, int numberOfCharacters)
{
fixed (uint* bufferPointer = this.Buffer)
{
var length = numberOfCharacters + words.Length - 1;
byte[] currentWord = words[permutation[0]];
var j = 0;
var wordIndex = 0;
var currentPointer = (byte*)bufferPointer;
byte* lastPointer = currentPointer + length;
for (; currentPointer < lastPointer; currentPointer++)
{
if (j >= currentWord.Length)
{
j = 0;
wordIndex++;
currentWord = words[permutation[wordIndex]];
}
*currentPointer = currentWord[j];
j++;
}
*currentPointer = 128;
bufferPointer[7] = (uint)(length << 3);
}
}
public byte[] GetBytes()
{
fixed(uint* bufferPointer = this.Buffer)
{
var length = bufferPointer[7] >> 3;
var result = new byte[length];
for (var i = 0; i < length; i++)
{
result[i] = ((byte*)bufferPointer)[i];
}
return result;
}
}
}
}

@ -1,126 +0,0 @@
namespace WhiteRabbit
{
using System;
using System.Diagnostics;
using System.Linq;
using System.Numerics;
using System.Runtime.CompilerServices;
using WhiteRabbitUnmanagedBridge;
// Anagram representation optimized for MD5
internal struct PhraseSet
{
private uint[] Buffer;
public void Init()
{
this.Buffer = new uint[8 * Constants.PhrasesPerSet];
}
public unsafe void FillLength(int numberOfCharacters, int numberOfWords)
{
fixed (uint* bufferPointer = this.Buffer)
{
var length = (uint)(numberOfCharacters + numberOfWords - 1);
var lengthInBits = (uint)(length << 3);
for (var i = 0; i < Constants.PhrasesPerSet; i++)
{
bufferPointer[7 + i * 8] = lengthInBits;
((byte*)bufferPointer)[length + i * 32] = 128 ^ ' ';
}
}
}
public unsafe void ProcessPermutations(PhraseSet initialPhraseSet, Word[] allWords, int[] wordIndexes, ulong[] permutations, uint[] expectedHashesVector, Action<byte[], uint> action)
{
fixed (uint* bufferPointer = this.Buffer, initialBufferPointer = initialPhraseSet.Buffer)
{
fixed (ulong* permutationsPointer = permutations)
{
fixed (int* wordIndexesPointer = wordIndexes)
{
fixed (Word* allWordsPointer = allWords)
{
fixed (uint* expectedHashesPointer = expectedHashesVector)
{
for (var i = 0; i < permutations.Length; i += Constants.PhrasesPerSet)
{
MD5Unmanaged.FillPhraseSet(
(ulong*)initialBufferPointer,
(ulong*)bufferPointer,
(ulong*)allWordsPointer,
wordIndexesPointer,
permutationsPointer + i,
wordIndexes.Length);
MD5Unmanaged.ComputeMD5(bufferPointer, expectedHashesPointer);
if (bufferPointer[Constants.PhrasesPerSet / 2] != 0xFFFFFFFF)
{
for (var j = 0; j < Constants.PhrasesPerSet; j++)
{
// 16 matches are packed in 8 32-bit numbers: [0,1], [8,9], [2,3], [10,11], [4, 5], [12, 13], [6, 7], [14, 15]
var position = ((j / 2) % 4) * 2 + (j / 8);
var match = (bufferPointer[position] >> (4 * (j % 2))) & 0xF0F0F0F;
if (match != 0)
{
var bufferInfo = ((ulong)bufferPointer[Constants.PhrasesPerSet] << 32) | bufferPointer[j];
MD5Unmanaged.FillPhraseSet(
(ulong*)initialBufferPointer,
(ulong*)bufferPointer,
(ulong*)allWordsPointer,
wordIndexesPointer,
permutationsPointer + i,
wordIndexes.Length);
action(this.GetBytes(j), match);
break;
}
}
}
}
}
}
}
}
}
}
public unsafe byte[] GetBytes(int number)
{
Debug.Assert(number < Constants.PhrasesPerSet);
fixed (uint* bufferPointer = this.Buffer)
{
var phrasePointer = bufferPointer + 8 * number;
var length = 0;
for (var i = 27; i >= 0; i--)
{
if (((byte*)phrasePointer)[i] == 128)
{
length = i;
break;
}
}
var result = new byte[length];
for (var i = 0; i < length; i++)
{
result[i] = ((byte*)phrasePointer)[i];
}
return result;
}
}
public unsafe string DebugBytes(int number)
{
Debug.Assert(number < Constants.PhrasesPerSet);
fixed (uint* bufferPointer = this.Buffer)
{
var bytes = (byte*)bufferPointer;
return string.Concat(Enumerable.Range(32 * number, 32).Select(i => bytes[i].ToString("X2")));
}
}
}
}
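
`FillLength` stores `128 ^ ' '` at the position just past the phrase, and `ProcessWord` XORs each word into the buffer at its cumulative offset. One reading of this, stated here as an assumption because `Word.cs` is not part of this section, is that every word carries a trailing space: the space of the last word then cancels the `' '` in the pre-filled byte and leaves the MD5 padding marker `0x80` in place. A scalar sketch of that idea, with a hypothetical function name:

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical sketch: builds the 32-byte phrase buffer by XOR-ing words
// (each followed by a space) into a buffer pre-filled the way FillLength does.
std::vector<uint8_t> build_phrase_buffer(const std::vector<std::string>& words) {
    std::size_t total = 0;
    for (const auto& word : words) total += word.size() + 1;  // word + trailing space
    if (words.empty() || total - 1 > 27) {
        throw std::invalid_argument("phrase must be 0..27 bytes long");
    }
    const std::size_t length = total - 1;  // phrase length without the final space

    std::vector<uint8_t> buffer(32, 0);
    // Pre-fill: padding marker combined with the space that will cancel out.
    buffer[length] = 0x80 ^ ' ';
    // Bit length goes into 32-bit word 7 (bytes 28..31, little-endian),
    // which the MD5 code above uses where the standard expects word 14.
    const uint32_t bits = static_cast<uint32_t>(length) * 8;
    for (int i = 0; i < 4; i++) {
        buffer[28 + i] = static_cast<uint8_t>(bits >> (8 * i));
    }

    std::size_t offset = 0;
    for (const auto& word : words) {
        for (std::size_t j = 0; j < word.size(); j++) {
            buffer[offset + j] ^= static_cast<uint8_t>(word[j]);
        }
        buffer[offset + word.size()] ^= ' ';  // trailing space of this word
        offset += word.size() + 1;
    }
    return buffer;
}
```

With this layout no branch is needed for the last word: the same XOR that writes a separator after every other word restores the `0x80` marker after the final one.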

@ -1,157 +1,30 @@
namespace WhiteRabbit
{
using System;
using System.Collections.Generic;
using System.Linq;
internal static class PrecomputedPermutationsGenerator
{
static PrecomputedPermutationsGenerator()
private static PermutationsGenerator.Permutation[][] Permutations { get; } = new[]
{
Permutations = new ulong[Constants.MaxNumberOfWords + 1][][];
PermutationsNumbers = new long[Constants.MaxNumberOfWords + 1][];
for (var i = 0; i <= Constants.MaxNumberOfWords; i++)
{
var permutationsInfo = GeneratePermutations(i);
Permutations[i] = permutationsInfo.Item1;
PermutationsNumbers[i] = permutationsInfo.Item2;
}
}
private static ulong[][][] Permutations { get; }
private static long[][] PermutationsNumbers { get; }
public static ulong[] HamiltonianPermutations(int n, uint filter) => Permutations[n][filter];
public static long GetPermutationsNumber(int n, uint filter) => PermutationsNumbers[n][filter];
private static Tuple<ulong[][], long[]> GeneratePermutations(int n)
PermutationsGenerator.HamiltonianPermutations(0).ToArray(),
PermutationsGenerator.HamiltonianPermutations(1).ToArray(),
PermutationsGenerator.HamiltonianPermutations(2).ToArray(),
PermutationsGenerator.HamiltonianPermutations(3).ToArray(),
PermutationsGenerator.HamiltonianPermutations(4).ToArray(),
PermutationsGenerator.HamiltonianPermutations(5).ToArray(),
PermutationsGenerator.HamiltonianPermutations(6).ToArray(),
PermutationsGenerator.HamiltonianPermutations(7).ToArray(),
};
public static IEnumerable<PermutationsGenerator.Permutation> HamiltonianPermutations(int n)
{
if (n == 0)
// Fall back to on-the-fly generation for any n that has no precomputed entry
if (n >= Permutations.Length)
{
return Tuple.Create(new ulong[0][], new long[0]);
}
var allPermutations = PermutationsGenerator.HamiltonianPermutations(n)
.Select(FormatPermutation)
.ToArray();
var statesCount = (uint)1 << (n - 1);
var resultUnpadded = new PermutationInfo[statesCount][];
resultUnpadded[0] = allPermutations;
for (uint i = 1; i < statesCount; i++)
{
var mask = i;
mask |= mask >> 1;
mask |= mask >> 2;
mask |= mask >> 4;
mask |= mask >> 8;
mask |= mask >> 16;
mask = mask >> 1;
var existing = i & mask;
var seniorBit = i ^ existing;
var position = 0;
while (seniorBit != 0)
{
seniorBit = seniorBit >> 1;
position++;
}
resultUnpadded[i] = resultUnpadded[existing]
.Where(info => ((info.PermutationInverse >> (4 * (position - 1))) % 16 < (info.PermutationInverse >> (4 * position)) % 16))
.ToArray();
return PermutationsGenerator.HamiltonianPermutations(n);
}
var result = new ulong[statesCount][];
var numbers = new long[statesCount];
for (uint i = 0; i < statesCount; i++)
{
result[i] = PadToWholeChunks(resultUnpadded[i], Constants.PhrasesPerSet);
numbers[i] = resultUnpadded[i].LongLength;
}
return Tuple.Create(result, numbers);
}
public static bool IsOrderPreserved(ulong permutation, uint position)
{
var currentPermutation = permutation;
while (currentPermutation != 0)
{
if ((currentPermutation & 15) == position)
{
return true;
}
if ((currentPermutation & 15) == (position + 1))
{
return false;
}
currentPermutation = currentPermutation >> 4;
}
throw new ApplicationException("Malformed permutation " + permutation + " for position " + position);
}
private static ulong[] PadToWholeChunks(PermutationInfo[] original, int chunkSize)
{
ulong[] result;
if (original.Length % chunkSize == 0)
{
result = new ulong[original.Length];
}
else
{
result = new ulong[original.Length + chunkSize - (original.Length % chunkSize)];
}
for (var i = 0; i < original.Length; i++)
{
result[i] = original[i].Permutation;
}
return result;
}
private static PermutationInfo FormatPermutation(PermutationsGenerator.Permutation permutation)
{
System.Diagnostics.Debug.Assert(permutation.PermutationData.Length <= 16);
ulong result = 0;
ulong resultInverse = 0;
for (var i = 0; i < permutation.PermutationData.Length; i++)
{
var source = i;
var target = permutation.PermutationData[i];
result |= (ulong)(target) << (4 * source);
resultInverse |= (ulong)(source) << (4 * target);
}
return new PermutationInfo { Permutation = result, PermutationInverse = resultInverse };
}
private static IEnumerable<long> GeneratePermutationsNumbers()
{
long result = 1;
yield return result;
var i = 1;
while (true)
{
result *= i;
yield return result;
i++;
}
}
private struct PermutationInfo
{
public ulong Permutation;
public ulong PermutationInverse;
return Permutations[n];
}
}
}

@ -1,13 +1,12 @@
namespace WhiteRabbit
{
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Configuration;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Numerics;
using System.Security.Cryptography;
using System.Text;
/// <summary>
@ -24,18 +23,13 @@
stopwatch.Start();
var sourcePhrase = ConfigurationManager.AppSettings["SourcePhrase"];
var sourceChars = ToOrderedChars(sourcePhrase);
var maxWordsInPhrase = int.Parse(ConfigurationManager.AppSettings["MaxWordsInPhrase"]);
if (sourcePhrase.Where(ch => ch != ' ').Count() + maxWordsInPhrase > 28)
if (sourceChars.Length + maxWordsInPhrase > 27)
{
Console.WriteLine("Only anagrams of up to 27 characters (including whitespace) are allowed");
return;
}
if (maxWordsInPhrase > Constants.MaxNumberOfWords)
{
Console.WriteLine($"Only anagrams of up to {Constants.MaxNumberOfWords} words are allowed");
Console.WriteLine("Only anagrams of up to 27 characters are allowed");
return;
}
@ -50,16 +44,14 @@
Console.WriteLine("Only 64-bit systems are supported due to MD5Digest optimizations");
}
var expectedHashesFirstComponentsArray = new uint[8];
{
int i = 0;
foreach (var expectedHash in ConfigurationManager.AppSettings["ExpectedHashes"].Split(','))
{
expectedHashesFirstComponentsArray[i] = HexadecimalStringToUnsignedIntArray(expectedHash)[0];
expectedHashesFirstComponentsArray[i + 1] = HexadecimalStringToUnsignedIntArray(expectedHash)[0];
i += 2;
}
}
var expectedHashesAsVectors = ConfigurationManager.AppSettings["ExpectedHashes"]
.Split(',')
.Select(hash => new Vector<uint>(HexadecimalStringToUnsignedIntArray(hash)))
.ToArray();
#if DEBUG
var anagramsBag = new ConcurrentBag<string>();
#endif
var processor = new StringsProcessor(
Encoding.ASCII.GetBytes(sourcePhrase),
@ -68,21 +60,46 @@
Console.WriteLine($"Initialization complete; time from start: {stopwatch.Elapsed}");
stopwatch.Restart();
processor.GeneratePhrases()
.ForAll(phraseBytes =>
{
Debug.Assert(
sourceChars == ToOrderedChars(ToString(phraseBytes)),
$"StringsProcessor produced incorrect anagram: {ToString(phraseBytes)}");
var hashVector = ComputeHashVector(phraseBytes);
if (Array.IndexOf(expectedHashesAsVectors, hashVector) >= 0)
{
var phrase = ToString(phraseBytes);
var hash = VectorToHexadecimalString(hashVector);
Console.WriteLine($"Found phrase for {hash}: {phrase}; time from start is {stopwatch.Elapsed}");
}
#if DEBUG
var fastPhrasesCount = processor.GetPhrasesCount();
Console.WriteLine($"Number of phrases: {fastPhrasesCount}; time from start: {stopwatch.Elapsed}");
anagramsBag.Add(ToString(phraseBytes));
#endif
});
stopwatch.Restart();
Console.WriteLine($"Done; time from start: {stopwatch.Elapsed}");
#if DEBUG
var anagramsArray = anagramsBag.ToArray();
var anagramsSet = new HashSet<string>(anagramsArray);
Array.Sort(anagramsArray);
processor.CheckPhrases(expectedHashesFirstComponentsArray, (phraseBytes, hashFirstComponent) =>
Console.WriteLine("All anagrams:");
for (var i = 0; i < anagramsArray.Length; i++)
{
var phrase = Encoding.ASCII.GetString(phraseBytes);
var hash = ComputeFullMD5(phraseBytes);
Console.WriteLine($"Found phrase for {hash} ({hashFirstComponent:x8}): {phrase}; time from start is {stopwatch.Elapsed}");
});
Console.WriteLine(anagramsArray[i]);
}
Console.WriteLine($"Done; time from start: {stopwatch.Elapsed}");
// Duplicate anagrams are expected, as e.g. "norway spoils tut tut" will be taken twice:
// as "norway1 spoils2 tut3 tut4" and "norway1 spoils2 tut4 tut3"
// (in addition to e.g. "norway1 tut3 spoils2 tut4")
Console.WriteLine($"Total anagrams count: {anagramsArray.Length}; unique anagrams: {anagramsSet.Count}; time from start: {stopwatch.Elapsed}");
#endif
}
// Code taken from http://stackoverflow.com/a/321404/831314
@ -95,14 +112,19 @@
.ToArray();
}
// We can afford to spend some time here; this code will only run for matched phrases (and for one in several billion non-matched)
private static string ComputeFullMD5(byte[] phraseBytes)
// Bouncy Castle is used instead of standard .NET methods for performance reasons
private static Vector<uint> ComputeHashVector(Phrase input)
{
using (var hashAlgorithm = new MD5CryptoServiceProvider())
{
var resultBytes = hashAlgorithm.ComputeHash(phraseBytes);
return string.Concat(resultBytes.Select(b => b.ToString("x2")));
}
return new Vector<uint>(MD5Digest.Compute(input));
}
private static string VectorToHexadecimalString(Vector<uint> hash)
{
var components = Enumerable.Range(0, 4)
.Select(i => hash[i].ToString("x8"))
.Select(ChangeEndianness);
return string.Concat(components);
}
private static string ChangeEndianness(string hex)
@ -110,6 +132,11 @@
return hex.Substring(6, 2) + hex.Substring(4, 2) + hex.Substring(2, 2) + hex.Substring(0, 2);
}
private static string ToString(Phrase phrase)
{
return Encoding.ASCII.GetString(phrase.GetBytes());
}
private static IEnumerable<byte[]> ReadInput()
{
string line;
@ -118,5 +145,20 @@
yield return Encoding.ASCII.GetBytes(line);
}
}
private static string ToOrderedChars(string source)
{
return new string(source.Where(ch => ch != ' ').OrderBy(ch => ch).ToArray());
}
#if SINGLE_THREADED
private static void ForAll<T>(this IEnumerable<T> source, Action<T> action)
{
foreach (var entry in source)
{
action(entry);
}
}
#endif
}
}

@ -3,8 +3,6 @@
using System;
using System.Collections.Generic;
using System.Linq;
using System.Numerics;
using System.Threading.Tasks;
internal sealed class StringsProcessor
{
@ -13,7 +11,7 @@
// Ensure that permutations are precomputed prior to main run, so that processing times will be correct
static StringsProcessor()
{
PrecomputedPermutationsGenerator.HamiltonianPermutations(1, 0);
PrecomputedPermutationsGenerator.HamiltonianPermutations(0);
}
public StringsProcessor(byte[] sourceString, int maxWordsCount, IEnumerable<byte[]> words)
@ -22,26 +20,18 @@
this.NumberOfCharacters = filteredSource.Length;
this.VectorsConverter = new VectorsConverter(filteredSource);
var allWordsAndVectors = words
// Dictionary of vectors to array of words represented by this vector
var vectorsToWords = words
.Where(word => word != null && word.Length > 0)
.Select(word => new { word, vector = this.VectorsConverter.GetVector(word) })
.Select(word => new { word = word.Concat(new byte[] { SPACE }).ToArray(), vector = this.VectorsConverter.GetVector(word) })
.Where(tuple => tuple.vector != null)
.Select(tuple => tuple.word)
.Distinct(new ByteArrayEqualityComparer())
.Select(word => word)
.ToArray();
// Dictionary of vectors to array of words represented by this vector
var vectorsToWords = allWordsAndVectors
.Select((word, index) => new { word, index, vector = this.VectorsConverter.GetVector(word).Value })
.Select(tuple => new { tuple.word, vector = tuple.vector.Value })
.GroupBy(tuple => tuple.vector)
.Select(group => new { vector = group.Key, words = group.Select(tuple => tuple.index).ToArray() })
.Select(group => new { vector = group.Key, words = group.Select(tuple => tuple.word).Distinct(new ByteArrayEqualityComparer()).ToArray() })
.ToList();
this.WordsDictionary = vectorsToWords.Select(tuple => tuple.words).ToArray();
this.AllWords = allWordsAndVectors.Select(word => new Word(word)).ToArray();
this.VectorsProcessor = new VectorsProcessor(
this.VectorsConverter.GetVector(filteredSource).Value,
maxWordsCount,
@ -50,56 +40,35 @@
private VectorsConverter VectorsConverter { get; }
private Word[] AllWords { get; }
/// <summary>
/// WordsDictionary[vectorIndex] = [word1index, word2index, ...]
/// WordsDictionary[vectorIndex] = [word1, word2, ...]
/// </summary>
private int[][] WordsDictionary { get; }
private byte[][][] WordsDictionary { get; }
private VectorsProcessor VectorsProcessor { get; }
private int NumberOfCharacters { get; }
public void CheckPhrases(uint[] expectedHashesVector, Action<byte[], uint> action)
#if SINGLE_THREADED
public IEnumerable<byte[]> GeneratePhrases()
#else
public ParallelQuery<Phrase> GeneratePhrases()
#endif
{
// task of finding anagrams could be reduced to the task of finding sequences of dictionary vectors with the target sum
var sums = this.VectorsProcessor.GenerateSequences();
// converting sequences of vectors to the sequences of words...
Parallel.ForEach(sums, new ParallelOptions { MaxDegreeOfParallelism = Constants.NumberOfThreads }, sum => ProcessSum(sum, expectedHashesVector, action));
return sums
.Select(this.ConvertVectorsToWords)
.SelectMany(Flattener.Flatten)
.SelectMany(this.ConvertWordsToPhrases);
}
public long GetPhrasesCount()
{
var sums = this.VectorsProcessor.GenerateSequences();
return (from sum in sums
let filter = ComputeFilter(sum)
let wordsVariantsNumber = this.ConvertVectorsToWordsNumber(sum)
let permutationsNumber = PrecomputedPermutationsGenerator.GetPermutationsNumber(sum.Length, filter)
let total = wordsVariantsNumber * permutationsNumber
select total)
.Sum();
}
private static uint ComputeFilter(int[] vectors)
{
uint result = 0;
for (var i = 1; i < vectors.Length; i++)
{
if (vectors[i] == vectors[i - 1])
{
result |= (uint)1 << (i - 1);
}
}
return result;
}
private int[][] ConvertVectorsToWordIndexes(int[] vectors)
private byte[][][] ConvertVectorsToWords(int[] vectors)
{
var length = vectors.Length;
var words = new int[length][];
var words = new byte[length][][];
for (var i = 0; i < length; i++)
{
words[i] = this.WordsDictionary[vectors[i]];
@ -108,35 +77,11 @@
return words;
}
private long ConvertVectorsToWordsNumber(int[] vectors)
{
long result = 1;
for (var i = 0; i < vectors.Length; i++)
{
result *= this.WordsDictionary[vectors[i]].Length;
}
return result;
}
private void ProcessSum(int[] sum, uint[] expectedHashesVector, Action<byte[], uint> action)
private IEnumerable<Phrase> ConvertWordsToPhrases(byte[][] words)
{
var initialPhraseSet = new PhraseSet();
initialPhraseSet.Init();
initialPhraseSet.FillLength(this.NumberOfCharacters, sum.Length);
var phraseSet = new PhraseSet();
phraseSet.Init();
var permutationsFilter = ComputeFilter(sum);
var wordsVariants = this.ConvertVectorsToWordIndexes(sum);
foreach (var wordsArray in Flattener.Flatten(wordsVariants))
foreach (var permutation in PrecomputedPermutationsGenerator.HamiltonianPermutations(words.Length))
{
phraseSet.ProcessPermutations(
initialPhraseSet,
this.AllWords,
wordsArray,
PrecomputedPermutationsGenerator.HamiltonianPermutations(wordsArray.Length, permutationsFilter),
expectedHashesVector,
action);
yield return new Phrase(words, permutation, this.NumberOfCharacters);
}
}
}

@ -22,20 +22,6 @@
this.MaxVectorsCount = maxVectorsCount;
this.Dictionary = ImmutableArray.Create(FilterVectors(dictionary, target).ToArray());
var normsIndex = new int[GetVectorNorm(target, target) + 1];
var offset = 0;
for (var i = normsIndex.Length - 1; i >= 0; i--)
{
while (offset < this.Dictionary.Length && this.Dictionary[offset].Norm > i)
{
offset++;
}
normsIndex[i] = offset;
}
this.NormsIndex = ImmutableArray.Create(normsIndex);
}
private Vector<byte> Target { get; }
@ -44,13 +30,17 @@
private ImmutableArray<VectorInfo> Dictionary { get; }
// Stores index of the first vector from Dictionary with norm less than or equal to offset
private ImmutableArray<int> NormsIndex { get; }
// Produces all sets of vectors with the target sum
#if SINGLE_THREADED
public IEnumerable<int[]> GenerateSequences()
#else
public ParallelQuery<int[]> GenerateSequences()
#endif
{
return this.GenerateUnorderedSequences(this.Target, GetVectorNorm(this.Target, this.Target), this.MaxVectorsCount, 0)
return GenerateUnorderedSequences(this.Target, GetVectorNorm(this.Target, this.Target), this.MaxVectorsCount, this.Dictionary, 0)
#if !SINGLE_THREADED
.AsParallel()
#endif
.Select(Enumerable.ToArray);
}
@ -84,7 +74,7 @@
// In every sequence, the next vector never precedes the previous one in the dictionary.
// E.g. if the dictionary is [x, y, z], then the sequence [x, y] can be generated, but [y, x] never will be.
// That way, the search space shrinks by a factor of up to MaxVectorsCount! (factorial): if [x, y] does not add up to the required target, there is no point in checking [y, x].
private IEnumerable<ImmutableStack<int>> GenerateUnorderedSequences(Vector<byte> remainder, int remainderNorm, int allowedRemainingWords, int currentDictionaryPosition)
private static IEnumerable<ImmutableStack<int>> GenerateUnorderedSequences(Vector<byte> remainder, int remainderNorm, int allowedRemainingWords, ImmutableArray<VectorInfo> dictionary, int currentDictionaryPosition)
{
if (allowedRemainingWords > 1)
{
@ -94,9 +84,9 @@
// we need the largest remaining word to have a norm of at least 3
var requiredRemainderPerWord = (remainderNorm + allowedRemainingWords - 1) / allowedRemainingWords;
for (var i = Math.Max(this.NormsIndex[remainderNorm], currentDictionaryPosition); i < this.Dictionary.Length; i++)
for (var i = FindFirstWithNormLessOrEqual(remainderNorm, dictionary, currentDictionaryPosition); i < dictionary.Length; i++)
{
var currentVectorInfo = this.Dictionary[i];
var currentVectorInfo = dictionary[i];
if (currentVectorInfo.Vector == remainder)
{
yield return ImmutableStack.Create(currentVectorInfo.Index);
@ -109,7 +99,7 @@
{
var newRemainder = remainder - currentVectorInfo.Vector;
var newRemainderNorm = remainderNorm - currentVectorInfo.Norm;
foreach (var result in this.GenerateUnorderedSequences(newRemainder, newRemainderNorm, newAllowedRemainingWords, i))
foreach (var result in GenerateUnorderedSequences(newRemainder, newRemainderNorm, newAllowedRemainingWords, dictionary, i))
{
yield return result.Push(currentVectorInfo.Index);
}
@ -118,9 +108,9 @@
}
else
{
for (var i = Math.Max(this.NormsIndex[remainderNorm], currentDictionaryPosition); i < this.Dictionary.Length; i++)
for (var i = FindFirstWithNormLessOrEqual(remainderNorm, dictionary, currentDictionaryPosition); i < dictionary.Length; i++)
{
var currentVectorInfo = this.Dictionary[i];
var currentVectorInfo = dictionary[i];
if (currentVectorInfo.Vector == remainder)
{
yield return ImmutableStack.Create(currentVectorInfo.Index);
@ -133,6 +123,41 @@
}
}
// BCL BinarySearch would find any vector with required norm, not the first one; or would find nothing if there is no such vector
private static int FindFirstWithNormLessOrEqual(int expectedNorm, ImmutableArray<VectorInfo> dictionary, int offset)
{
var start = offset;
var end = dictionary.Length - 1;
if (dictionary[start].Norm <= expectedNorm)
{
return start;
}
if (dictionary[end].Norm > expectedNorm)
{
return dictionary.Length;
}
// Norm for start is always greater than expected norm, or start is the required position; norm for end is always less than or equal to expected norm
// The loop always ends, because the difference always decreases; if start + 1 = end, then middle will be equal to start, and either end := middle = start or start := middle + 1 = end.
while (start < end)
{
var middle = (start + end) / 2;
var newNorm = dictionary[middle].Norm;
if (newNorm <= expectedNorm)
{
end = middle;
}
else
{
start = middle + 1;
}
}
return start;
}
private struct VectorInfo
{
public VectorInfo(Vector<byte> vector, int norm, int index)
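
The `FindFirstWithNormLessOrEqual` helper in the hunk above is a lower-bound style binary search. The block below is a minimal standalone sketch of the same idea, not the code from this commit. It assumes the dictionary is sorted by norm in non-increasing order (implied by the comments in the diff, but not shown in it). It works on a plain `int[]` of norms instead of `ImmutableArray<VectorInfo>`, and it uses an exclusive upper bound so the "not found" and empty-array cases need no special handling. The names `DescendingSearchSketch` and `FindFirstLessOrEqual` are hypothetical.

```csharp
using System;

public static class DescendingSearchSketch
{
    // Returns the first index i >= offset with norms[i] <= expectedNorm,
    // or norms.Length when no such element exists.
    // Assumption: norms is sorted in non-increasing order.
    public static int FindFirstLessOrEqual(int[] norms, int expectedNorm, int offset)
    {
        var start = offset;
        var end = norms.Length; // exclusive bound; doubles as the "not found" result

        // Invariant: every index in [offset, start) has norm > expectedNorm,
        // and every index in [end, norms.Length) has norm <= expectedNorm.
        while (start < end)
        {
            var middle = start + (end - start) / 2;
            if (norms[middle] <= expectedNorm)
            {
                end = middle;
            }
            else
            {
                start = middle + 1;
            }
        }

        return start;
    }

    public static void Main()
    {
        var norms = new[] { 9, 7, 7, 5, 5, 5, 2 };
        Console.WriteLine(FindFirstLessOrEqual(norms, 5, 0)); // prints 3
        Console.WriteLine(FindFirstLessOrEqual(norms, 1, 0)); // prints 7 (no match)
        Console.WriteLine(FindFirstLessOrEqual(norms, 7, 2)); // prints 2
    }
}
```

Unlike the BCL `BinarySearch`, which may return any matching position, this variant always lands on the first qualifying index, which is what the enumeration loops in `GenerateUnorderedSequences` need as a starting point.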

@ -9,14 +9,13 @@
<AppDesignerFolder>Properties</AppDesignerFolder>
<RootNamespace>WhiteRabbit</RootNamespace>
<AssemblyName>WhiteRabbit</AssemblyName>
<TargetFrameworkVersion>v4.7</TargetFrameworkVersion>
<TargetFrameworkVersion>v4.6</TargetFrameworkVersion>
<FileAlignment>512</FileAlignment>
<AutoGenerateBindingRedirects>true</AutoGenerateBindingRedirects>
<AllowUnsafeBlocks>true</AllowUnsafeBlocks>
<TargetFrameworkProfile />
</PropertyGroup>
<PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Debug|AnyCPU' ">
<PlatformTarget>x64</PlatformTarget>
<PlatformTarget>AnyCPU</PlatformTarget>
<DebugSymbols>true</DebugSymbols>
<DebugType>full</DebugType>
<Optimize>false</Optimize>
@ -28,7 +27,7 @@
<Prefer32Bit>false</Prefer32Bit>
</PropertyGroup>
<PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Release|AnyCPU' ">
<PlatformTarget>x64</PlatformTarget>
<PlatformTarget>AnyCPU</PlatformTarget>
<DebugType>pdbonly</DebugType>
<Optimize>true</Optimize>
<OutputPath>bin\Release\</OutputPath>
@ -59,9 +58,9 @@
</ItemGroup>
<ItemGroup>
<Compile Include="ByteArrayEqualityComparer.cs" />
<Compile Include="Constants.cs" />
<Compile Include="Flattener.cs" />
<Compile Include="PhraseSet.cs" />
<Compile Include="MD5Digest.cs" />
<Compile Include="Phrase.cs" />
<Compile Include="PrecomputedPermutationsGenerator.cs" />
<Compile Include="PermutationsGenerator.cs" />
<Compile Include="StringsProcessor.cs" />
@ -69,18 +68,11 @@
<Compile Include="Properties\AssemblyInfo.cs" />
<Compile Include="VectorsProcessor.cs" />
<Compile Include="VectorsConverter.cs" />
<Compile Include="Word.cs" />
</ItemGroup>
<ItemGroup>
<None Include="App.config" />
<None Include="packages.config" />
</ItemGroup>
<ItemGroup>
<ProjectReference Include="..\WhiteRabbit.UnmanagedBridge\WhiteRabbit.UnmanagedBridge.vcxproj">
<Project>{039f03a0-7e8f-415d-8180-969d24479b44}</Project>
<Name>WhiteRabbit.UnmanagedBridge</Name>
</ProjectReference>
</ItemGroup>
<Import Project="$(MSBuildToolsPath)\Microsoft.CSharp.targets" />
<!-- To modify your build process, add your task inside one of the targets below and uncomment it.
Other similar extension points exist, see Microsoft.Common.targets.

@ -1,53 +0,0 @@
namespace WhiteRabbit
{
internal unsafe struct Word
{
public fixed long Buffers[128];
public unsafe Word(byte[] word)
{
var tmpWord = new byte[word.Length + 1];
tmpWord[word.Length] = (byte)' ';
for (var i = 0; i < word.Length; i++)
{
tmpWord[i] = word[i];
}
fixed (long* buffersPointer = this.Buffers)
{
for (var i = 0; i < 32; i++)
{
var bytePointer = (byte*)(buffersPointer + 4 * i);
var endPointer = bytePointer + 32;
var currentPointer = bytePointer + i;
for (var j = 0; j < tmpWord.Length && currentPointer < endPointer; j++, currentPointer++)
{
*currentPointer = tmpWord[j];
}
}
buffersPointer[127] = tmpWord.Length * 4;
}
}
public unsafe byte[] Original
{
get
{
fixed (long* buffersPointer = this.Buffers)
{
var length = buffersPointer[127] / 4;
var result = new byte[length];
for (var i = 0; i < length; i++)
{
result[i] = ((byte*)buffersPointer)[i];
}
return result;
}
}
}
private static Word Empty { get; } = new Word();
}
}