Initial draft of threat models and cryptography documents

3 years ago · d74d65335f
parent 7502d6f798
commit d74d65335f
3 changed files with 321 additions and 0 deletions
--- a/docs/goals.md
+++ b/docs/goals.md
@ -0,0 +1,45 @@
+---
+gitea: none
+include_toc: true
+---
+
+# Usage scenarios
+
+## Basic scenario
+
+A person X and person Y know each other online.
+
+Person X feels something towards person Y, but doesn't know if their feelings are returned.
+They don't want to risk misreading the situation, embarrassing or pressuring Y, or being perceived as creepy, so they cannot approach Y directly.
+
+What they can do is visit the proposed website and log their sympathy towards Y. They are notified if Y later does the same for them,
+or receive an immediate response if Y has already done so.
+
+## Metamour scenario
+
+Additionally, when logging their sympathy towards Y, X should be able to choose if they want to learn about the shared connections.
+
+If X's feelings towards Y are returned,
+_and_ there is a user Z such that their feelings towards both Y and Z are returned,
+_and_ both Y and Z also choose that they want to learn about the shared connections,
+_then_ X should receive an immediate response about that, and both Y and Z should be notified.
+
+# UI requirements
+
+## Mutual sympathies
+
+For every match (for all scenarios), all users involved should receive notifications, if they opted in to receiving these.
+
+Additionally, for every request that ends up creating a match, the user who initiated this request should receive an immediate response about the match.
+
+Additionally, a user should be able to see a list of all the successful matches they're part of.
+
+# Linking social media identities to the sympathies system
+
+TODO: to be written...
+
+# Security, trust
+
+All the features described above should require as little (ideally, no) trust in the admin / maintainer.
+
+For more details, refer to [Threat models](threat-models.md)
--- a/docs/relationships-cryptography.md
+++ b/docs/relationships-cryptography.md
@ -0,0 +1,195 @@
+---
+gitea: none
+include_toc: true
+---
+
+(Also see [Threat models](threat-models.md))
+
+# Thoughts
+
+## Basic scenario
+
+We should somehow identify when two users have sympathies toward each other.
+
+The easiest way to do this would be to derive some kind of hash that is the same regardless of which of the two users computed it, and impossible for anybody else to compute.
+In addition, we would need to tell whether two identical versions of that hash were computed by the same user or by different users.
+
+## Metamour scenario
+
+We should somehow identify when three users form a complete graph (every pair has mutual sympathies).
+
+The easiest way to do this would be to derive some kind of hash that is the same regardless of which of the three users computed it, and impossible for anybody else to compute.
+
+In addition, we would need to identify whether three identical versions of that hash were computed by three different users;
+and this should be unforgeable.
+
+This second part could be solved by storing some additional hashes alongside the common hash;
+these hashes would have to be different between users and unforgeable
+(i.e. derived on the server side only, from the user's public key and the data used to obtain the common hash).
+
+Alternatively, it could be solved by some form of secret sharing protocol in which the parties do not communicate with each other:
+encrypt some piece of information with a common public key, and attach a part of a common "private" key (only used for that purpose for that triple of users);
+every user should be able to compute the common public key and their part of the common private key; all three parts are required to decrypt a message.
+I am not aware of any protocols that allow this.
+
+# Assumptions
+
+Every user has an elliptic curve keypair; server only knows their public keys.
+
+There is an `INSTANCE_ID` which is supposed to uniquely identify the instance of the platform used (it could be an instance URL, for example).
+
+There is also a `SECRET_PADDING` used on the server.
+
+# Used cryptographic primitives
+
+* `sign(private_key, data)`, `verify(public_key, signed_data)`: `verify(public_key, sign(private_key, data)) == true`. Additionally, `sign` output should not leak information about the public key.
+* `encrypt(public_key, data)`, `decrypt(private_key, encrypted_data)`: `decrypt(private_key, encrypt(public_key, data)) == data`. `encrypt` does not have to be stable. Additionally, `encrypt` should not leak information about the public key.
+* `symmetric_encrypt(key, data)`, `symmetric_decrypt(key, encrypted_data)`.
+* `derive_key(data)`: generates a new symmetric encryption key, should be stable (always produce the same result for the same data) and irreversible.
+* `hash(data1, data2, ...)`: should be stable (always produce the same result) and irreversible.
+* `shared_key2(private_key_a, public_key_b)`: `shared_key2(private_key_a, public_key_b) == shared_key2(private_key_b, public_key_a)`. `shared_key2` has to be stable (to always return the same result for the same input data).
+* `shared_key3(private_key_a, public_key_b, public_key_c) == shared_key3(private_key_a, public_key_c, public_key_b) == shared_key3(private_key_b, public_key_a, public_key_c)`. `shared_key3` has to be stable (to always return the same result for the same input data).
+  * It looks like there are no industry standard ways to do this: https://crypto.stackexchange.com/a/1034 , so it is not used below
+
+The following methods are used:
+
+* ECDSA for signing (open question: does it leak the public key?)
+* ECIES for asymmetric encryption (open question: does it leak the public key?)
+* AES-256 for symmetric encryption
+* SHA-256 for hashing (if multiple values are supplied, `sha256(sha256(data1) + sha256(data2) + sha256(data3) + ...)` is used).
+* ECDH for `shared_key2`
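+
+As a minimal sketch, the multi-value `hash` described above could look like this in Python. It assumes that `+` means concatenation of the raw 32-byte digests (the draft does not say whether raw or hex digests are joined), and the name `multi_hash` is made up for this example.
+
```python
import hashlib


def multi_hash(*parts: bytes) -> bytes:
    """hash(data1, data2, ...) = sha256(sha256(data1) + sha256(data2) + ...).

    Assumption: "+" is byte concatenation of the raw digests.
    """
    inner = b"".join(hashlib.sha256(part).digest() for part in parts)
    return hashlib.sha256(inner).digest()


# Hashing every part before joining keeps field boundaries unambiguous:
# ("ab", "c") and ("a", "bc") would collide under plain concatenation.
assert multi_hash(b"ab", b"c") != multi_hash(b"a", b"bc")

# Stable: the same inputs always produce the same output.
assert multi_hash(b"x", b"y") == multi_hash(b"x", b"y")
```
+
+Because every input is reduced to a fixed-length digest first, values of different lengths can be mixed without a separator.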
+
+All requests to the server are signed with the user's private key.
+The server verifies the signature against the supplied public key.
+
+All responses from the server are encrypted with that public key,
+so that the user can decrypt them with their private key.
+
+# Scenarios
+
+## Basic scenario
+
+### Data submitted to the server
+
+If a user X wants to save their sympathy towards Y, they submit the following data to the server:
+
+* `hash(shared_key2(private_X, public_Y), INSTANCE_ID)` (referred to as `common_hash` below)
+* `symmetric_encrypt(derive_key(private_X), metadata)` (referred to as `encrypted_metadata` below), where metadata is only used on the client (e.g. Y's display name for X, creation date, etc).
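+
+The key property of `common_hash` is that X and Y arrive at the same value independently. The sketch below illustrates this with a toy finite-field Diffie-Hellman exchange standing in for ECDH; it is not secure and is not the scheme the draft proposes, it only shows the symmetry. `INSTANCE_ID`'s value and all function names are assumptions made for the example.
+
```python
import hashlib
import secrets

# Toy finite-field Diffie-Hellman standing in for ECDH.
# Illustration only, NOT secure; a real client would use an EC library.
P = 2**255 - 19  # a known prime, used here as the modulus
G = 2

INSTANCE_ID = b"https://sympathies.example"  # hypothetical instance URL


def keypair() -> tuple[int, int]:
    """Return (private, public) for the toy scheme."""
    private = secrets.randbelow(P - 3) + 2  # in [2, P - 2]
    return private, pow(G, private, P)


def shared_key2(private_a: int, public_b: int) -> bytes:
    # (G^b)^a == (G^a)^b mod P, so both sides derive the same secret.
    return pow(public_b, private_a, P).to_bytes(32, "big")


def multi_hash(*parts: bytes) -> bytes:
    inner = b"".join(hashlib.sha256(part).digest() for part in parts)
    return hashlib.sha256(inner).digest()


def common_hash(own_private: int, other_public: int) -> bytes:
    return multi_hash(shared_key2(own_private, other_public), INSTANCE_ID)


private_x, public_x = keypair()
private_y, public_y = keypair()

# X uses (private_X, public_Y); Y uses (private_Y, public_X).
assert common_hash(private_x, public_y) == common_hash(private_y, public_x)
```
+
+Mixing `INSTANCE_ID` into the hash means the same pair of users produces a different `common_hash` on every instance.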
+
+### Server logic
+
+* Compute the following fields:
+  * `hash(common_hash, SECRET_PADDING_COMMON2)` (referred to as `padded_common_hash` below)
+  * `public_X` (referred to as `public_key` below)
+* Apply rate limiting: check that the number of non-mutual sympathies for that `public_key` is within the allowed limit.
+* Check if there is an entry with this `padded_common_hash` but different `public_key` in the table of pending sympathies.
+  * If there is not:
+    * Save a new entry to that table with the following fields: `padded_common_hash`, `public_key`, `encrypted_metadata`, `creation_date`;
+    * Respond with "sympathy registered"
+  * If there is:
+    * Retrieve and remove that entry;
+    * Store its `public_key` and `encrypted_metadata` in the table of mutual sympathies;
+    * Store this request's `public_key` and `encrypted_metadata` in the same table of mutual sympathies;
+    * Send a notification to the user from that old entry (using its `public_key`);
+    * Respond with "sympathy is mutual"
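+
+The steps above can be sketched as an in-memory Python model. This is only an illustration under several assumptions: the tables are plain dicts and lists, notifications are recorded in a list, the limit of 10 and the "rate limit exceeded" response are invented for the example, and a repeated submission by the same user simply overwrites their pending entry.
+
```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone

SECRET_PADDING_COMMON2 = b"placeholder-server-secret"  # assumption: kept secret in reality
MAX_PENDING = 10  # hypothetical limit on non-mutual sympathies per key


def multi_hash(*parts: bytes) -> bytes:
    inner = b"".join(hashlib.sha256(part).digest() for part in parts)
    return hashlib.sha256(inner).digest()


@dataclass
class Entry:
    public_key: bytes
    encrypted_metadata: bytes
    creation_date: datetime


@dataclass
class Server:
    pending: dict[bytes, Entry] = field(default_factory=dict)  # keyed by padded_common_hash
    mutual: list[tuple[bytes, bytes]] = field(default_factory=list)
    notified: list[bytes] = field(default_factory=list)  # stand-in for real notifications

    def submit(self, public_key: bytes, common_hash: bytes, encrypted_metadata: bytes) -> str:
        padded = multi_hash(common_hash, SECRET_PADDING_COMMON2)

        # Rate limiting: count this key's non-mutual (pending) sympathies.
        own = sum(1 for e in self.pending.values() if e.public_key == public_key)
        if own >= MAX_PENDING:
            return "rate limit exceeded"  # hypothetical response, not in the draft

        other = self.pending.get(padded)
        if other is None or other.public_key == public_key:
            # No entry from a *different* key: register as pending.
            now = datetime.now(timezone.utc)
            self.pending[padded] = Entry(public_key, encrypted_metadata, now)
            return "sympathy registered"

        # Match: move both sides to the mutual table, dropping the hash and date.
        del self.pending[padded]
        self.mutual.append((other.public_key, other.encrypted_metadata))
        self.mutual.append((public_key, encrypted_metadata))
        self.notified.append(other.public_key)
        return "sympathy is mutual"


server = Server()
assert server.submit(b"public_X", b"common", b"meta_X") == "sympathy registered"
assert server.submit(b"public_Y", b"common", b"meta_Y") == "sympathy is mutual"
```
+
+Note that the mutual table keeps neither `padded_common_hash` nor `creation_date`, so the two stored entries carry no field that links them to each other.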
+
+### Security
+
+#### Malicious API usage
+
+TODO: to be written...
+
+#### Stored data
+
+Prior to a match, only `public_X`, `padded_common_hash`, `encrypted_metadata_X` and `creation_date` are stored.
+`encrypted_metadata_X` only exposes information to the holder of `private_X`, who owns this data anyway.
+`padded_common_hash` only exposes information to the holder of `private_X` (who owns this data anyway) or of `private_Y` _plus_ `SECRET_PADDING_COMMON2`. So in case of a DB + secrets leak, Y would be able to learn about X's non-mutual sympathy towards Y.
+`public_X` and `creation_date` expose information about who registered how many non-mutual sympathies and when (in case of a DB leak).
+
+After a match, two entries are stored: (`public_X`, `encrypted_metadata_X`) and (`public_Y`, `encrypted_metadata_Y`).
+`encrypted_metadata_X` only exposes information to the holder of `private_X`.
+`public_X` and `public_Y` expose information about who registered how many mutual sympathies.
+Special care should be taken to make sure there is no way to deduce the "mutual date" from these entries, especially since they are inserted into the DB at roughly the same time, so that an attacker with DB access cannot deduce that two entries are related.
+
+## Metamour scenario
+
+### Data submitted to the server
+
+If a user X also wants to learn about the shared connections when registering a sympathy, then in addition to the two fields above they also submit the following data to the server:
+* For every mutual and non-mutual sympathy towards every user Z:
+  * `hash(shared_key2(private_X, public_Y), public_Z, INSTANCE_ID)` (referred to as `common_XY_Z` below);
+  * `common_XZ_Y`
+  * `encrypted_metadata_X_Y` (with information about Y and Z)
+  * `encrypted_metadata_X_Z` (encrypted with a different nonce, to avoid matching the two)
+
+Additionally, they store all such `public_Z` in the `encrypted_metadata` field for their pending sympathy towards Y.
+
+Note that `common_XY_Z == common_YX_Z`, because `shared_key2` gives X and Y the same value.
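+
+A small sketch of that equality, again using a toy finite-field Diffie-Hellman exchange as a stand-in for ECDH (illustration only, not secure). The fixed private keys, the byte encoding of `public_Z`, the `INSTANCE_ID` value and the helper names are all assumptions made for this example.
+
```python
import hashlib

# Toy finite-field Diffie-Hellman standing in for ECDH (NOT secure).
P = 2**255 - 19
G = 2
INSTANCE_ID = b"https://sympathies.example"  # hypothetical instance URL


def public_of(private: int) -> int:
    return pow(G, private, P)


def shared_key2(private_a: int, public_b: int) -> bytes:
    return pow(public_b, private_a, P).to_bytes(32, "big")


def multi_hash(*parts: bytes) -> bytes:
    inner = b"".join(hashlib.sha256(part).digest() for part in parts)
    return hashlib.sha256(inner).digest()


def common_for_third(own_private: int, partner_public: int, third_public: int) -> bytes:
    """hash(shared_key2(own, partner), public_third, INSTANCE_ID)."""
    return multi_hash(
        shared_key2(own_private, partner_public),
        third_public.to_bytes(32, "big"),  # assumed encoding of the public key
        INSTANCE_ID,
    )


# Arbitrary fixed toy private keys, so the example is reproducible.
private_x, private_y, private_z = 1234567, 7654321, 2468013
public_x, public_y, public_z = (public_of(k) for k in (private_x, private_y, private_z))

common_xy_z = common_for_third(private_x, public_y, public_z)  # computed by X
common_yx_z = common_for_third(private_y, public_x, public_z)  # computed by Y
common_xz_y = common_for_third(private_x, public_z, public_y)  # X's second value

assert common_xy_z == common_yx_z  # X and Y agree without communicating
assert common_xy_z != common_xz_y  # each edge of the triple has its own value
```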
+
+### Server logic
+
+* For every entry in the list:
+  * Compute the following fields:
+    * `hash(common_XY_Z, SECRET_PADDING_COMMON3)` (referred to as `padded_common_XY_Z` below; note that `padded_common_XY_Z == padded_common_YX_Z`)
+    * `padded_common_XZ_Y`
+    * `hash(common_XY_Z, public_X, SECRET_PADDING_ID)` (referred to as `id_X_Y_Z` below; note that `id_X_Y_Z` is different from any other permutation such as `id_Y_X_Z`)
+    * `id_X_Z_Y`
+    * `symmetric_encrypt(derive_key(hash(common_XY_Z, SECRET_PADDING_KEY)), [common_XZ_Y, public_X])` (referred to as `encrypted_data_XY_Z` below)
+    * `encrypted_data_XZ_Y`
+  * Check how many entries there are with the same `padded_common` values but different `id` values in the table of pending metamour sympathies:
+    * If there is not one of each (for a total of two):
+      * Add the following two entries to the table:
+        * `padded_common_XY_Z`, `id_X_Y_Z`, `encrypted_data_XY_Z`, `encrypted_metadata_X_Y`
+        * `padded_common_XZ_Y`, `id_X_Z_Y`, `encrypted_data_XZ_Y`, `encrypted_metadata_X_Z`
+      * Respond with `sympathy registered`
+    * If there is one of each:
+      * Retrieve them and remove them from the table of pending metamour sympathies
+      * Decrypt the `encrypted_data` field, obtain the third remaining `padded_common` value plus both remaining public keys
+      * Remove two entries for the third `padded_common` value
+      * Add the following three entries to the table of completed metamour sympathies:
+        * `public_X`, `encrypted_metadata_X` (any of the two)
+        * `public_Y` (obtained by decrypting `encrypted_data`), `encrypted_metadata_Y` (obtained from the previously existing entry)
+        * `public_Z`, `encrypted_metadata_Z`
+      * Send notifications to Y and Z
+      * Respond with `sympathy mutual`
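+
+To illustrate the "Compute the following fields" step above: the sketch below derives `padded_common` and `id` for the same common value submitted by two different users, showing that the padded value matches while the ids differ. The secret values, the placeholder inputs and the function name `derived_fields` are assumptions; the encryption of `encrypted_data` and the matching logic are left out.
+
```python
import hashlib

# Placeholder server secrets; real values would be long random strings.
SECRET_PADDING_COMMON3 = b"padding-common3"
SECRET_PADDING_ID = b"padding-id"


def multi_hash(*parts: bytes) -> bytes:
    inner = b"".join(hashlib.sha256(part).digest() for part in parts)
    return hashlib.sha256(inner).digest()


def derived_fields(common_value: bytes, submitter_public: bytes) -> tuple[bytes, bytes]:
    """Return (padded_common, id) for one submitted common value."""
    padded_common = multi_hash(common_value, SECRET_PADDING_COMMON3)
    entry_id = multi_hash(common_value, submitter_public, SECRET_PADDING_ID)
    return padded_common, entry_id


# X and Y both submit the same common_XY_Z (see the note above).
common_xy_z = multi_hash(b"stand-in for common_XY_Z")
public_x, public_y = b"public key of X", b"public key of Y"

padded_from_x, id_x_y_z = derived_fields(common_xy_z, public_x)
padded_from_y, id_y_x_z = derived_fields(common_xy_z, public_y)

assert padded_from_x == padded_from_y  # lets the server group the entries
assert id_x_y_z != id_y_x_z            # lets the server tell the submitters apart
```
+
+Since the id depends on a server-side secret and on the public key the request was signed with, a single user cannot produce two entries with different ids for the same common value.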
+
+### Explanation
+
+TODO: To be written...
+
+### Security
+
+TODO: To be written...
+
+## Managing the existing sympathies
+
+### Viewing
+
+TODO: To be written...
+
+### Removing pending
+
+TODO: To be written...
+
+### Removing all data
+
+TODO: To be written...
+
+## Managing the existing metamour sympathies
+
+### Viewing
+
+TODO: To be written...
+
+#### Caveat
+
+### Removing pending
+
+TODO: To be written...
+
+#### Caveat
+
+### Removing all data
+
+TODO: To be written...
+
--- a/docs/threat-models.md
+++ b/docs/threat-models.md
@ -0,0 +1,81 @@
+---
+gitea: none
+include_toc: true
+---
+
+# Goal
+
+To require users to have as little trust in this system as possible.
+
+To reduce, as much as possible, the risk of anybody (including an admin / maintainer) obtaining anybody else's personal information (PI).
+
+Additionally, the server should never under any circumstances handle any user private keys (including shared keys);
+they should all stay on the client.
+
+And all data should be scoped to this instance of the system (all hashes should be salted with its key),
+to make it unusable on other instances.
+
+# Threats
+
+## Incorrect use of the system
+
+No use of the system API by attackers should expose anybody else's data.
+
+All requests should be signed.
+
+### Brute-forcing sympathies
+
+One attack of this kind would be a user submitting sympathies towards literally everybody else,
+in order to extract every other user's answer to "do they like me",
+which would defeat the purpose of this system.
+
+One way to combat this would be to introduce rate limiting,
+so that every user can have no more than a fixed number of non-mutual sympathies at any given moment,
+and can only remove a non-mutual sympathy after at least a fixed amount of time has passed.
+
+For example, that could be at most ten non-mutual sympathies, and at least a month until a non-mutual sympathy can be removed, freeing one of the ten slots.
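+
+A minimal sketch of such a limiter for a single user, using the example numbers above (ten slots, and "a month" approximated as 30 days). The class and method names are hypothetical, and the current time is passed in explicitly to keep the example deterministic.
+
```python
from datetime import datetime, timedelta, timezone

MAX_PENDING = 10                  # example limit from the text
MIN_AGE = timedelta(days=30)      # "a month", approximated as 30 days


class PendingSlots:
    """Tracks one user's non-mutual sympathies and when each was created."""

    def __init__(self) -> None:
        self._created: dict[str, datetime] = {}

    def add(self, sympathy_id: str, now: datetime) -> bool:
        if sympathy_id in self._created:
            return True  # already registered; keep the original date
        if len(self._created) >= MAX_PENDING:
            return False  # all slots are taken
        self._created[sympathy_id] = now
        return True

    def remove(self, sympathy_id: str, now: datetime) -> bool:
        created = self._created.get(sympathy_id)
        if created is None or now - created < MIN_AGE:
            return False  # unknown, or too young to be removed
        del self._created[sympathy_id]
        return True


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
slots = PendingSlots()
assert all(slots.add(f"sympathy-{i}", start) for i in range(10))
assert not slots.add("sympathy-10", start)                          # eleventh is refused
assert not slots.remove("sympathy-0", start + timedelta(days=29))   # too early
assert slots.remove("sympathy-0", start + timedelta(days=30))       # slot freed
assert slots.add("sympathy-10", start + timedelta(days=30))
```
+
+With these numbers, a user who fills all ten slots can probe at most ten further people per month.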
+
+## MITM attacks
+
+Should we really be concerned about these, if both the client front-end and API are served over HTTPS?
+
+## Database leaks
+
+A leaked database should expose no identifying information.
+
+## Database + system keys leaks
+
+A leaked database, even if it leaked with all the private keys used by the system, should expose as little information as possible.
+
+Exposing information of the kind "this user has N non-mutual sympathies that were created on these dates" is probably unavoidable:
+the system has to keep track of that information in order to prevent brute-forcing.
+
+Exposing information of the kind "this user logged a sympathy towards you" to someone with their own private key is unavoidable too:
+they can just emulate the entire system in a sandbox and brute-force that information.
+
+No other information should be exposed.
+
+## Log leaks
+
+The system should not store any logs.
+
+## Evil admin / maintainer
+
+### Access to the database and system keys
+
+This threat is identical to database + system keys leak.
+
+### Backdoors in the code
+
+Since all requests to the API are signed, an admin who has inserted a backdoor can always know which users have been sending which kinds of requests.
+
+No other information should be exposed if the admin does not have access to someone's private key.
+
+### Colluding with a user
+
+If an admin colludes with some user (who provides their private key), they will be able to obtain all the information that concerns this user,
+if only by brute-forcing it.
+
+We should think about how to make this as painful as possible.
+
+No other additional information should be exposed.