added solution description to README

5 days ago · 964a157450
parent 2bcfe8f875
commit 964a157450
1 changed files with 93 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -151,3 +151,96 @@ importance on these criteria, and we strongly encourage you to carefully conside
 * Efficiency and performance of the api endpoint.
 * Clarity, readability and testability of the code.
 * Handling of edge cases and error conditions.
+
+## Solution
+
+### How to launch
+
+You'll need Node.js and npm to launch it, but then the tests that came with the challenge
+already required both Node.js and npm.
+
+The solution is in [`service/`](service/); the commands are:
+
+* `npm ci`, to install dependencies;
+* `npm run start`, to typecheck and start the backend
+  (the optional `PORT` and `PG_CONNECTION_STRING` environment variables are supported,
+  e.g. `PORT=4000 PG_CONNECTION_STRING=/run/postgresql npm run start`;
+  they default to the port and connection string specified in the challenge);
+* `npm run lint`, to lint the backend;
+* `npm run test`, to run unit and integration tests
+  (actually there are no useful tests in this backend right now,
+  because the only interesting thing in it is one huge SQL query);
+* `npm run test:e2e`, to run end-to-end tests
+  (it also supports the optional `PG_CONNECTION_STRING` environment variable;
+  there are no useful e2e tests right now either).
+
+### Used framework
+
+I had to spend most of the time on creating an API project and writing boilerplate
+rather than actually solving the interesting part.
+Since the challenge says that I can use any language or framework for this challenge,
+and since it already has tests written in Node.js,
+I decided to go with Nest.js (+ TypeScript) because it provides somewhat reasonable defaults,
+and allows one to get to actually implementing the interesting part reasonably fast (although still too slow).
+
+All request validation is done using the standard Nest.js approach.
+If the database is not available at startup, the service will fail to start.
+
+### The main task
+
+Producing the list of matching slots seems like a task for the DB (especially with the given DB structure),
+so everything is happening in one huge query in [`service/src/db/index.ts`](service/src/db/index.ts).
+
+### DB optimization
+
+The challenge also says that I can add indexes to the existing DB structure,
+and that you are going to evaluate the solution based on efficiency and performance.
+However, in order to make decisions regarding efficiency and performance,
+one has to know the actual use cases, and the actual distribution of the data.
+20 rows and vague descriptions in the challenge are simply not enough to guess
+how exactly the solution should scale, and in which directions.
+
+That said, I made several assumptions about the future state of the data, as follows:
+
+* There will not be a lot of managers, so doing a full scan of the managers table for every query
+  (trying to find suitable managers) is not a performance issue.
+  Even if there are millions of entries in the managers table in the future,
+  presumably not all of them will actually be active (and presumably we're only interested
+  in the slots for future dates), so perhaps adding some kind of `is_active` field
+  and indexing by it will provide us with better specificity
+  than indexing by languages / customer ratings / product types.
+  So I did not add any indexes to the managers table.
+* Another related assumption is that there will only be a few active managers
+  matching language and product type and customer rating at once.
+* For slots, I assumed that there will be a lot of slots per manager;
+  and that old slots are not going to be removed from the DB (even the booked ones).
+  So I added an index on manager id + booked flag + start date,
+  to allow for fast retrieval of all free slots for a handful of matching managers for the specified date.
+* Since the challenge says that each slot corresponds to a one-hour appointment,
+  I assumed that slots can overlap but never be contained in each other.
+  This allowed me to simplify the check whether `a` overlaps with `b` to just
+  `(a.start_date <= b.start_date < a.end_date) or (a.start_date < b.end_date <= a.end_date)`,
+  so in order to check if there are any slots overlapping with `a`, it is enough to just find
+  all slots with either `start_date` or `end_date` in a certain range.
+  I added an additional index on manager id + booked flag + end date, meaning that
+  to answer the question "does this free slot overlap with any booked slots of the same manager?",
+  it is enough to only query two indexes (and ensure that the results are empty),
+  without having to do a full scan on anything.
+
+Ultimately, I added two indexes (both defined in [`database/init.sql`](database/init.sql))
+which should allow this solution to scale to a large number of slots (but still a limited number of managers).
+
+### Other assumptions
+
+And finally, assumptions about dates:
+
+* The challenge does not say anything about timezones, so I decided not to bother with them either.
+  This would be unacceptable in a production-ready project, but for a production-ready project,
+  one would have to at least accept the timezone in the request, and not just a "YYYY-MM-DD" string
+  which doesn't mean a lot by itself.
+  So this solution might contain some timezone-related bugs... but technically they are not bugs,
+  because they still will not violate any requirements in the challenge.
+* The challenge does not say anything about the meaning of the requested date.
+  So I decided to treat it as specifying the range for the slot start date
+  (e.g. a time slot from 2025-01-12 23:30 to 2025-01-13 00:30
+  will be returned for requests for 2025-01-12, but not for 2025-01-13).
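
The simplified overlap check described under "DB optimization" can be sketched in TypeScript as follows. This is only an illustration of the predicate, not the code from `service/src/db/index.ts` (where the check is done in SQL). The `Slot` interface and the helper names (`overlapsGeneral`, `overlapsSimplified`, `hourSlot`) are hypothetical, and the sketch assumes half-open slots `[start_date, end_date)` of equal length, so that no slot strictly contains another.

```typescript
// Hypothetical slot shape; field names mirror the column names used in the README.
interface Slot {
  start_date: Date;
  end_date: Date;
}

// General overlap test for half-open intervals [start, end).
function overlapsGeneral(a: Slot, b: Slot): boolean {
  return (
    a.start_date.getTime() < b.end_date.getTime() &&
    b.start_date.getTime() < a.end_date.getTime()
  );
}

// Simplified test: b overlaps a if b starts inside [a.start, a.end)
// or b ends inside (a.start, a.end].
// Assumption: no slot strictly contains another (e.g. all slots last one hour).
function overlapsSimplified(a: Slot, b: Slot): boolean {
  const aStart = a.start_date.getTime();
  const aEnd = a.end_date.getTime();
  const bStart = b.start_date.getTime();
  const bEnd = b.end_date.getTime();
  const startsInside = aStart <= bStart && bStart < aEnd;
  const endsInside = aStart < bEnd && bEnd <= aEnd;
  return startsInside || endsInside;
}

// Builds a one-hour slot starting at the given ISO timestamp.
function hourSlot(startIso: string): Slot {
  const start = new Date(startIso);
  return { start_date: start, end_date: new Date(start.getTime() + 60 * 60 * 1000) };
}

const a = hourSlot("2025-01-12T10:00:00Z");
const shifted = hourSlot("2025-01-12T10:30:00Z"); // starts inside a
const adjacent = hourSlot("2025-01-12T11:00:00Z"); // starts exactly when a ends

console.log(overlapsSimplified(a, shifted), overlapsSimplified(a, adjacent));
```

Each half of the simplified predicate is a range condition on a single column, which is why one index on the start date and one on the end date (each prefixed by manager id and booked flag) would be enough to answer it.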