From 964a157450e156aa25f56ce3bad2402ee434d340 Mon Sep 17 00:00:00 2001
From: Inga
Date: Sun, 12 Jan 2025 23:35:18 +0000
Subject: [PATCH] added solution description to README

---
 README.md | 93 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 93 insertions(+)

diff --git a/README.md b/README.md
index b26d5fe..ffe7a98 100644
--- a/README.md
+++ b/README.md
@@ -151,3 +151,96 @@ importance on these criteria, and we strongly encourage you to carefully conside
 * Efficiency and performance of the api endpoint.
 * Clarity, readability and testability of the code.
 * Handling of edge cases and error conditions.
+
+## Solution
+
+### How to launch
+
+You'll need Node.js and npm to launch it, but the tests that came with the challenge
+already require both Node.js and npm anyway.
+
+The solution is in [`service/`](service/); the commands are:
+
+* `npm ci`, to install dependencies;
+* `npm run start`, to typecheck and start the backend
+  (optional `PORT` and `PG_CONNECTION_STRING` environment variables are supported,
+  e.g. `PORT=4000 PG_CONNECTION_STRING=/run/postgresql npm run start`;
+  they default to the port and connection string specified in the challenge);
+* `npm run lint`, to lint the backend;
+* `npm run test`, to run unit and integration tests
+  (actually there are no useful tests in this backend right now,
+  because the only interesting thing in it is one huge SQL query);
+* `npm run test:e2e`, to run end-to-end tests
+  (it also supports the optional `PG_CONNECTION_STRING` environment variable;
+  there are no useful e2e tests right now either).
+
+### Used framework
+
+I had to spend most of the time on creating an API project and writing boilerplate
+rather than actually solving the interesting part.
+Since the challenge says that I can use any language or framework,
+and since it already has tests written in Node.js,
+I decided to go with Nest.js (+ TypeScript) because it provides somewhat reasonable defaults,
+and allows one to get to actually implementing the interesting part reasonably fast (although still too slow).
+
+All request validation is done using the standard Nest.js approach.
+If the database is not available at startup, the service will fail.
+
+### The main task
+
+Producing the list of matching slots seems like a task for the DB (especially with the given DB structure),
+so everything happens in one huge query in [`service/src/db/index.ts`](service/src/db/index.ts).
+
+### DB optimization
+
+The challenge also says that I can add indexes to the existing DB structure,
+and that you are going to evaluate the solution based on efficiency and performance.
+However, in order to make decisions regarding efficiency and performance,
+one has to know the actual use cases, and the actual distribution of the data.
+The 20 rows and vague descriptions in the challenge are simply not enough to guess
+how exactly the solution should scale, and in which directions.
+
+That said, I made several assumptions about the future state of the data, as follows:
+
+* There will not be a lot of managers, so doing a full scan on the managers table for every query
+  (trying to find suitable managers) is not a performance issue.
+  Even if there are millions of entries in the managers table in the future,
+  presumably not all of them will actually be active (and presumably we're only interested
+  in the slots for future dates), so perhaps adding some kind of `is_active` field
+  and indexing by it will provide us with better selectivity
+  than indexing by languages / customer ratings / product types.
+  So I did not add any indexes to the managers table.
+* Another related assumption is that there will only be a few active managers
+  matching the language, product type and customer rating at once.
+* For slots, I assumed that there will be a lot of slots per manager,
+  and that old slots are not going to be removed from the DB (even the booked ones).
+  So I added an index on manager id + booked flag + start date,
+  to allow for fast retrieval of all free slots for a handful of matching managers for the specified date.
+* Since the challenge says that each slot corresponds to a one-hour appointment,
+  I assumed that slots can overlap but never be contained in each other.
+  This allowed me to simplify the check whether `a` overlaps with `b` to just
+  `(a.start_date <= b.start_date < a.end_date) or (a.start_date < b.end_date <= a.end_date)`,
+  so in order to check if there are any slots overlapping with `a`, it is enough to just find
+  all slots with either `start_date` or `end_date` in a certain range.
+  I added an additional index on manager id + booked flag + end date, meaning that
+  to answer the question "does this free slot overlap with any booked slots of the same manager?",
+  it is enough to only query two indexes (and ensure that the results are empty),
+  without having to do a full scan on anything.
+
+Ultimately, I added two indexes (both defined in [`database/init.sql`](database/init.sql))
+which should allow this solution to scale to a large number of slots (but still a limited number of managers).
+
+### Other assumptions
+
+And finally, assumptions about dates:
+
+* The challenge does not say anything about timezones, so I decided not to bother with them either.
+  This would be unacceptable in a production-ready project, but for a production-ready project,
+  one would have to at least accept the timezone in the request, and not just a "YYYY-MM-DD" string
+  which doesn't mean a lot by itself.
+  So this solution might contain some timezone-related bugs...
 but technically they are not bugs,
+  because they still will not violate any requirements in the challenge.
+* The challenge does not say anything about the meaning of the requested date.
+  So I decided to treat it as specifying the range for the slot start date
+  (e.g. a time slot from 2025-01-12 23:30 to 2025-01-13 00:30
+  will be returned for requests for 2025-01-12, but not for 2025-01-13).
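
The two date rules described in this README (the simplified overlap check for equal-length slots, and the requested date constraining only the slot start) can be sketched in TypeScript as follows. This is a minimal illustration and not the code from `service/src/db/index.ts`, where the logic lives in SQL. The `Slot` shape and the function names are hypothetical, and the requested date is interpreted in UTC purely as an assumption, since timezones are left unspecified.

```typescript
// Hypothetical slot shape; the real schema uses start_date / end_date columns.
interface Slot {
  start: Date;
  end: Date;
}

// Simplified overlap check, valid only under the assumption that all slots
// have the same length (so one slot can never strictly contain another).
function overlapsEqualLength(a: Slot, b: Slot): boolean {
  const aStart = a.start.getTime();
  const aEnd = a.end.getTime();
  const bStart = b.start.getTime();
  const bEnd = b.end.getTime();
  const startInside = aStart <= bStart && bStart < aEnd; // b starts within a
  const endInside = aStart < bEnd && bEnd <= aEnd; // b ends within a
  return startInside || endInside;
}

// True when the slot START falls on the requested "YYYY-MM-DD" date.
// Assumption: the date is read as a UTC day, i.e. the range [00:00, 24:00).
function startsOnDate(slot: Slot, requestedDate: string): boolean {
  const dayStart = new Date(`${requestedDate}T00:00:00.000Z`).getTime();
  const dayEnd = dayStart + 24 * 60 * 60 * 1000;
  const slotStart = slot.start.getTime();
  return dayStart <= slotStart && slotStart < dayEnd;
}

// Helper for building one-hour slots in the examples below.
function oneHourSlot(startIso: string): Slot {
  const start = new Date(startIso);
  return { start, end: new Date(start.getTime() + 60 * 60 * 1000) };
}

const lateSlot = oneHourSlot("2025-01-12T23:30:00.000Z");
console.log(startsOnDate(lateSlot, "2025-01-12")); // true
console.log(startsOnDate(lateSlot, "2025-01-13")); // false

const ten = oneHourSlot("2025-01-12T10:00:00.000Z");
const tenThirty = oneHourSlot("2025-01-12T10:30:00.000Z");
const eleven = oneHourSlot("2025-01-12T11:00:00.000Z");
console.log(overlapsEqualLength(ten, tenThirty)); // true
console.log(overlapsEqualLength(ten, eleven)); // false: back-to-back slots do not overlap
```

Both checks use half-open ranges, so a slot that ends exactly when another begins is not treated as overlapping, and a slot starting at midnight belongs to the day that begins at that instant.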