added solution description to README

5 days ago · 964a157450
parent 2bcfe8f875
commit 964a157450
1 changed files with 93 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -151,3 +151,96 @@ importance on these criteria, and we strongly encourage you to carefully conside
 * Efficiency and performance of the api endpoint.
 * Clarity, readability and testability of the code.
 * Handling of edge cases and error conditions.
 ## Solution
 ### How to launch
 Launching the solution requires Node.js and npm, both of which are already required
 by the tests that came with the challenge.
 The solution is in [`service/`](service/); the commands are:
 * `npm ci`, to install dependencies;
 * `npm run start`, to typecheck and start the backend
  (the optional `PORT` and `PG_CONNECTION_STRING` environment variables are supported,
  e.g. `PORT=4000 PG_CONNECTION_STRING=/run/postgresql npm run start`;
  they default to the port and connection string specified in the challenge);
 * `npm run lint`, to lint the backend;
 * `npm run test`, to run unit and integration tests
  (there are currently no useful tests in this backend,
  because the only interesting thing in it is one huge SQL query);
 * `npm run test:e2e`, to run end-to-end tests
  (this also supports the optional `PG_CONNECTION_STRING` environment variable;
  there are currently no useful e2e tests either).
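
As an illustration of how the optional `PORT` and `PG_CONNECTION_STRING` variables could be resolved, here is a minimal sketch. The `resolveConfig` helper, the `ServiceConfig` shape and the fallback values are hypothetical and are not taken from `service/`; the real defaults are the ones specified in the challenge.

```typescript
// Hypothetical helper: names and fallback values are illustrative only.
interface ServiceConfig {
  port: number;
  pgConnectionString: string;
}

function resolveConfig(
  env: Record<string, string | undefined>,
  fallback: ServiceConfig,
): ServiceConfig {
  const rawPort = env.PORT;
  // An unset or empty PORT falls back to the default port.
  const port =
    rawPort === undefined || rawPort === "" ? fallback.port : Number(rawPort);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${rawPort}`);
  }
  return {
    port,
    // An unset or empty connection string falls back to the default one.
    pgConnectionString: env.PG_CONNECTION_STRING || fallback.pgConnectionString,
  };
}

// Placeholder defaults (assumption); substitute the values from the challenge.
const placeholderDefaults: ServiceConfig = {
  port: 3000,
  pgConnectionString: "postgres://localhost/placeholder",
};
```

In a Node.js process this would be called as `resolveConfig(process.env, placeholderDefaults)`.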
 ### Framework used
 I had to spend most of the time on creating an API project and writing boilerplate
 rather than actually solving the interesting part.
 Since the challenge allows any language or framework,
 and since it already has tests written in Node.js,
 I decided to go with Nest.js (+ TypeScript) because it provides somewhat reasonable defaults
 and allows one to get to implementing the interesting part reasonably fast (although still too slowly).
 All request validation is done using the standard Nest.js approach.
 If the database is not available at startup, the service will fail to start.
 ### The main task
 Producing the list of matching slots seems like a task for the DB (especially with the given DB structure),
 so everything happens in one huge query in [`service/src/db/index.ts`](service/src/db/index.ts).
 ### DB optimization
 The challenge also says that I can add indexes to the existing DB structure,
 and that you are going to evaluate the solution based on efficiency and performance.
 However, in order to make decisions regarding efficiency and performance,
 one has to know the actual use cases, and the actual distribution of the data.
 Twenty rows and vague descriptions in the challenge are simply not enough to guess
 how exactly the solution should scale, and in which directions.
 That said, I made several assumptions about the future state of the data, as follows:
 * There will not be a lot of managers, so doing a full scan of the managers table for every query
  (to find suitable managers) is not a performance issue.
  Even if there are millions of entries in the managers table in the future,
  presumably not all of them will actually be active (and presumably we're only interested
  in slots for future dates), so adding some kind of `is_active` field
  and indexing by it would probably be more selective
  than indexing by languages / customer ratings / product types.
  So I did not add any indexes to the managers table.
 * Another related assumption is that there will only be a few active managers
  matching the language, product type and customer rating all at once.
 * For slots, I assumed that there will be a lot of slots per manager,
  and that old slots are not going to be removed from the DB (even the booked ones).
  So I added an index on manager id + booked flag + start date,
  to allow for fast retrieval of all free slots for a handful of matching managers for the specified date.
 * Since the challenge says that each slot corresponds to a one-hour appointment,
  I assumed that slots can overlap but that one slot can never be contained in another.
  This allowed me to simplify the check whether `a` overlaps with `b` to just
  `(a.start_date <= b.start_date < a.end_date) or (a.start_date < b.end_date <= a.end_date)`,
  so in order to check if there are any slots overlapping with `a`, it is enough to find
  all slots with either `start_date` or `end_date` in a certain range.
  I added an additional index on manager id + booked flag + end date, meaning that
  to answer the question "does this free slot overlap with any booked slots of the same manager?",
  it is enough to query only two indexes (and ensure that the results are empty),
  without having to do a full scan on anything.
 Ultimately, I added two indexes (both defined in [`database/init.sql`](database/init.sql))
 which should allow this solution to scale to a large number of slots (but still a limited number of managers).
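
To make the overlap assumption concrete, here is a minimal TypeScript sketch of the simplified check next to the general interval-overlap test. The actual check lives in the SQL query; the `Slot` shape and the function names below are hypothetical and only mirror the `start_date` / `end_date` columns.

```typescript
// Hypothetical in-memory shape of a slot; not the actual DB row type.
interface Slot {
  start: Date;
  end: Date;
}

// Simplified check: b starts inside a, or b ends inside a.
// Only reliable when no slot strictly contains another (e.g. all slots are one hour long).
function overlapsSimplified(a: Slot, b: Slot): boolean {
  const aStart = a.start.getTime();
  const aEnd = a.end.getTime();
  const bStart = b.start.getTime();
  const bEnd = b.end.getTime();
  const startsInside = aStart <= bStart && bStart < aEnd;
  const endsInside = aStart < bEnd && bEnd <= aEnd;
  return startsInside || endsInside;
}

// General check for half-open intervals [start, end), shown for comparison.
function overlapsGeneral(a: Slot, b: Slot): boolean {
  return (
    a.start.getTime() < b.end.getTime() && b.start.getTime() < a.end.getTime()
  );
}

// Builds a slot of the given length (one hour by default) from an ISO timestamp.
function makeSlot(startIso: string, minutes: number = 60): Slot {
  const start = new Date(startIso);
  return { start, end: new Date(start.getTime() + minutes * 60_000) };
}

const ten = makeSlot("2025-01-12T10:00:00Z");
const halfPastTen = makeSlot("2025-01-12T10:30:00Z");
const nineToNoon = makeSlot("2025-01-12T09:00:00Z", 180);

// Equal-length slots: both checks agree.
console.log(overlapsSimplified(ten, halfPastTen), overlapsGeneral(ten, halfPastTen));
// A three-hour slot containing `ten`: the simplified check misses the overlap.
console.log(overlapsSimplified(ten, nineToNoon), overlapsGeneral(ten, nineToNoon));
```

The second comparison shows why the "never contained in each other" assumption matters: each clause of the simplified check is a range condition on a single column, which is what lets the two indexes answer the question, but it no longer holds if slot lengths can differ.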
 ### Other assumptions
 And finally, assumptions about dates:
 * The challenge does not say anything about timezones, so I decided not to bother with them either.
  This would be unacceptable in a production-ready project, but a production-ready project
  would have to at least accept the timezone in the request, and not just a "YYYY-MM-DD" string,
  which doesn't mean a lot by itself.
  So this solution might contain some timezone-related bugs... but technically they are not bugs,
  because they still will not violate any requirements in the challenge.
 * The challenge does not say anything about the meaning of the requested date.
  So I decided to treat it as specifying the range for the slot start date
  (e.g. a time slot from 2025-01-12 23:30 to 2025-01-13 00:30
  will be returned for requests for 2025-01-12, but not for 2025-01-13).
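
The date rule above can be sketched as follows. This is only an illustration: the solution applies the rule inside the SQL query, the function names are hypothetical, and interpreting the date as a UTC day is an assumption made here purely so that the example is deterministic.

```typescript
// Turns "YYYY-MM-DD" into the half-open range [from, to) covering that day.
// Assumption: the day is interpreted in UTC.
function dayRangeUtc(date: string): { from: Date; to: Date } {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(date);
  if (!match) {
    throw new Error(`Expected YYYY-MM-DD, got: ${date}`);
  }
  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);
  const from = new Date(Date.UTC(year, month - 1, day));
  // Reject dates such as 2025-02-30, which Date.UTC would silently roll over.
  if (
    from.getUTCFullYear() !== year ||
    from.getUTCMonth() !== month - 1 ||
    from.getUTCDate() !== day
  ) {
    throw new Error(`Not a valid calendar date: ${date}`);
  }
  const to = new Date(Date.UTC(year, month - 1, day + 1));
  return { from, to };
}

// A slot matches the requested date if its start falls inside that day;
// its end is not considered.
function startsOnDate(slotStart: Date, date: string): boolean {
  const { from, to } = dayRangeUtc(date);
  const t = slotStart.getTime();
  return from.getTime() <= t && t < to.getTime();
}

const lateSlotStart = new Date("2025-01-12T23:30:00Z");
console.log(startsOnDate(lateSlotStart, "2025-01-12")); // true
console.log(startsOnDate(lateSlotStart, "2025-01-13")); // false
```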