# Assignment

This assignment is designed to take a day's worth of work (8 hours).
If you find that some things are not possible to complete in that timeframe,
that's okay – prioritize as you see fit.

## Datawrapper Test Project: Backend

### Overview

The project is to create a web service that takes screenshots of websites.

The service should expose an API that a client can use to create a new
screenshot request, with any given URL as a parameter.
The web service should then return a status URL that the client can
periodically call to receive updates on the status of its request.
The web service should take a screenshot of the provided URL.
Once the screenshot has been taken, it should be possible for the client to retrieve it.
The service should be designed in a scalable fashion, so that it can handle
varying amounts of parallel requests in a stable and efficient fashion.

The API notation can be defined as you see fit.

### Technology stack

* Server-side code should be written in JavaScript or TypeScript
* It should be possible to run the application locally (ideally as a
  containerized application, but other technologies are allowed as well)
* Any other tools, technologies or frameworks can be used at your discretion
* A hosted version of the application to test is a plus, but not a must

### Delivery

The project should be delivered as a GitHub repository.
It should be possible to run the service locally.

# Solution

## Service boilerplate

I'd say that this is a very odd test assignment, since it requires one to
create an entirely new service, which involves a lot of boilerplate and takes
most of the allotted time.
This assignment effectively tests whether the applicant is capable of creating a new
service on their own, from scratch, which is (hopefully) very unrepresentative
of the actual job responsibilities; typically, companies have established
procedures for creating new services, often with templates and starters,
and they definitely have an established stack and dependencies.

At my current company we're using `tsoa` for microservices, with our own
templates and starters.
However, creating a new `tsoa` project from scratch, without using these
templates, and without looking at the code belonging to the company,
would demand a lot of time.
So I decided to search for an alternative, and settled on NestJS for the
purpose of this assignment.

Note that this is my first experience with NestJS, so I may violate some best
practices simply because I'm not aware of them.
NestJS also turned out to be surprisingly powerful for creating small
proof-of-concept services.

## Scalability

The requirement that the service should be scalable and handle high load calls
for an obvious solution: queues.
If this were an actual service and not an MVP, the architecture would probably
look like this:

* Public service accepts a screenshotting job and stores it in a queue;
* A number of private workers (serverless functions, or actual private services)
  pick up jobs from the queue, perform computation-heavy page rendering and
  screenshotting, and store screenshots in the same queue/DB;
* Public service, when queried for the status of a specific job, returns its status
  along with the screenshot (if the job is already completed).

This way, quality of service would never degrade for the public service;
since it just adds jobs to the queue and checks their status (which are relatively
lightweight tasks), it would always be responsive.

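
The split between the public service and the workers can be sketched with an in-memory queue standing in for Redis. This is only an illustration under that assumption, not the code from this repository: `JobStore`, `submit`, `take`, `runWorker` and the `render` callback are hypothetical names, and a real deployment would use a shared queue so that workers can run outside the public service.

```typescript
// Hypothetical sketch: an in-memory stand-in for the Redis-backed queue.
type JobStatus = 'queued' | 'active' | 'completed' | 'failed';

interface Job {
  id: string;
  url: string;
  status: JobStatus;
  result?: Uint8Array; // screenshot bytes, present once completed
  error?: string;
}

class JobStore {
  private jobs = new Map<string, Job>();
  private pending: string[] = [];
  private nextId = 1;

  // Public service: accept a job and return immediately.
  submit(url: string): Job {
    const job: Job = { id: String(this.nextId++), url, status: 'queued' };
    this.jobs.set(job.id, job);
    this.pending.push(job.id);
    return job;
  }

  // Public service: cheap status lookup.
  get(id: string): Job | undefined {
    return this.jobs.get(id);
  }

  // Worker: claim the oldest queued job, if any.
  take(): Job | undefined {
    const id = this.pending.shift();
    if (id === undefined) return undefined;
    const job = this.jobs.get(id);
    if (job !== undefined) job.status = 'active';
    return job;
  }
}

// Worker loop: drains the queue, running the expensive `render` step per job.
async function runWorker(
  store: JobStore,
  render: (url: string) => Promise<Uint8Array>,
): Promise<void> {
  for (;;) {
    const job = store.take();
    if (job === undefined) return; // queue is empty
    try {
      job.result = await render(job.url);
      job.status = 'completed';
    } catch (err) {
      job.error = err instanceof Error ? err.message : String(err);
      job.status = 'failed';
    }
  }
}
```

Only `runWorker` touches the expensive rendering step; `submit` and `get` stay cheap, which is the property the public service relies on.
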
Depending on the load (and budget considerations), new workers could be
added to reduce the queue length and the amount of time clients would need to
wait for their jobs to be processed.

In order to further reduce load on the public service, it would make sense to
use a push model instead of polling; when submitting a new job, clients
could provide a callback URL to be notified once the job is completed,
instead of constantly checking its status.
(Unfortunately, I was unable to implement this within the 8-hour time limit.)

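
Since the push model is not implemented in this project, the following is only a sketch of what the notification step might look like. All names (`CompletedJob`, `callbackUrl`, `HttpPost`, `notifyCompletion`) are hypothetical, and the HTTP call is passed in as a function so that the sketch does not depend on any particular HTTP client.

```typescript
// Hypothetical sketch of the push model; none of these names exist in the project.
interface CompletedJob {
  id: string;
  status: 'completed' | 'failed';
  callbackUrl?: string; // supplied by the client when the job is created
}

// Minimal HTTP abstraction: resolves to the response status code.
type HttpPost = (url: string, body: string) => Promise<number>;

// Notifies the client, retrying a few times on failure.
// Returns true if the callback was delivered (2xx response), false otherwise.
async function notifyCompletion(
  job: CompletedJob,
  post: HttpPost,
  maxAttempts = 3,
): Promise<boolean> {
  if (!job.callbackUrl) return false; // client chose to poll instead
  const body = JSON.stringify({ id: job.id, status: job.status });
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const code = await post(job.callbackUrl, body);
      if (code >= 200 && code < 300) return true;
    } catch {
      // Network error: fall through and retry.
    }
  }
  return false;
}
```

A worker would call `notifyCompletion` right after marking a job as finished. The sketch retries immediately; a real implementation would probably wait between attempts and keep the polling endpoint as a fallback.
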
The architecture of this proof-of-concept project roughly resembles the ideal
architecture described above, except that the private workers live inside the
public service itself, negating all the potential scalability benefits.
However, it seems that with NestJS, workers could be trivially extracted from
the public service, as the communication between public controllers and workers
is done via Redis in this project.

## Screenshotting

The first idea that came to my mind was that there ought to be a headless Chrome
browser intended for testing.
And sure enough, there is: the `puppeteer` package comes with a headless Chrome and
can take screenshots.
This should be enough for the proof-of-concept.

In order to showcase the API capabilities, I originally intended to allow
clients to set additional options for screenshotting.
Unfortunately, I only had time to implement support for `imageType` (which
can be `jpeg` or `png`).

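
As an illustration of how `imageType` could be validated and passed to the `type` option of puppeteer's `page.screenshot()`, here is a minimal sketch. It is not the project's actual code: `PageLike` is a hypothetical interface covering only the two page methods used here (`goto` and `screenshot`), and `parseImageType` and `takeScreenshot` are illustrative names. Defaulting to PNG when the option is omitted is also an assumption.

```typescript
// Hypothetical sketch; `PageLike` mirrors only the subset of a puppeteer page used here.
type ImageType = 'png' | 'jpeg';

interface PageLike {
  goto(url: string): Promise<unknown>;
  screenshot(options: { type: ImageType }): Promise<Uint8Array>;
}

// Validates untrusted client input, defaulting to PNG when omitted (assumed default).
function parseImageType(value: string | undefined): ImageType {
  if (value === undefined) return 'png';
  if (value === 'png' || value === 'jpeg') return value;
  throw new Error(`Unsupported imageType: ${value}`);
}

// Loads the URL and captures the page in the requested format.
async function takeScreenshot(
  page: PageLike,
  url: string,
  imageType?: string,
): Promise<Uint8Array> {
  const type = parseImageType(imageType); // reject bad input before navigating
  await page.goto(url);
  return page.screenshot({ type });
}
```

With puppeteer installed, the `page` argument is meant to be a page obtained from `puppeteer.launch()` followed by `browser.newPage()`. Keeping the dependency behind a small interface also makes the function easy to test with a fake page.
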
## Tests

Unfortunately, I didn't have enough time to ensure reasonable coverage, as I had
to spend most of these 8 hours getting acquainted with NestJS.

There is an end-to-end test that covers the basic positive scenario.
Unfortunately, it requires Redis to be up and running; I did not have enough
time to decouple it from Redis.

Also note that the tests check some of the results against image snapshots stored in
the repository, and screenshots on your platform may of course look different from
those on mine, which can cause tests to fail (unless the snapshots are recreated).
Tests pass on my system.

## Usage

In addition to using this project as an API, you can also use the included
single-page application (available under `http://localhost:3000/spa` or similar).

Just enter the target URL, create the job, and check its status.
Once the status changes to `'completed'`, you will see the screenshot.

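
The same create-then-poll flow can be driven from a script. The endpoint paths are not listed in this README (see the API docs served by the app), so the sketch below takes the status request as a function instead of hard-coding a URL. `JobState`, `screenshotUrl`, `waitForScreenshot` and the `'failed'` status value are assumptions; only `'completed'` is taken from the text above.

```typescript
// Hypothetical client-side sketch; the transport function is supplied by the caller.
interface JobState {
  status: string; // this README only mentions the value 'completed'
  screenshotUrl?: string; // assumed field name
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls `getStatus` until the job is completed or the attempts run out.
async function waitForScreenshot(
  getStatus: () => Promise<JobState>,
  intervalMs = 1000,
  maxAttempts = 30,
): Promise<JobState> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const state = await getStatus();
    if (state.status === 'completed') return state;
    // 'failed' is an assumed status value, not confirmed by this README.
    if (state.status === 'failed') throw new Error('Screenshot job failed');
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  throw new Error(`Job not completed after ${maxAttempts} attempts`);
}
```

Here `getStatus` would wrap an HTTP GET on the status URL returned when the job was created. Capping the number of attempts keeps a stuck job from blocking the client forever.
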
## Other considerations

Since I did not have enough time to actually write the code (instead spending
most of it on learning NestJS), there are some issues with the code.

In particular, there are basically no tests besides one e2e test for the most
basic scenario, and there is some unneeded code duplication.
Additionally, class naming could be better, and `ScreenshotsController`
contains a lot of code that does not belong in it (ideally, it would all be in
some `ScreenshotsService`, with `ScreenshotsController` just being an adapter
between that service and the NestJS/API interface).

# Commands

## Configuration

Check `.env`

## Installation

```bash
$ npm install
```

## Running the app

```bash
# development
$ npm run start

# watch mode
$ npm run start:dev

# production mode
$ npm run start:prod
```

After starting the app, navigate to the displayed URL (`http://localhost:3000`)
in order to check out the API docs and the basic single-page screenshotting
application.

## Test

```bash
# unit tests
$ npm run test

# e2e tests
$ npm run test:e2e

# test coverage
$ npm run test:cov
```

**Note that you need to have Redis running locally in order to run e2e tests.**