Inga 🏳‍🌈 c16b30b777 added redis disclaimer for npm start 3 months ago
client added solution description 3 months ago
src increased viewport for improved screenshotting experience 3 months ago
test increased viewport for improved screenshotting experience 3 months ago
.eslintrc.js created empty NestJS project 3 months ago
.gitignore created empty NestJS project 3 months ago
.prettierrc created empty NestJS project 3 months ago
LICENSE Initial commit 3 months ago
README.md added redis disclaimer for npm start 3 months ago
nest-cli.json configured openapi docs 3 months ago
package-lock.json enabled request validation 3 months ago
package.json enabled request validation 3 months ago
tsconfig.build.json screenshotter implemented 3 months ago
tsconfig.json connected queue to screenshotter; first functional version 3 months ago

README.md

Assignment

This assignment is designed to take a day's worth of work (8 hours). If you find that some things are not possible to complete in that timeframe, that's okay – prioritize as you see fit.

Datawrapper Test Project: Backend

Overview

The project is to create a web service that takes screenshots of websites.

The service should expose an API that a client can use to create a new screenshot request, with any given URL as a parameter. The web service should then return a status URL that the client can periodically call to receive updates on the status of its request. The web service should take a screenshot of the provided URL. Once the screenshot has been taken, it should be possible for the client to retrieve it. The service should be designed in a scalable fashion, so that it can handle varying amounts of parallel requests in a stable and efficient manner.

The API notation can be defined as you see fit.

Technology stack

  • Server-side code should be written in JavaScript or TypeScript
  • It should be possible to run the application locally (ideally as a containerized application, but other technologies are allowed as well)
  • Any other tools, technologies or frameworks can be used at your discretion
  • A hosted version of the application to test is a plus, but not a must

Delivery

The project should be delivered as a GitHub repository. It should be possible to run the service locally.

Solution

Service boilerplate

I'd say that this is a very odd test assignment, since it requires one to create an entirely new service, which involves a lot of boilerplate and takes most of the allotted time. This assignment effectively tests whether the applicant is capable of creating a new service on their own, from scratch, which is (hopefully) very unrepresentative of the actual job responsibilities. Typically, companies have established procedures for creating new services; they often have templates and starters, and they definitely have an established stack and dependencies.

At my current company we're using tsoa for microservices, with our own templates and starters. However, creating a new tsoa project from scratch, without using these templates, and without looking at the code belonging to the company, would demand a lot of time. So I decided to search for an alternative, and settled on NestJS for the purpose of this assignment.

Note that this is my first experience with NestJS, so I may violate some best practices, just because I'm not aware of them. NestJS also turned out to be surprisingly powerful for creating small proof-of-concept services.

Scalability

The requirement that the service should be scalable and handle high load calls for an obvious solution: queues. If this were an actual service and not an MVP, the architecture would probably look like this:

  • The public service accepts a screenshotting job and stores it in the queue;
  • A number of private workers (serverless functions, or actual private services) pick up jobs from the queue, perform computation-heavy page rendering and screenshotting, and store screenshots in the same queue/DB;
  • The public service, when queried for the status of a specific job, returns its status along with the screenshot (if the job is already completed).

This way, quality of service would never degrade for the public service; since it just adds jobs to the queue and checks their status (which are relatively lightweight tasks), it would always be responsive.

And, depending on the load (and budget considerations), new workers could be added to reduce the queue length, and the amount of time clients would need to wait for their jobs to be processed.
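The job lifecycle described above can be illustrated with a minimal sketch. It uses an in-memory store as a stand-in for the shared queue/DB; the `JobStore` class, its method names and the status values other than `completed` are hypothetical and are not taken from this project.

```typescript
// Hypothetical in-memory stand-in for the shared queue/DB. A real deployment
// would use an external store (e.g. Redis) so that workers can run elsewhere.
type JobStatus = "queued" | "active" | "completed" | "failed";

interface Job {
  id: number;
  url: string;
  status: JobStatus;
  screenshot?: string; // e.g. base64-encoded image data
}

class JobStore {
  private jobs = new Map<number, Job>();
  private pending: number[] = []; // FIFO list of queued job ids
  private nextId = 1;

  // Public service: accept a job and return immediately.
  submit(url: string): number {
    const id = this.nextId++;
    this.jobs.set(id, { id, url, status: "queued" });
    this.pending.push(id);
    return id;
  }

  // Worker: claim the oldest queued job, if there is one.
  claimNext(): Job | undefined {
    const id = this.pending.shift();
    if (id === undefined) return undefined;
    const job = this.jobs.get(id)!;
    job.status = "active";
    return job;
  }

  // Worker: store the result of a previously claimed job.
  complete(id: number, screenshot: string): void {
    const job = this.jobs.get(id);
    if (!job || job.status !== "active") {
      throw new Error(`job ${id} is not active`);
    }
    job.status = "completed";
    job.screenshot = screenshot;
  }

  // Public service: cheap status lookup.
  getStatus(id: number): Job | undefined {
    return this.jobs.get(id);
  }
}
```

In this sketch, `submit` and `getStatus` are the only operations the public service needs, and both are constant-time lookups; the expensive rendering happens between `claimNext` and `complete`, which is the part that could be moved to separate workers.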

In order to further reduce load on the public service, it would make sense to use a push model instead of polling: when submitting a new job, clients could provide a callback URL to be notified once the job is completed, instead of constantly checking its status. (Unfortunately, I was unable to implement this within the 8-hour time limit.)
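Since the push model is not implemented in this project, the following is only a sketch of how such a notification could work. The `JobResult` shape, the `PostFn` transport type and the `notifyClient` function are all hypothetical names introduced for illustration.

```typescript
// Hypothetical payload and transport types; none of these names come from the project.
interface JobResult {
  jobId: string;
  status: "completed" | "failed";
}

// The HTTP transport is injected so that the sketch does not depend on a specific client.
type PostFn = (url: string, body: string) => Promise<void>;

// Notify the client once and report whether delivery succeeded, so that the
// caller knows whether the client still has to rely on polling.
async function notifyClient(
  callbackUrl: string | undefined,
  result: JobResult,
  post: PostFn,
): Promise<boolean> {
  if (!callbackUrl) return false; // the client did not ask for a callback
  try {
    await post(callbackUrl, JSON.stringify(result));
    return true;
  } catch {
    return false; // delivery failed; the status endpoint remains available
  }
}
```

On Node 18 or later, `post` could be a thin wrapper around the global `fetch` with `method: "POST"`. Keeping the status endpoint as a fallback means that a failed callback delivery does not lose the result.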

The architecture of this proof-of-concept project roughly resembles the ideal architecture described above, except that the private workers live in the same process as the public service, negating all the potential scalability benefits. However, it seems that with NestJS, workers could be trivially extracted from the public service, as the communication between public controllers and workers is done via Redis in this project. (Using Redis was not a conscious choice; it is just that NestJS's integrated queues use Redis internally and do not support other storage/queue backends.)

Screenshotting

The first idea that came to my mind was that there ought to be a headless Chrome browser intended for testing. And sure enough, there is: the puppeteer package comes with a headless Chrome and allows taking screenshots. This should be enough for the proof-of-concept.

In order to showcase the API capabilities, I originally intended to allow clients to set additional options for screenshotting. Unfortunately, I only had time to implement support for imageType (which could be jpeg or png).
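A minimal sketch of the screenshotting step with a configurable `imageType` might look like the following. The `PageLike` and `BrowserLike` interfaces are hypothetical; they declare only the small subset of puppeteer's `Browser` and `Page` methods used here, so that the example stays self-contained. This is not the code from this repository.

```typescript
type ImageType = "jpeg" | "png";

// Minimal subset of the puppeteer Browser/Page surface used below, declared
// locally (an assumption for illustration) so the sketch has no dependencies.
interface PageLike {
  goto(url: string, options?: { waitUntil?: "load" | "networkidle2" }): Promise<unknown>;
  screenshot(options: { type: ImageType }): Promise<Uint8Array>;
  close(): Promise<void>;
}

interface BrowserLike {
  newPage(): Promise<PageLike>;
}

// Open a new tab, load the page, capture it, and always close the tab.
async function takeScreenshot(
  browser: BrowserLike,
  url: string,
  imageType: ImageType = "png",
): Promise<Uint8Array> {
  const page = await browser.newPage();
  try {
    // Wait until network activity has mostly settled before capturing.
    await page.goto(url, { waitUntil: "networkidle2" });
    return await page.screenshot({ type: imageType });
  } finally {
    await page.close(); // release the tab even if navigation or capture fails
  }
}
```

With puppeteer installed, the object returned by `puppeteer.launch()` would be passed as `browser`, assuming its types are structurally compatible with the interfaces above. Reusing one browser instance across jobs and opening a tab per job avoids paying the browser startup cost for every screenshot.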

Tests

Unfortunately, I didn't have enough time to ensure reasonable coverage, as I had to spend most of these 8 hours getting myself acquainted with NestJS.

There is an end-to-end test that covers the basic positive scenario. Unfortunately, it requires Redis to be up and running; I did not have enough time to decouple it from Redis.

Also note that the tests check some of the results against image snapshots stored in the repository, and of course screenshots on your platform may look different than on mine, which can cause tests to fail (unless the snapshots are recreated). Tests pass on my system.

Usage

In addition to using this project as an API, you can also use the included single-page application (available under http://localhost:3000/spa or similar).

Just enter the target URL, create the job, check its status. Once the status changes to 'completed', you will see the screenshot.
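An API client would follow the same create-then-poll flow. The sketch below shows only the polling part; the `waitForJob` function and the status names other than `completed` are hypothetical, and the actual status request is injected as `fetchStatus` because the endpoint paths are documented in the API docs rather than here.

```typescript
// Status values are assumptions, except for "completed", which the app reports.
type Status = "queued" | "active" | "completed" | "failed";

// Poll until the job reaches a terminal state or the attempts run out.
async function waitForJob(
  fetchStatus: () => Promise<Status>,
  intervalMs = 1000,
  maxAttempts = 30,
): Promise<Status> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await fetchStatus();
    if (status === "completed" || status === "failed") return status;
    // Pause between polls to avoid hammering the status endpoint.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("job did not finish in time");
}
```

The bounded number of attempts keeps a client from polling forever if a job is lost.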

Note that this is not intended to be a nice or beautiful application; on the contrary, this is something very rough and barely working, quickly hacked together simply to demonstrate how the API can be used, with the goal of spending as little time on it as possible (so it had to be plain JS, without types, without any front-end frameworks, etc.).

Other considerations

Since I did not have enough time to actually write the code (instead spending most of it basically on learning NestJS), there are some issues with the code.

In particular, there are basically no tests besides one e2e test for the most basic scenario, and there is some unneeded code duplication. Additionally, class naming could be better, and ScreenshotsController contains a lot of code that does not belong in it (ideally, it would all be in some ScreenshotsService, with ScreenshotsController just being an adapter between that service and the NestJS/API interface).

Commands

Configuration

Check .env

Installation

$ npm install

Running the app

# development
$ npm run start

# watch mode
$ npm run start:dev

# production mode
$ npm run start:prod

Note that you need to have Redis running locally in order to run the app.

After starting the app, navigate to the displayed URL (http://localhost:3000) in order to check out the API docs and the basic single-page screenshotting application.

Test

# unit tests
$ npm run test

# e2e tests
$ npm run test:e2e

# test coverage
$ npm run test:cov

Note that you need to have Redis running locally in order to run the e2e tests.