I wrote the code and deployed it to an Ubuntu VM in Azure within a single evening of hacking. Docker and the docker-compose tool made the deployment and update process extremely quick.
If you've already been through the [Hands-On Docker tutorial][1] then you will have experience linking Docker containers on the command line. Linking a Node hit counter to a Redis server on the command line may look like this:
Explicit linking through `--link` is just about manageable with a couple of containers, but can get out of hand as we add more tiers or containers to the application.
The docker-compose tool is part of the standard Docker Toolbox and can also be downloaded separately. It provides a rich set of features to configure all of an application's parts through a plain-text YAML file.
From Docker 1.10 onwards we can take advantage of network overlays to help us scale out across multiple hosts. Prior to this, linking only worked on a single host. The `docker-compose scale` command can be used to bring on more computing power as the need arises.
There is a huge buzz around the Raspberry Pi Zero - a tiny microcomputer with a 1GHz CPU and 512MB RAM capable of running full Linux, Docker, Node.js, Ruby and many other popular open-source tools. One of the best things about the Pi Zero is that it costs only 5 USD. That also means that stock gets snapped up really quickly.
I found a webpage which used screen scraping to find whether 4-5 of the most popular outlets had stock.
- The site served a static HTML page
- The page issued one XMLHttpRequest per outlet to /public/api/
- The server issued the HTTP request to each shop and performed the scraping
Every call to /public/api/ took 3 seconds to execute, and using Apache Bench (ab) I was only able to get through 0.25 requests per second.
### Reinventing the wheel
The retailers didn't seem to mind whereismypizero.com scraping their sites for stock, so I set about writing a similar tool from the ground up. My intention was to handle a much higher number of requests per second through caching and by de-coupling the scrape from the web tier. Redis was the perfect tool for the job. It allowed me to set an automatically expiring key/value pair (i.e. a simple cache) and also to transmit messages between Node processes through pub/sub.
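To make the "automatically expiring key/value pair" idea concrete, here is a minimal sketch in plain Node.js. It is an in-memory stand-in, not the project's code and not a Redis client: the `ExpiringCache` class and its injectable clock are hypothetical names I made up for illustration. In the real system Redis does this work for us.

```javascript
// In-memory stand-in for a Redis key with an expiry.
// ExpiringCache is a hypothetical helper for illustration only.
class ExpiringCache {
  constructor(now = () => Date.now()) {
    this.now = now;           // injectable clock, handy for testing expiry
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  // Store a value that disappears after ttlSeconds.
  set(key, value, ttlSeconds) {
    this.entries.set(key, {
      value,
      expiresAt: this.now() + ttlSeconds * 1000,
    });
  }

  // Return the value, or null if it was never set or has expired.
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return null;
    }
    return entry.value;
  }
}
```

A caller that gets `null` back treats it as a cache miss and has to ask for fresh data, which is where the messaging side comes in.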
> Fork or star the code on GitHub: [alexellis/pi_zero_stock][4]
If you've worked with Node.js before then you will know it is single-threaded and that any CPU intensive tasks such as parsing HTML or JSON could lead to a slow-down. One way to mitigate that is to use a second worker process and a Redis messaging channel as connective tissue between this and the web tier.
- Web tier
  - Gives 200 for cache hit (Redis key exists for store)
  - Gives 202 for cache miss (Redis key doesn't exist, so issues message)
  - Since we are only ever reading a Redis key the response time is very quick.
- Stock Fetcher
  - Performs HTTP request
  - Scrapes for different types of web stores
  - Updates a Redis key with a cache expire of 60 seconds
  - Also locks a Redis key to prevent too many in-flight HTTP requests to the web stores.
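The flow between the two tiers can be sketched roughly as follows. This is not the code from the repository: a `Map`, a `Set` and Node's built-in `EventEmitter` stand in for the Redis keys, the lock and the pub/sub channel, and names such as `handleStockRequest`, `startFetcher` and the `'fetch'` event are assumptions made up for the example.

```javascript
const { EventEmitter } = require('events');

// Stand-ins for Redis (assumptions for this sketch only).
const cache = new Map();            // store name -> { body, expiresAt }
const locks = new Set();            // stores with a scrape in flight
const channel = new EventEmitter(); // plays the role of the pub/sub channel
const CACHE_SECONDS = 60;

// Web tier: only ever reads the cache, so it responds quickly.
function handleStockRequest(store, now = Date.now()) {
  const entry = cache.get(store);
  if (entry && entry.expiresAt > now) {
    return { status: 200, body: entry.body }; // cache hit
  }
  channel.emit('fetch', store);               // cache miss: message the worker
  return { status: 202, body: null };
}

// Stock fetcher: scrapes a store, with at most one request in flight per store.
function startFetcher(scrape, now = () => Date.now()) {
  channel.on('fetch', async (store) => {
    if (locks.has(store)) return;             // a scrape is already running
    locks.add(store);
    try {
      const body = await scrape(store);
      cache.set(store, { body, expiresAt: now() + CACHE_SECONDS * 1000 });
    } finally {
      locks.delete(store);                    // always release the lock
    }
  });
}
```

The first request for a store returns 202 and triggers a scrape; once the worker has written the result, later requests get 200 straight from the cache until the 60 seconds run out.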
```
version: "2.0"
services:
  web:
    build: ./web/
    ports:
      - "3000:3000"
  stock_fetch:
    build: ./stock_fetch/
  redis:
    image: redis
```
*The docker-compose.yml file from the example.*
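With a version 2 compose file, the services share a network and can reach each other by service name, so the Node containers can find Redis at the hostname `redis`. A small sketch of how a service might build its connection URL is below. The `REDIS_HOST` and `REDIS_PORT` environment variables and the `redisUrl` function are hypothetical, added only so the defaults can be overridden outside of Compose; 6379 is the standard Redis port.

```javascript
// Build the Redis connection URL for a container started by docker-compose.
// REDIS_HOST and REDIS_PORT are hypothetical overrides, not part of the project.
function redisUrl(env = process.env) {
  const host = env.REDIS_HOST || 'redis'; // service name from docker-compose.yml
  const port = env.REDIS_PORT || '6379';  // default Redis port
  return `redis://${host}:${port}`;
}
```

The resulting string can then be handed to whichever Redis client library the service uses.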
Once I had this working locally, deploying to an Ubuntu 16.04 image in the cloud (Azure) took less than 5 minutes. I logged in, cloned the repository and typed in `docker-compose up -d`. That was all it took - rapid prototyping a whole system doesn't get much better. Anyone (including the owner of whereismypizero.com) can deploy the new solution with just two lines: