Automated screenshots with Puppeteer & Headless Chrome

I’ve finally managed to ditch PhantomJS: it just didn’t want to run on a Raspberry Pi without more attention than I could afford to give it. An afternoon of tinkering later and I’m pretty happy with the replacement. I wrote the code on Windows 10 and then shipped it over to a Raspberry Pi, where it ran with no amendments required. The end game for this code is to sink it into Kodi Kontroller, but for the moment it’s running on the same Pi quite happily.

Run through these steps:

Install Node.js
Ensure npm is also installed
git clone
cd node-screengrabber
sudo nano hg-cron.js

At this point you’ll want to add your own parameters into the file; see the black highlighted areas, which should all be self-explanatory.

await page.setViewport({width: 1920, height: 1080});
await page.goto('YOUR-URL-HERE', {waitUntil: 'networkidle0', timeout: 60000});
await page.screenshot({path: 'c:/folder/file.png'});

You can remove the additional repetitions of the above lines if you only want to screenshot one page. Lower down the file you’ll also see an example where I only use a portion of the webpage. Update the orange parts again; it takes a bit of trial and error to get it perfect. x and y position the top-left corner of the viewing rectangle within the larger webpage, and width and height set its size from that point.
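To take some of the trial and error out of the clip values, it can help to note the top-left and bottom-right corners of the area you want (for example from the browser’s dev tools) and derive the rectangle from those. This is a small sketch of that arithmetic; clipFromCorners is a hypothetical helper of my own, not something Puppeteer provides.

```javascript
// Sketch only: clipFromCorners is an illustrative helper, not part of Puppeteer.

// Convert two corner points, measured in page pixels, into the
// { x, y, width, height } shape used by the screenshot `clip` option.
function clipFromCorners(left, top, right, bottom) {
  if (right <= left || bottom <= top) {
    throw new RangeError('bottom-right corner must be below and right of top-left');
  }
  return { x: left, y: top, width: right - left, height: bottom - top };
}

// The region used in this post: top-left at (0, 0), 1920 wide, 675 tall.
const clip = clipFromCorners(0, 0, 1920, 675);
console.log(clip); // { x: 0, y: 0, width: 1920, height: 675 }

// Then, inside the capture script:
// await page.screenshot({ path: 'c:/folder/file.png', clip });
```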

await page.goto('YOUR-URL-HERE', {waitUntil: 'networkidle0', timeout: 60000});
await page.screenshot({
  path: 'c:/folder/file.png',
  clip: {
    x: 0,        // left edge of the clip within the page
    y: 0,        // top edge of the clip within the page
    width: 1920,
    height: 675
  }
});

Save the file and exit nano.

npm install

The above command, if run from within the /node-screengrabber folder, should install everything you need. This may take a few minutes depending on the platform and hardware you’re running; on a Raspberry Pi it could take ten minutes or so. If you are running Windows you’ll likely need to restart to ensure the environment variable is active.

This is the original script I put together:

Multiple files in one go:

You can see the hg-cron file is almost identical, just expanded.

node hg-cron.js

This will run the capture and allow you to check all is working. I have yet to remove the cron function from hg-cron.js, so for now it’ll run on every 5 minute interval; once this is removed you’ll get an instant output.

Things of note in this code:

  1. clipRect is replaced with x & y coords
  2. width and height remain the same
  3. I used waitUntil: 'networkidle0', timeout: 60000 to wait for the content to fully load before taking the shot: it waits until 0 network connections remain, with a 60s timeout. This is much better than the previous model of waiting an exact amount of time.
  4. It’s faster than PhantomJS and just seems to render pages much better.
  5. Added CRON 🙂
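On point 3: Puppeteer treats 'networkidle0' as reached once there have been no network connections for at least 500 ms, and if that doesn’t happen within the timeout the goto call rejects. When capturing several pages you may not want one slow page to stop the rest. The following is a rough sketch of one way to handle that; gotoOrSkip is a made-up name, and it assumes the timeout surfaces as an error whose name is 'TimeoutError'.

```javascript
// Sketch only: gotoOrSkip is an illustrative name. It assumes Puppeteer's
// navigation timeout surfaces as an error whose name is 'TimeoutError'.

// Navigate and wait for the network to go quiet. Returns true when the page
// settled in time and false when the timeout was hit; other errors propagate.
async function gotoOrSkip(page, url, timeout = 60000) {
  try {
    await page.goto(url, { waitUntil: 'networkidle0', timeout });
    return true;
  } catch (err) {
    if (err && err.name === 'TimeoutError') {
      console.warn(`timed out after ${timeout} ms: ${url}`);
      return false;
    }
    throw err;
  }
}

// Then, inside the capture script:
// if (await gotoOrSkip(page, 'YOUR-URL-HERE')) {
//   await page.screenshot({ path: 'c:/folder/file.png' });
// }
```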

I’ve decided to ditch the cron within the script. Infrastructure-wise it’s easier to have the code run and complete, and be triggered by a bash script running under the system cron.d. I’ve also had some help from another developer in the office to handle Power BI dashboards and reports. That code is here. I’m putting it together slowly as a webapp, but I need to get my head around that in a larger context first, so it’s likely packaged incorrectly in some way. It’ll install fine, but you may need to take additional steps to get it running.

The whole project lives here:

If you’ve got your head around the simpler version of the code you’ll be able to follow what the Power BI version is doing; add a few variables and all will be well.