Getting into Puppeteer : Inject | Interact | Keys | Capture | Select

Girish patil
HackerNoon.com
9 min read · Nov 13, 2017

By the end of this article we will have learned how to:

  1. Inject custom JavaScript functions into another page’s context.
  2. Interact with forms to automate tasks.
  3. Capture screenshots of particular elements on the page.
  4. Use keyboard/mouse events to type into and click selected elements.
  5. Capture screenshots of whole pages.
  6. Select elements from the DOM.
  7. Access cookies.

Introduction

Headless browsers are making a real contribution to browser automation and testing. Most are used for unit and end-to-end testing, while a few are a good fit for general browser automation. Headless (or “invisible”) browsers have all the functionality of a normal browser but run without a visible window and are driven from the command line or from code. They are mostly used for testing, but can also be used for web scraping, capturing screenshots, and injecting scripts to automate interaction with websites.

In this article we will go through Puppeteer, the newly launched Node API for Google Chrome from the Chrome DevTools team. Alongside Puppeteer there are other recently released tools, including Chromeless, Chrominator, Chromy, Navalia, Lambdium, and Nightmare (which is similar to Puppeteer but uses Electron behind the scenes, while Puppeteer is built solely on Chromium), as well as the good old PhantomJS, a scriptable headless WebKit browser long used for testing.

About puppeteer

“Puppeteer is a Node library which provides a high-level API to control headless Chrome over the DevTools Protocol. It can also be configured to use full (non-headless) Chrome,” is how the Chrome DevTools team describes it.

Puppeteer can be used to

  • Automate visual testing.
  • Generate screenshots and PDFs of websites without opening a browser window.
  • Submit forms and simulate browser usage.
  • Scrape websites.
  • Capture a timeline trace.

and many more.

Let’s get started with the basic usage of Puppeteer.

You can also give it a try in the Puppeteer playground at https://try-puppeteer.appspot.com/

Installing Puppeteer is not tricky at all. I will walk you through the quick setup.

On an Ubuntu machine

sudo apt-get update

Install Node v8.4.0 from NodeSource. Find the setup for your system at https://github.com/nodesource/distributions

Along with that, you need to have these too:

On Ubuntu

Install npm

Now, in your working directory, run

npm i puppeteer

and you are good to go. Running the above command installs Puppeteer along with a recent version of Chromium. Remember that you do not need a display: because it is a headless browser, it can run on servers and command-line-only machines. If you run into problems, see the troubleshooting guide at https://github.com/GoogleChrome/puppeteer/blob/master/docs/troubleshooting.md and, if you are on Ubuntu Xenial, follow my lead at https://github.com/GoogleChrome/puppeteer/issues/290#issuecomment-324838511

Getting started

Let’s generate a screenshot of a website at iPad Pro dimensions (768px × 1024px, as per http://screensiz.es/).

After installing Puppeteer, create a .js file in the npm project folder, let’s call it screenshot.js, and add the code below to it.

Now run the script in the terminal:

node screenshot.js

After the code runs successfully you will have a screenshot of google.com. With one more line of code you can even save the page as a PDF. Add the code below before browser.close().
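As a sketch, the PDF step comes down to a single page.pdf call. The wrapper name savePdf and the default file name google.pdf are assumptions made for illustration; page is assumed to be a Puppeteer page that has already loaded a URL.

```javascript
// Save the currently loaded page as a PDF in the given paper format.
// Assumes `page` is an open Puppeteer Page; the file name is illustrative.
async function savePdf(page, path = 'google.pdf', format = 'A4') {
  await page.pdf({ path, format });
  return path;
}
```

In screenshot.js this would be called as await savePdf(page); just before browser.close().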

Besides A4, you can use other paper formats: Letter, Legal, Tabloid, Ledger, A0, A1, A2, A3, and A5. More about it in the page.pdf section of the API docs.

Let’s use the screenshot feature in a slightly different manner. Recently, while scrolling through my Twitter feed, I saw a user ask a company that provides screen capture as a service for a new feature. What he wanted was this: when you pass an element’s ID or class, only that element should be captured instead of the whole page. So I thought of implementing it here.

To do this, we have to find the position and dimensions of that element. Using these details we can clip the screenshot: Puppeteer lets you capture just a particular area of the page. More about this API method at https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pagescreenshotoptions

We will use the $eval method of the page API to achieve this. It accepts a selector and a function to be executed in the page’s context. The element’s offsetWidth and offsetHeight give its dimensions, and offsetTop and offsetLeft give its position; together they define the area to clip.

The script below captures Smashing Magazine’s sidebar.
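A sketch of the clipping approach described above, assuming page is a Puppeteer page that has already navigated to the site. The function names are illustrative, and the selector for the sidebar is left as a parameter because the actual one is not given here.

```javascript
// Read an element's position and size inside the page's context.
// offsetLeft/offsetTop are relative to the element's offsetParent, so they
// match page coordinates only when that parent sits at the page origin.
async function getElementRect(page, selector) {
  return page.$eval(selector, (el) => ({
    x: el.offsetLeft,
    y: el.offsetTop,
    width: el.offsetWidth,
    height: el.offsetHeight,
  }));
}

// Capture only the region covered by the element.
async function screenshotElement(page, selector, path) {
  const clip = await getElementRect(page, selector);
  await page.screenshot({ path, clip });
  return clip;
}
```

A call might look like await screenshotElement(page, '.sidebar', 'sidebar.png'), where '.sidebar' is a hypothetical selector. For deeply nested elements, getBoundingClientRect() inside the evaluated function may be a more robust way to get the position.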

The captured image is the sidebar of the Smashing Magazine website. Tweaks like this are helpful in many situations, and you can adapt the approach to your own needs.

Now let’s move on to interacting with a form.

Interacting with the webpage/form

I consider this to be the most powerful part of headless browsers: interacting with the page as though it were open in a regular browser.

Let’s capture a screenshot of the big brother, https://google.com

The screenshot method of the page API does the job for us.

Checkpoint #1: Now let’s start typing into it without opening the browser. How cool is that! Before you type, you have to focus the input element, just as you would in a browser. So first, let’s find the element to focus.

Checkpoint #2: Right-click the search input and select “Inspect”. (To learn more about DevTools inspection, visit https://developer.chrome.com/devtools.) Note the id of that element; it is “lst-ib” (I don’t know what lst-ib stands for). page.click(), page.type(), and page.tap() are a few of the interaction functions among many others. Focus the element by clicking on it with page.click(selector), then start typing with page.type(“pass what needs to be typed”), and then capture the screenshot.
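A sketch of the click, type, and capture sequence, assuming a current Puppeteer release. In recent releases page.type takes the selector as its first argument, so this sketch clicks to focus and then types through page.keyboard instead. The function name and file name are illustrative, and the #lst-ib selector may no longer match Google’s markup.

```javascript
// Focus an input by clicking it, type into it, then capture the result.
// Assumes `page` is a Puppeteer Page that has already loaded the form.
async function typeAndCapture(page, selector, text, path) {
  await page.click(selector);      // clicking gives the input focus
  await page.keyboard.type(text);  // keystrokes go to the focused element
  await page.screenshot({ path });
}
```

For the search box described above, a call might be await typeAndCapture(page, '#lst-ib', 'puppeteer', 'typed.png').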

Checkpoint #3: Now we have to click the submit button or press the Enter key. Let’s explore both ways, and you will see how flexible the API is, even for handling key presses.

Now that we have typed, let’s click the “Google Search” button. Because it doesn’t have an id (check with element inspection), we have to use another selector. You can try this in the console on Google’s homepage:

gives you the submit input element.

Now, in Puppeteer, we click the button and wait for the navigation to finish, which can be done with page.waitForNavigation(), and then capture a screenshot again.
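A sketch of the click-and-wait step, assuming page is a Puppeteer page with the search text already typed. It starts page.waitForNavigation() together with the click rather than after it, a small variation meant to avoid missing a fast navigation. The function name is illustrative, and the submit selector is a parameter because it has to be found by inspection.

```javascript
// Click an element that triggers a navigation and wait for the new page.
async function clickAndWait(page, selector, path) {
  // Start waiting before the click so a fast navigation is not missed.
  await Promise.all([
    page.waitForNavigation(),
    page.click(selector),
  ]);
  await page.screenshot({ path }); // capture the results page
}
```

A call might be await clickAndWait(page, 'input[type="submit"]', 'results.png'), where the selector is a hypothetical example.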

Now let’s try searching with the Enter key instead. Starting again from checkpoint #2, notice the use of the keyboard class’s down() method.

So far we have seen how to select elements using selectors and how to use Puppeteer to interact with form fields by simulating mouse and keyboard events.
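A sketch of the Enter-key variant, assuming page is a Puppeteer page whose search input is focused and filled in. It uses keyboard.down() followed by keyboard.up(); page.keyboard.press('Enter') performs the same pair in one call. The function names are illustrative.

```javascript
// Press and release Enter on whatever element currently has focus.
async function pressEnter(page) {
  await page.keyboard.down('Enter');
  await page.keyboard.up('Enter');
}

// Submit the focused form with Enter and wait for the results page.
async function submitWithEnter(page) {
  await Promise.all([
    page.waitForNavigation(), // start listening before the key press
    pressEnter(page),
  ]);
}
```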

Running custom JavaScript functions inside the page’s context

Let’s execute a function inside the page’s context and change Google’s logo to Smashing Magazine’s.

Inspecting the logo on google.com shows that it has ‘hplogo’ as its id. We will use the $eval function of Puppeteer’s page API, which takes the selector, the function, and that function’s arguments, and evaluates the function inside the page’s context, in this case google.com.
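A sketch of swapping an image through $eval, assuming the matched element is an img tag and that page is a Puppeteer page already on the site. The function name replaceImage is illustrative, and the replacement image URL is a parameter since none is given in the text.

```javascript
// Replace the image shown by the element matching `selector`.
// The callback runs inside the page, so it can only use its own arguments.
async function replaceImage(page, selector, newSrc) {
  return page.$eval(
    selector,
    (img, src) => {
      img.removeAttribute('srcset'); // otherwise srcset can override src
      img.src = src;
      return img.src;
    },
    newSrc // passed through to the callback as `src`
  );
}
```

With the id mentioned above, a call might be await replaceImage(page, '#hplogo', logoUrl), where logoUrl is whatever image you want to show.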

Misc

Sometimes you will want to see what the browser is displaying. To run it non-headless, launch it with

const browser = await puppeteer.launch({headless: false});

This will run the script and also open a visible browser window.

Let’s access the cookies stored by google.com.

The page.cookies() method gives you access to the page’s cookies.
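A sketch of reading the cookies, assuming page is a Puppeteer page that has already loaded google.com and that page.cookies() is available in your Puppeteer version. The function name and the choice of fields to keep are illustrative.

```javascript
// Return a compact summary of the cookies set for the current page URL.
async function listCookies(page) {
  const cookies = await page.cookies(); // no URLs given: current page only
  return cookies.map(({ name, value, domain }) => ({ name, value, domain }));
}
```

Logging the result with console.log(await listCookies(page)) prints one entry per cookie.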

Output on my box.

Several other headless browser tools have appeared recently, such as Chromeless (https://github.com/graphcool/chromeless), Chrominator (https://github.com/jesg/chrominator), and many others.

Useful links

https://github.com/GoogleChrome/puppeteer/
