Getting started with PuppeteerJS
Let’s build something in puppeteer
Prefer watching a video? I made a YouTube video on this topic: https://youtu.be/KowSdMQTJeo
Puppeteer is a Node.js library that provides an API for automating a launched instance of Chrome or Chromium. Most things you can do manually in the browser can also be scripted with it. Typical uses include automating a repetitive task on a website and end-to-end testing of your application.
Today we are going to write a puppeteer script with a very specific task. The task is to
- Go to google.com
- Search for a given string
- Open the first link
- Take a full-page screenshot of the page we just opened and save it to disk.
So let’s get started.
We will start by creating a folder and initializing a node project in it.
mkdir puppeteer-getting-started
cd puppeteer-getting-started
npm init
Just hit Enter for every question npm init asks.
Once the Node project is initialized, we can install Puppeteer (npm install puppeteer works just as well if you don't use Yarn):
yarn add puppeteer
Let’s create our script
touch google.js
Let’s start by importing puppeteer and defining our main function.
Now we are going to launch the browser
puppeteer.launch returns a promise which, once resolved, gives us the launched browser instance. We set headless to false because we want to see the browser window. By default headless is true, which means everything happens in the background and no UI is shown.
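A minimal sketch of the launch step, assuming the CommonJS require style. The launchBrowser name and its injectable puppeteerLib parameter are illustrative choices, not part of the original script; the parameter exists only so the function can be tried with a stand-in object.

```javascript
// google.js (sketch): launch a visible browser window.
// `launchBrowser` and `puppeteerLib` are hypothetical names for this example.
// The default value is evaluated only when no argument is passed, so
// require('puppeteer') runs lazily on the first real call.
async function launchBrowser(puppeteerLib = require('puppeteer')) {
  // headless: false opens a real window; omit it to run in the background.
  const browser = await puppeteerLib.launch({ headless: false });
  return browser;
}
```

Inside the main function this would typically be used as `const browser = await launchBrowser();`.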
Now let’s navigate to google.com
Once the browser is launched, we create a new tab using browser.newPage(), set the viewport to a width of 1920 and a height of 1080, and finally navigate to https://www.google.com
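These three calls could look roughly like this. The openGoogle helper name is hypothetical, and browser is assumed to be the instance returned by puppeteer.launch.

```javascript
// Open a new tab, size it, and load Google.
// `openGoogle` is a hypothetical helper name used for this sketch.
async function openGoogle(browser) {
  // Each page object corresponds to one browser tab.
  const page = await browser.newPage();

  // Use a desktop-sized viewport so the page renders its desktop layout.
  await page.setViewport({ width: 1920, height: 1080 });

  // goto resolves once the page has loaded.
  await page.goto('https://www.google.com');
  return page;
}
```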
Now it’s time to start typing in some search queries.
After navigating to google.com, we find a unique CSS selector that identifies the search input on the page and pass it to page.type along with our query, steve jobs. page.type finds the element using the CSS selector provided and types our query into it.
One important note: always try to find a unique CSS selector for the element you want, because a webpage can contain many similar elements and we have to make sure we always act on the right one.
Once the query is in place, we get a handle to the search input and press Enter on it using element.press.
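A sketch of the typing step, under a few assumptions: the typeQueryAndSubmit helper name is made up for this example, and the selector in the usage line below is a guess that may not match Google's current markup, so inspect the live page before relying on it.

```javascript
// Type a query into the search box and submit it with the Enter key.
// `typeQueryAndSubmit` is a hypothetical helper name for this sketch.
async function typeQueryAndSubmit(page, inputSelector, query) {
  // page.type focuses the matching element and sends one key event per character.
  await page.type(inputSelector, query);

  // page.$ returns a handle to the first matching element, or null if none matches.
  const input = await page.$(inputSelector);
  if (input === null) {
    throw new Error(`No element matches selector: ${inputSelector}`);
  }

  // Pressing Enter on the input submits the search form.
  await input.press('Enter');
}
```

It might be called as `await typeQueryAndSubmit(page, '[name="q"]', 'steve jobs');`, where the selector is only an assumption.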
So far, so good. We have performed the search; now let’s click on the first link in the results.
We use page.waitForNavigation because we don’t want to perform the click before the search results page has loaded. If we did, page.click would not find the link, because it would not be on the page yet.
We then call page.click with the CSS selector of the element to click, and again wait for the next page to load. We wait because the next step takes a screenshot of the whole page, which is not possible until the page has loaded.
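One way to sketch the click-then-wait step is shown here. It differs slightly from awaiting the two calls one after another: the wait is started before the click, a commonly used pattern that avoids missing a navigation that finishes very quickly. The clickAndWaitForNavigation name is hypothetical.

```javascript
// Click an element and wait for the navigation that the click triggers.
// `clickAndWaitForNavigation` is a hypothetical helper name for this sketch.
async function clickAndWaitForNavigation(page, selector) {
  // Array elements are evaluated left to right, so the navigation wait is
  // registered before the click is dispatched.
  const [response] = await Promise.all([
    page.waitForNavigation(),
    page.click(selector),
  ]);
  // The main response of the new page, as resolved by waitForNavigation.
  return response;
}
```

The selector for the first result has to be found by inspecting the results page; any value passed here is an assumption about Google's markup at the time.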
Let’s take a screenshot now.
Finally, once the webpage has loaded, we take a screenshot of the whole page using page.screenshot. We provide the path where the file should be saved, and we set fullPage to true because we want to capture the entire page, not just the part visible on first render without scrolling.
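The screenshot call could be wrapped like this. The saveFullPageScreenshot name is hypothetical, and the file name in the usage line is just an example.

```javascript
// Save a screenshot of the entire scrollable page to disk.
// `saveFullPageScreenshot` is a hypothetical helper name for this sketch.
async function saveFullPageScreenshot(page, filePath) {
  await page.screenshot({
    path: filePath, // where the image file is written
    fullPage: true, // capture the whole page, not only the visible viewport
  });
  return filePath;
}
```

For example: `await saveFullPageScreenshot(page, 'screenshot.png');`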
At last, we close the browser using browser.close().
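If any step throws before the script reaches browser.close(), the browser window can be left running. One possible way to guard against that is a try/finally wrapper, sketched below. The closeWhenDone helper and its task callback are hypothetical names, not part of the original script.

```javascript
// Run a task against the browser and always close the browser afterwards.
// `closeWhenDone` and `task` are hypothetical names for this sketch.
async function closeWhenDone(browser, task) {
  try {
    // `task` is any async function that drives the browser.
    return await task(browser);
  } finally {
    // Runs whether the task succeeded or threw.
    await browser.close();
  }
}
```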
This is the simplest example I could come up with, but Puppeteer has many more methods and features that I haven’t covered in this post. Go and check them out here: https://pptr.dev/
Here is the full repo for this example https://github.com/manojsinghnegiwd/puppeteer-getting-started