Use Laravel Dusk, browser automation and PHP to programmatically surf the web

Connor Leech
Aug 10, 2018

Laravel Dusk is a powerful browser automation tool for Laravel. With Dusk you can programmatically test your own applications or visit any website on the internet using a real Chrome browser. You can automate repetitive tasks, scrape information from other sites or check that your app always works in the browser. In this tutorial we’ll go through how to create a Dusk test, log in to a mythical website and click around.

Create a new Laravel app and install Dusk:

$ laravel new dusk-scraper
$ cd dusk-scraper
$ composer require --dev laravel/dusk
$ php artisan dusk:install
Dusk scaffolding installed successfully.

In the tests/DuskTestCase.php file that Laravel generated there is a call to startChromeDriver in the prepare function. The prepare function gets called before the Dusk tests are executed. DuskTestCase is an abstract class, so it’s probably not a good place for us to put our code. Instead, we can generate a fresh Dusk test case that extends DuskTestCase with an Artisan command:

$ php artisan dusk:make ScrapeTheWebTest

This file (ScrapeTheWebTest.php) will appear in the tests/Browser directory. You can run the test with another Artisan command:

$ php artisan dusk

Right now it does not do anything useful. Here is the code to log in to a website and click some buttons:

<?php

namespace Tests\Browser;

use Tests\DuskTestCase;
use Laravel\Dusk\Browser;
use Illuminate\Foundation\Testing\DatabaseMigrations;

class ScrapeTheWebTest extends DuskTestCase
{
    // IDs of the user pages to visit after logging in
    private $user_ids;

    public function __construct($name = null, array $data = [], $dataName = '')
    {
        parent::__construct($name, $data, $dataName);

        $this->user_ids = [
            1,
            2,
            3,
        ];
    }

    /** @test */
    public function loginAndClickButton()
    {
        $this->browse(function (Browser $browser) {
            // Log in, then wait for text that only appears after a successful login
            $browser->visit('https://website.com/login')
                ->type('input.usernameField', env('USERNAME'))
                ->type('input.passwordField', env('PASSWORD'))
                ->click('#login')
                ->waitForText('Orders');

            // Visit each user's page in turn and click through it
            foreach ($this->user_ids as $user_id) {
                $browser->visit('https://website.com/users/' . $user_id)
                    ->waitForText('This is protected page')
                    ->click('button .button-im-looking-4')
                    ->waitForText('Page after the button')
                    ->click('.another #button')
                    ->pause(4000);
            }
        });
    }
}

I’m using environment variables to store the username and password so that, if they are sensitive, you don’t have to check them in to version control. To find elements on the page, use CSS selectors and your browser’s devtools to target specific elements. We loop over a list of user IDs and visit a page for each one, building the URL dynamically from this data.
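If a credential is missing, the login step just types an empty string and the test fails later at waitForText, which can be confusing. Here is a minimal sketch of a helper that fails early instead. It is only an illustration: requireEnv and the SCRAPER_USERNAME name are hypothetical, not part of Laravel or Dusk, and it uses plain PHP getenv so it runs outside a Laravel app (inside a Dusk test you would probably keep calling env()). I picked a prefixed variable name because generic names like USERNAME are sometimes already defined by the operating system or shell.

```php
<?php

/**
 * Read a required environment variable or fail loudly.
 * Hypothetical helper for illustration; not part of Laravel or Dusk.
 */
function requireEnv(string $name): string
{
    // getenv() returns false when the variable is not set
    $value = getenv($name);

    if ($value === false || $value === '') {
        throw new RuntimeException("Missing environment variable: {$name}");
    }

    return $value;
}

// For demonstration only: normally this comes from your shell or .env file.
putenv('SCRAPER_USERNAME=demo-user');

// Hypothetical variable name, prefixed to avoid clashing with an OS-level USERNAME.
$username = requireEnv('SCRAPER_USERNAME');
```

With something like this in place, a missing value shows up as a clear exception naming the variable rather than as a timeout on the login page.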

Your tests run in the terminal with the php artisan dusk command. The fun really comes in when you see the browser perform the actions you specify. By default Laravel Dusk runs what’s called a headless browser, which you won’t be able to watch. To watch the browser perform actions, head to DuskTestCase.php, which our ScrapeTheWebTest inherits from. Once there, remove the --headless option:

/**
 * Create the RemoteWebDriver instance.
 *
 * @return \Facebook\WebDriver\Remote\RemoteWebDriver
 */
protected function driver()
{
    $options = (new ChromeOptions)->addArguments([
        '--disable-gpu',
        // '--headless', // commented out so the browser window is visible
    ]);

    return RemoteWebDriver::create(
        'http://localhost:9515', DesiredCapabilities::chrome()->setCapability(
            ChromeOptions::CAPABILITY, $options
        )
    );
}

With the headless option removed you can run the tests and watch the browser perform the actions that you specified! From inside the test you can use the full power of Laravel to create database records, trigger jobs, update data or anything else you can think of.
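Rather than commenting the flag in and out, one option is to toggle headless mode from an environment variable. The sketch below only builds the argument list, so it runs as plain PHP. The chromeArguments helper and the DUSK_HEADLESS variable name are my own assumptions, not something Dusk provides.

```php
<?php

/**
 * Build the Chrome argument list used in DuskTestCase::driver().
 * Hypothetical helper; the name and the DUSK_HEADLESS variable are assumptions.
 */
function chromeArguments(bool $headless): array
{
    $arguments = ['--disable-gpu'];

    if ($headless) {
        $arguments[] = '--headless';
    }

    return $arguments;
}

// Stay headless unless DUSK_HEADLESS is explicitly set to the string "false".
// getenv() returns boolean false when the variable is unset, which is not
// identical to the string 'false', so the default remains headless.
$headless = getenv('DUSK_HEADLESS') !== 'false';
$arguments = chromeArguments($headless);
```

In driver() you would then pass the result to the existing call, as in (new ChromeOptions)->addArguments($arguments), and run the tests with DUSK_HEADLESS=false whenever you want to watch the browser.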

For further reading, there is a great article that goes through the process of programmatically creating GitHub accounts, storing information from the web in a database and even integrating with the GitHub API to find users’ credentials.

Paul Redmond has some great slides about scraping the web with PHP.

This technology is very powerful. Don’t be a jerk.

Employbl

Employbl is a database of active candidates in the Bay Area. I generate leads for talent teams

