The Selenium WebDriver — Under the hood — Chapter 2

Anji Boddupally
Feb 18, 2023


Note: This article assumes the reader has a good understanding of the Java programming language.

In the previous article, we learned how to interact with a browser using the RESTful services defined in the WebDriver W3C specification.

Now we will automate the same process in Java. In other words, we will implement our own version of the WebDriver API to talk to the browser driver (ChromeDriver), which in turn talks to its browser (Chrome).

When we create an instance of any WebDriver implementation (ChromeDriver, FirefoxDriver, etc.), a browser window is launched. Internally, Selenium starts the corresponding driver executable and creates a session. To simulate this, we will start the driver executable on a dynamic port and then call the session API. There are two ways to start the driver executable in Java.

Method 1: Use Selenium's existing ChromeDriverService class, as below

// To start the driver executable
ChromeDriverService service = new ChromeDriverService.Builder()
        .usingDriverExecutable(new File(<path of the driver>))
        .usingPort(port)
        .build();
service.start();

// To stop the driver
service.stop();
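
The `port` value above has to come from somewhere. One common way to get a "dynamic" port is to ask the operating system for an unused one by binding to port 0. The sketch below is only an illustration: the `FreePort` class and its `find` method are hypothetical names, not part of Selenium or of the author's project. (Selenium's service builder also offers `usingAnyFreePort()`, which avoids the manual step.)

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

// Hypothetical helper, not part of Selenium or the article's project.
final class FreePort {
    private FreePort() {}

    /** Binds to port 0 so the OS picks an unused TCP port, then releases it. */
    static int find() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException("No free port available", e);
        }
    }
}
```

Note that the port is released before the driver binds to it, so another process could in principle claim it in between. For a local experiment this is usually acceptable.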

Method 2: Write our own method that runs the driver using the Apache Commons Exec APIs, as below

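// To start the driver executable
private void startDriver(String driverPath, String... args) {
    CommandLine cmd = new CommandLine(driverPath);
    cmd.addArguments(args, false); // driver arguments, passed through without extra quoting
    DefaultExecuteResultHandler handler = new DefaultExecuteResultHandler();
    Executor executor = new DaemonExecutor();
    try {
        // Passing a result handler makes execute() asynchronous, so it returns immediately
        executor.execute(cmd, handler);
        // Give the driver a moment to start listening
        Thread.sleep(1000);
    } catch (IOException e) {
        e.printStackTrace();
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt status
    }
}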
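
The fixed one-second sleep in startDriver is a guess: the driver may need more or less time to come up. A possible alternative, sketched below under assumptions, is to poll the W3C `/status` endpoint until the driver answers. The `DriverReadiness` class and `waitUntilReady` method are hypothetical names, the sketch uses the JDK's `java.net.http` client (Java 11+) rather than the author's `GenericHttpClient`, and it treats any HTTP 200 reply as "ready", which is a simplification.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper: polls GET {baseUrl}/status until it returns HTTP 200.
final class DriverReadiness {
    private DriverReadiness() {}

    static boolean waitUntilReady(String baseUrl, Duration timeout) {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(1))
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/status"))
                .timeout(Duration.ofSeconds(2))
                .GET()
                .build();
        long deadline = System.nanoTime() + timeout.toNanos();
        while (System.nanoTime() < deadline) {
            try {
                HttpResponse<String> response =
                        client.send(request, HttpResponse.BodyHandlers.ofString());
                if (response.statusCode() == 200) {
                    return true; // assumption: a 200 reply means the driver is up
                }
                Thread.sleep(100); // answered, but not with 200: wait and retry
            } catch (IOException e) {
                // Nothing is listening yet (for example, connection refused): retry
                try {
                    Thread.sleep(100);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false; // timed out
    }
}
```

A caller could replace the sleep with something like `waitUntilReady("http://localhost:" + port, Duration.ofSeconds(10))` and fail fast when it returns false.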


// To stop the driver, we call the '/shutdown' API.
// Note: '/shutdown' is not part of the W3C specification; it is an endpoint exposed by ChromeDriver.
public void quit() {
    // With Method 1, service.stop() would be used instead
    try {
        GenericHttpClient gp = new GenericHttpClient(this.url);
        ApiResponseImpl<String> shutdownResponse = gp.postReqest("{ }", "/shutdown");
        if (shutdownResponse.getResponseCode() == 200) {
            System.out.println("Destroyed successfully!!");
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
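
Shutting down the driver process is separate from ending the browser session. The W3C specification defines a Delete Session command, `DELETE /session/{session id}`, which a client would normally send before stopping the driver. The sketch below only builds that request with the JDK's `java.net.http` API. `SessionRequests` and `deleteSession` are hypothetical names, and whether the author's `GenericHttpClient` offers a DELETE method is not shown in the article.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper: builds the W3C "Delete Session" request.
final class SessionRequests {
    private SessionRequests() {}

    static HttpRequest deleteSession(String baseUrl, String sessionId) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/session/" + sessionId))
                .DELETE()
                .build();
    }
}
```

The request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`.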

Once the driver has started successfully, we call the '/session' endpoint to launch the browser. If the call succeeds, a browser window opens.

this.url = "http://localhost:" + this.port;
GenericHttpClient gp = new GenericHttpClient(this.url);
// Not sending any capabilities
String body = "{\n" + " \"capabilities\":{}\n" + "}";
ApiResponseImpl<String> createSessionResponse = gp.postReqest(body, "/session");
if (createSessionResponse.getResponseCode() == 200) {
    System.out.println("Session created");
    // Map the JSON response onto a POJO and keep the session id for later calls
    CreateSessionResponse createSesResObj = convertJsonToPojo(
            createSessionResponse.getResponseAsString(), CreateSessionResponse.class);
    this.sessionId = createSesResObj.getValue().getSessionId();
    System.setProperty("com.anji.wb.base.url", url);
} else {
    throw new Exception(
            "Session not created, check the logs:\n " + createSessionResponse.getResponseAsString());
}
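
The request above sends an empty `capabilities` object, so the driver falls back to its defaults. As a rough illustration of what a non-empty body can look like, the sketch below builds a payload that asks for Chrome in headless mode via the W3C `alwaysMatch` field and ChromeDriver's `goog:chromeOptions` capability. `NewSessionBody` and `headlessChrome` are hypothetical names, and the chosen options are just an example, not something the author's project uses.

```java
// Hypothetical helper: builds an example "New Session" request body.
final class NewSessionBody {
    private NewSessionBody() {}

    static String headlessChrome() {
        return "{"
                + "\"capabilities\":{"
                +   "\"alwaysMatch\":{"
                +     "\"browserName\":\"chrome\","
                +     "\"goog:chromeOptions\":{\"args\":[\"--headless\"]}"
                +   "}"
                + "}"
                + "}";
    }
}
```

The resulting string could be passed as the `body` argument of the '/session' call in place of the empty capabilities object.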

Once the browser is open, we load a page using the '/url' API and locate elements on the page using the '/element' API. We also perform a few actions, such as 'click' and entering text into an input.

// To load a URL
@Override
public void loadUrl(String urlString) {
    String body = "{\"url\": \"%s\"}";
    try {
        CustomWebDriver.isValidURL(urlString);
        ApiResponseImpl<String> resp = gp.postReqest(String.format(body, urlString),
                String.format("/session/%s/url", this.sessionId));
        if (resp.getResponseCode() == 200) {
            System.out.println("URL launched successfully");
        } else {
            throw new Exception("URL is not launched, check the logs\n" + resp.getResponseAsString());
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}

// To find an element on the page
public CustomWebElement findElement(CustomBy by, String locator) {
    String body = String.format("{\"using\": \"%s\", \"value\": \"%s\"}", by.getStrategy(), locator);
    String elementId = null;
    try {
        ApiResponseImpl<String> resp = gp.postReqest(body, String.format("/session/%s/element", this.sessionId));
        if (resp.getResponseCode() == 200) {
            System.out.println("Element found successfully");
            System.out.println("Resp: " + resp.getResponseAsString());
            ElementResponse value = convertJsonToPojo(resp.getResponseAsString(), ElementResponse.class);
            // The response maps a single element-reference key to the element id
            for (String key : value.getValue().getElementId().keySet()) {
                elementId = (String) value.getValue().getElementId().get(key);
            }
            return new ChromeElement(this.sessionId, elementId);
        } else {
            throw new Exception("Element is not found or no such element\n" + resp.getResponseAsString());
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    // Reached only when the lookup failed
    return null;
}
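
The loop in findElement takes whatever key appears in the response. In the W3C specification that key is a fixed string, `element-6066-11e4-a52e-4f735466cecf`, so the id can also be looked up by name. The sketch below does this with a regular expression purely to stay dependency-free. `ElementIds` and `extract` are hypothetical names, and a real JSON parser would be the more robust choice.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls the element id out of a Find Element response.
final class ElementIds {
    /** Web element identifier key defined by the W3C WebDriver specification. */
    static final String W3C_ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf";

    private static final Pattern ELEMENT_ID = Pattern.compile(
            "\"" + Pattern.quote(W3C_ELEMENT_KEY) + "\"\\s*:\\s*\"([^\"]+)\"");

    private ElementIds() {}

    /** Returns the element id if the response contains the W3C element key. */
    static Optional<String> extract(String responseJson) {
        Matcher m = ELEMENT_ID.matcher(responseJson);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

Returning an `Optional` instead of `null` makes the "element not found" case explicit for the caller.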

// Click
// partUri is assumed to be "/session/{sessionId}/element/{elementId}"
@Override
public void click() {
    String body = "{ }";
    try {
        ApiResponseImpl<String> resp = gp.postReqest(body, partUri + "/click");
        if (resp.getResponseCode() == 200) {
            System.out.println("command executed successfully");
        } else {
            throw new Exception("command failed to execute, check the logs:\n" + resp.getResponseAsString());
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}

// Send keys
@Override
public void sendText(String text) {
    // Note: the text is inserted as-is, so quotes or backslashes in it would produce invalid JSON
    String body = String.format("{\"text\": \"%s\"}", text);
    try {
        ApiResponseImpl<String> resp = gp.postReqest(body, partUri + "/value");
        if (resp.getResponseCode() == 200) {
            System.out.println("command executed successfully");
        } else {
            throw new Exception("command failed to execute, check the logs:\n" + resp.getResponseAsString());
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
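
The request bodies above are built with `String.format`, which works for simple values but yields invalid JSON as soon as a locator or input text contains a double quote, a backslash, or a newline. A minimal sketch of an escaping helper follows. `JsonStrings` and `quote` are hypothetical names, and a JSON library would normally handle this instead.

```java
// Hypothetical helper: turns a Java string into a quoted JSON string literal.
final class JsonStrings {
    private JsonStrings() {}

    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters must be written as \\uXXXX escapes
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }
}
```

With such a helper, the send-keys body could be built as `"{\"text\": " + JsonStrings.quote(text) + "}"`, and the same approach applies to the locator value in findElement.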

To run the end-to-end flow locally with the custom implementation above, clone this project and run the test AnjiWDTest.
Note: This project depends on the anji-lytweight-rest-api-framework project, which provides APIs for interacting with any RESTful service.
