Flutter — Extract Web Page HTML Content From WebView

Janith Ganewatta
4 min read · Jun 3, 2020

While working on a Flutter app, I needed to extract some HTML elements from a web page loaded in a WebView. I couldn't find a guide on how to do it, so I worked out a way, and I hope it helps anyone looking into this. 🙂
This is useful when you integrate a third-party service, such as a payment gateway, into your app.

I used webview_flutter (https://pub.dev/packages/webview_flutter) to add the WebView to the Flutter app.

In our flow, we load a web page URL into the WebView; the website processes the user's input and redirects to a success page. We don't own the website.
On the final redirected page, I wanted to extract some values from the HTML.

I will first post the code and then explain it.

import 'package:flutter/material.dart';
import 'package:webview_flutter/webview_flutter.dart';

class WebViewExample extends StatefulWidget {
  @override
  _WebViewExampleState createState() => _WebViewExampleState();
}

class _WebViewExampleState extends State<WebViewExample> {
  // Reference to webview controller
  WebViewController _controller;

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(
        title: Text('Flutter Web View Example'),
      ),
      body: Container(
        child: WebView(
          initialUrl: 'https://yourwebsite.com',
          javascriptMode: JavascriptMode.unrestricted,
          onWebViewCreated: (WebViewController webViewController) {
            // Get reference to WebView controller to access it globally
            _controller = webViewController;
          },
          javascriptChannels: <JavascriptChannel>[
            // Set Javascript Channel to WebView
            _extractDataJSChannel(context),
          ].toSet(),
          onPageStarted: (String url) {
            print('Page started loading: $url');
          },
          onPageFinished: (String url) {
            print('Page finished loading: $url');
            // In the final result page we check the url to make sure it is the last page.
            if (url.contains('/finalresponse.html')) {
              _controller.evaluateJavascript("(function(){Flutter.postMessage(window.document.body.outerHTML)})();");
            }
          },
        ),
      ),
    );
  }

  JavascriptChannel _extractDataJSChannel(BuildContext context) {
    return JavascriptChannel(
      name: 'Flutter',
      onMessageReceived: (JavascriptMessage message) {
        String pageBody = message.message;
        print('page body: $pageBody');
      },
    );
  }
}

Here is how it works:

1. We create the Flutter WebView with JavaScript enabled.

child: WebView(
  initialUrl: 'https://yourwebsite.com',
  javascriptMode: JavascriptMode.unrestricted,
),

2. Then we create a method which returns our JavascriptChannel.

JavascriptChannel _extractDataJSChannel(BuildContext context) {
  return JavascriptChannel(
    name: 'Flutter',
    onMessageReceived: (JavascriptMessage message) {
      String pageBody = message.message;
      print('page body: $pageBody');
    },
  );
}

I named the channel “Flutter”, but you can change it to your app name or anything you like.
Here we set the second parameter, “onMessageReceived”, to a callback. I will explain it below.

3. Then we add the JavaScript channel to our WebView. This creates an object, named after the channel, in the page's JavaScript context, and it acts as a bridge between the WebView and the app.

javascriptChannels: <JavascriptChannel>[
  // Set Javascript Channel to WebView
  _extractDataJSChannel(context),
].toSet(),

4. Now we need a reference to the WebView's WebViewController.
We declare a WebViewController field to hold it and use the “onWebViewCreated” callback to assign it.

onWebViewCreated: (WebViewController webViewController) {
  // Get reference to WebView controller to access it globally
  _controller = webViewController;
},

5. In my case the website has 3 steps. The first page takes the user's input, the second page verifies and shows the data, and the final page shows the success message.
So I only need to extract data from the last page.
The “onPageFinished” callback tells me the URL of the page that has loaded.
From there I inject and run JavaScript that returns the body of the web page.
“onPageFinished” is called every time a page finishes loading. That is why we need to check the URL to make sure it is the page we want.

onPageFinished: (String url) {
  print('Page finished loading: $url');
  // In the final result page we check the url to make sure it is the last page.
  if (url.contains('/finalresponse.html')) {
    _controller.evaluateJavascript("(function(){Flutter.postMessage(window.document.body.outerHTML)})();");
  }
},

If your web page doesn't have several steps, you can remove the if statement and call _controller.evaluateJavascript directly.
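A plain substring check on the URL can also match pages you did not intend, for example when the path appears inside a query parameter on another host. Here is a minimal sketch of a stricter check using Dart's core Uri class. The helper name isFinalPage is hypothetical, and the host and path are just the placeholders used in this article.

```dart
/// Hypothetical helper: true only when [url] is an https URL on the
/// expected host whose path ends with the final page's file name.
/// The host and path below are this article's placeholders.
bool isFinalPage(String url) {
  final uri = Uri.tryParse(url);
  if (uri == null) return false;
  return uri.scheme == 'https' &&
      uri.host == 'yourwebsite.com' &&
      uri.path.endsWith('/finalresponse.html');
}

void main() {
  // The path matches on the expected host.
  print(isFinalPage('https://yourwebsite.com/pay/finalresponse.html?ok=1')); // true
  // The text only appears in a query parameter on another host.
  print(isFinalPage('https://other.example/?next=/finalresponse.html')); // false
}
```

Inside “onPageFinished” you could then call isFinalPage(url) in place of url.contains(...).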

The Javascript we use:

(function(){Flutter.postMessage(window.document.body.outerHTML)})();

“Flutter” is our channel name.
This function runs as soon as it is injected into the page. It extracts the HTML body of the page and passes it to our channel using the postMessage(msg) method.
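If you only need one value, you may not want to send the whole body across the channel. The sketch below builds a narrower script as a Dart string that posts only the text of a single element. The function name buildExtractScript and the selector #transaction-id are made up for illustration, so adjust them to the page you are working with.

```dart
import 'dart:convert';

/// Hypothetical helper: builds a script that posts the text content of the
/// first element matching [cssSelector] to the channel named [channelName].
/// An empty string is posted when no element matches.
String buildExtractScript(String channelName, String cssSelector) {
  // jsonEncode produces a quoted, escaped string literal that is safe to
  // embed in JavaScript source.
  final selector = jsonEncode(cssSelector);
  return '(function(){'
      'var el = document.querySelector($selector);'
      '$channelName.postMessage(el ? el.textContent : "");'
      '})();';
}

void main() {
  // '#transaction-id' is an assumed selector, not one from a real page.
  print(buildExtractScript('Flutter', '#transaction-id'));
}
```

The returned string can be passed to _controller.evaluateJavascript in the same way as the script above.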

Back in step 2, when we created the JavascriptChannel, we defined the “onMessageReceived” callback.
This is where the app receives the content sent by the “postMessage(msg)” method.
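The channel carries a single string, so structured data has to be serialized by the page side. As a rough sketch, assuming your injected script posts JSON.stringify({...}) instead of raw HTML, the callback could decode the message like this. The function name decodeChannelMessage and the field names are hypothetical.

```dart
import 'dart:convert';

/// Hypothetical helper: decodes a channel message that is assumed to be a
/// JSON object and returns its entries as strings.
Map<String, String> decodeChannelMessage(String message) {
  final decoded = jsonDecode(message);
  if (decoded is! Map) {
    throw FormatException('Expected a JSON object', message);
  }
  return decoded.map(
    (key, value) => MapEntry(key.toString(), value.toString()),
  );
}

void main() {
  // Example payload with made-up field names.
  final data = decodeChannelMessage('{"status":"success","amount":42}');
  print(data['status']); // success
  print(data['amount']); // 42
}
```

In “onMessageReceived” you would pass message.message to this function.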

So now you can parse the HTML and extract the values you need. 🙂
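As a small illustration of that last step, here is a sketch that pulls the text of one element out of the received HTML string using only Dart's core RegExp. The function name extractTextById and the element id transaction-id are assumptions for the example. A regular expression like this is fragile on real-world markup, so for anything beyond a simple page a proper HTML parser such as the html package on pub.dev is likely a better fit.

```dart
/// Hypothetical helper: returns the trimmed content of the first element
/// whose id attribute equals [id], or null when there is no match.
/// It only handles simple cases: a double-quoted id and no nested tags.
String? extractTextById(String html, String id) {
  final pattern = RegExp(
    '<[a-zA-Z][^>]*\\bid="${RegExp.escape(id)}"[^>]*>(.*?)</',
    dotAll: true,
  );
  final match = pattern.firstMatch(html);
  return match?.group(1)?.trim();
}

void main() {
  // Stand-in for the string received in onMessageReceived.
  const pageBody =
      '<body><p>Payment complete</p><span id="transaction-id"> TX-12345 </span></body>';
  print(extractTextById(pageBody, 'transaction-id')); // TX-12345
  print(extractTextById(pageBody, 'missing')); // null
}
```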

As soon as the WebView is created, we get the reference to its controller. The WebView then creates our JavascriptChannel in the page, and it is available by the time the page finishes loading. That is why we evaluate the JS from the “onPageFinished” callback using our controller.

For more on the browser's Window.postMessage(), see MDN:
https://developer.mozilla.org/en-US/docs/Web/API/Window/postMessage
