GSoC’18 — Mozilla (autowebcompat)
I would like to thank Marco Castelluccio for being a great mentor; his constant support and guidance helped me complete the project.
Working with Mozilla was a lot of fun and a great learning opportunity. It was a cool summer, full of code, travel and coffee in the Himalayas!
Project Overview:
The aim of the project is to build a tool that detects cross-browser compatibility issues by running deep learning models on screenshots of websites taken in different browsers.
Link to project repository:
https://github.com/marco-c/autowebcompat/
Work Summary
Training
- Write training info to file:-
To analyze the results of training different models, the results are written to a file together with the model, machine and training info, including the training history.
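One way such a log could look is sketched below; the file name, the JSON-lines format and the field names are my assumptions, not the project's actual format:

```python
import json
import platform
from datetime import datetime

def save_training_info(path, model_name, history, extra=None):
    """Append one training run (model, machine info and per-epoch history)
    as a JSON line, so results from different models can be compared later."""
    record = {
        "timestamp": datetime.now().isoformat(),
        "model": model_name,
        "machine": {
            "platform": platform.platform(),
            "processor": platform.processor(),
        },
        # Keras History.history maps metric names to per-epoch values.
        "history": history,
    }
    if extra:
        record.update(extra)
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record

record = save_training_info(
    "train_info.jsonl",
    "vgg16",
    {"loss": [0.9, 0.5], "accuracy": [0.6, 0.8]},
)
```

Appending one JSON object per line keeps the file easy to parse run-by-run when comparing models.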
* Write training info to file
* Get GPU info from tensorflow
- Create notebook for Google Colab:-
A notebook was written so that training can be run on Google Colab GPUs.
* Notebook for Google Colab
- Add different networks for training (with pre-trained weights):-
* VGG19
* VGG16
* Resnet50
* Pretrained Weights
- Run different models:-
Results for the different models are present in this issue.
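The different pre-trained backbones listed above (VGG19, VGG16, ResNet50) could be selected through a small helper like this sketch; the helper name and the lazy import are my assumptions, not the project's actual code:

```python
# Hypothetical mapping from network names to Keras application class names.
NETWORKS = {
    "vgg16": "VGG16",
    "vgg19": "VGG19",
    "resnet50": "ResNet50",
}

def build_feature_extractor(name, input_shape=(224, 224, 3)):
    """Instantiate the named backbone with ImageNet weights and no
    classification head, so it can be reused as a feature extractor."""
    class_name = NETWORKS[name.lower()]  # raises KeyError for unknown names
    from tensorflow.keras import applications  # imported lazily
    cls = getattr(applications, class_name)
    return cls(weights="imagenet", include_top=False, input_shape=input_shape)
```

The lazy import keeps the mapping importable even on machines without TensorFlow installed.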
Labeling
- Label Images with bounding boxes:-
* Label Images
* Label Images
- Improving labeling experience:-
Some additional functionality was added to the labeling tool, which is used to mark bounding boxes around incompatibilities on the screenshots:
* Print help in the tool
* Change/verify previously labeled images
* Images of the same webpage come together
* Prefer images not labeled by anyone
A labeling guide with some pre-labeled images was also added, so that users know how the labeling should be done.
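The last two selection rules above could be combined roughly as in this sketch; the function and the data shapes are hypothetical, not the tool's actual code:

```python
def next_to_label(images, labeled_by_someone):
    """Pick the next screenshot to offer for labeling: prefer images that
    nobody has labeled yet, and sort by file name so that shots of the
    same webpage (which share a name prefix) come up together."""
    pool = [img for img in images if img not in labeled_by_someone]
    if not pool:
        pool = list(images)  # everything labeled: fall back to verification
    return sorted(pool)[0]

images = ["news_firefox.png", "blog_chrome.png", "news_chrome.png"]
choice = next_to_label(images, {"blog_chrome.png"})
```

Falling back to already-labeled images matches the change/verify workflow described above.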
Crawling
- Get full-page screenshot in crawler:-
Since we wanted to test the web compatibility of the full web page in two browsers, a full-page screenshot is captured, with a new naming convention for the files.
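A sketch of what this could look like; the naming convention here is hypothetical, and the full-page capture call shown is the Firefox-specific Selenium 4 API, which may differ from what the crawler actually uses:

```python
import re

def screenshot_name(url, browser):
    """Hypothetical naming convention: sanitized URL plus browser name,
    so the same page's shots in different browsers are easy to pair up."""
    safe = re.sub(r"[^A-Za-z0-9]+", "_", url).strip("_")
    return f"{safe}_{browser}.png"

def save_full_page_screenshot(driver, url, browser):
    """Capture beyond the viewport; geckodriver supports this natively
    through Selenium 4's get_full_page_screenshot_as_file (Firefox only)."""
    driver.get(url)
    driver.get_full_page_screenshot_as_file(screenshot_name(url, browser))
```

For browsers without native full-page capture, scrolling and stitching viewport screenshots is the usual workaround.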
* Full-page screenshot
* New naming convention
- Get location of all elements in crawler:-
The location of each element on the web page was mapped to its XPath.
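The XPath-indexing idea can be illustrated on a toy DOM; this is only a sketch of the indexing rule, since the real crawler derives XPaths for live elements through Selenium:

```python
def element_xpaths(children, prefix="/html/body"):
    """Build indexed XPaths for a toy DOM given as nested (tag, children)
    lists; same-tag siblings get 1-based indices, which is what makes an
    XPath a unique key for an element's location."""
    counts = {}
    paths = []
    for tag, subtree in children:
        counts[tag] = counts.get(tag, 0) + 1
        path = f"{prefix}/{tag}[{counts[tag]}]"
        paths.append(path)
        paths.extend(element_xpaths(subtree, path))
    return paths

body = [("div", [("a", []), ("a", [])]), ("div", [])]
paths = element_xpaths(body)
```

In the crawler, each such XPath would map to the element's on-screen coordinates and size.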
* Get all element locations
- Choose all elements to interact with:-
We wanted to interact with every possible element on the web page, up to a given depth, so we implemented a DFS-based approach, with modifications to improve its speed.
* Get all possible screenshots
* Improve speed using xpaths
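The DFS over page states can be sketched abstractly as follows; the function shape and the seen-state pruning are my assumptions (the actual speed improvement in the crawler is based on XPaths):

```python
def explore(get_elements, interact, state, max_depth, seen=None):
    """Depth-first exploration: interact with every element of the current
    state, recurse into the resulting state up to max_depth, and skip
    states that were already visited to avoid redundant work."""
    if seen is None:
        seen = set()
    order = []
    if max_depth == 0 or state in seen:
        return order
    seen.add(state)
    for element in get_elements(state):
        order.append((state, element))  # record the interaction taken
        new_state = interact(state, element)
        order.extend(explore(get_elements, interact, new_state, max_depth - 1, seen))
    return order

# Toy model: states are strings, interacting appends the element's name.
pages = {"home": ["a", "b"], "home/a": ["c"], "home/b": [], "home/a/c": []}
visits = explore(lambda s: pages[s], lambda s, e: f"{s}/{e}", "home", max_depth=2)
```

In the crawler, a "state" would be a page after a sequence of clicks, and a screenshot would be taken after each interaction.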
DOM based compatibility
- DOM-baseline technique:-
It was proposed to have a DOM-baseline technique, so that our machine learning models could be compared against models that decide compatibility on the basis of the DOM. We used XPERT (a tool written in Java) for this. Since the tool was last updated in 2014, some changes to its source code were needed to make it work with present-day browsers.
Finally, after seeing the results, we decided to use it as the DOM baseline, and the technique was rewritten in Python so that it fits in with the rest of the project.
* Dom-based basic technique
* Dom-based technique with layout (PR — open)
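The core idea of a DOM-based baseline can be sketched like this; it is a toy stand-in, not XPERT's actual algorithm, which also matches elements structurally and (in the second PR) compares layout:

```python
def dom_baseline(xpaths_first, xpaths_second):
    """Flag a page pair as incompatible when the sets of elements (here
    represented by their XPaths) captured in the two browsers differ."""
    only_first = sorted(set(xpaths_first) - set(xpaths_second))
    only_second = sorted(set(xpaths_second) - set(xpaths_first))
    return {
        "compatible": not only_first and not only_second,
        "only_in_first": only_first,
        "only_in_second": only_second,
    }
```

Reporting which elements are missing on each side makes the baseline's verdicts easy to inspect against the model's predictions.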
Link to all PRs :- pull requests
Link to all commits :- commits
Link to all issues opened :- issues
Learning Experience and Challenges Faced
GSoC was a great learning experience for me, as it came at a time when I had only recently started contributing to open source. I learned how to write quality code that is simple and short while also being readable and understandable.
One major challenge I faced was getting XPERT to run smoothly on the latest browsers, since major changes have happened in its dependencies since it was last updated. Understanding the complete codebase and getting it to run with changes to the source code was challenging and interesting at the same time.