Modifying Expectations: Progress Report

Ekta Mishra
Published in Code for Cause
Jul 7, 2020 · 4 min read

Halfway through my journey as an Outreachy intern with the OpenRefine organization

As we all know, before the contribution period we get a list of organizations and projects, and we are free to pick whichever ones we like. That is the most delightful thing about open-source programs like Outreachy, GSoC, etc.

So, when the choice came into my hands, I started ranking my preferences according to the skills I already had and those I was keen to learn. I had a look at nearly every organization. There were about 20–25 of them, so spending a little time reading about each one and seeing what kind of products they make wasn’t a big task. Exploring an organization, in my sense, doesn’t mean contributing to it or setting up its products, not at all. It just means getting the gist of what kind of products and projects it has, so it will not take a lot of your time.

For instance, if you love learning about networks and there is an organization that works on serverless network systems, then working with that organization will make your journey delightful.

The same applies when you choose a project based on the skills you already have or the ones you are interested in. I chose OpenRefine's Java-based project, “Implement more constraint checks in OpenRefine’s Wikidata extension”.

So, the project was to add support for new Wikidata constraint checks. OpenRefine already has a number of constraints implemented. The project scope also involves improving the existing functionality of those constraints.

I started by implementing some simple constraints in the 1st phase, in line with the timeline from my final Outreachy proposal. It went smoothly: I added support for 3 new constraints and, along with that, improved some basic things in the existing constraints. As we moved forward with the existing way of implementing constraints, my mentor and I found that we needed to tweak the architecture of the Scrutinizer classes, and especially the testing of the scrutinizers, to improve the readability, understandability, and performance of the constraint checks.

So we decided that, before adding support for more constraint checks, we should improve the current architecture a bit. Earlier, the test classes depended on MockConstraintFetcher to fetch the required parameters and other data. In reality, it wasn’t fetching anything: the class simply returned hardcoded data. Therefore, a better approach was required. We decided to use mocks inside the test classes to supply the parameters and other data, and we chose the Mockito library for that.

I wasn’t familiar with the Mockito testing framework, but I studied it through a tutorial series and took help from my mentor whenever required. After this, we gave the testing architecture a completely new look, which in my sense is more understandable and powerful. Now we can add as many different test cases to the test files as we want without depending on any other file to provide the data for them. This is what unit testing should be: testing a single unit (i.e. a class in Java) shouldn’t depend on the behavior of other units.
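To give a rough idea of what "each test supplies its own data" means, here is a minimal sketch. All the names in it (`ConstraintFetcher`, `AllowedValuesChecker`, the `"one-of"` id) are hypothetical and much simpler than OpenRefine's real classes. To keep it runnable without extra dependencies, the stub is a plain lambda; with Mockito, the same stub would be created with `mock(ConstraintFetcher.class)` and configured with `when(...).thenReturn(...)`.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the component that looks up constraint data for a property.
interface ConstraintFetcher {
    // Returns the parameters of the given constraint on the property (empty map if absent).
    Map<String, List<String>> getConstraintParameters(String propertyId, String constraintId);
}

// Hypothetical unit under test: flags values that are not in the allowed set.
class AllowedValuesChecker {
    static final String ONE_OF_CONSTRAINT = "one-of"; // illustrative id, not a real Wikidata id

    private final ConstraintFetcher fetcher;

    AllowedValuesChecker(ConstraintFetcher fetcher) {
        this.fetcher = fetcher;
    }

    boolean isViolation(String propertyId, String value) {
        List<String> allowed =
                fetcher.getConstraintParameters(propertyId, ONE_OF_CONSTRAINT).get("allowed");
        // No "allowed" parameter means the constraint does not apply to this property.
        return allowed != null && !allowed.contains(value);
    }
}

class AllowedValuesCheckerTest {
    // Each test builds exactly the data it needs; no shared hardcoded fetcher class.
    static boolean violationWhenValueNotAllowed() {
        ConstraintFetcher stub = (pid, cid) -> Map.of("allowed", List.of("Q1", "Q2"));
        return new AllowedValuesChecker(stub).isViolation("P1", "Q3");
    }

    static boolean violationWhenConstraintAbsent() {
        ConstraintFetcher stub = (pid, cid) -> Map.of();
        return new AllowedValuesChecker(stub).isViolation("P1", "Q3");
    }

    public static void main(String[] args) {
        System.out.println("value not allowed -> violation: " + violationWhenValueNotAllowed());
        System.out.println("constraint absent -> violation: " + violationWhenConstraintAbsent());
    }
}
```

Because the stub lives inside the test method, adding a new scenario only means writing a new method with different data; nothing else has to change.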

Thereafter, we moved on to the implementation of the Scrutinizer classes. Each of them fetches the constraint parameters and checks whether a particular property has the constraint applied to it. Here too, we are moving the fetching and parsing code for each constraint into the Scrutinizer class itself, by adding an inner class to each Scrutinizer that defines its constraint.
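The idea of an inner class that defines the constraint can be sketched roughly as below. This is only an illustration under my own assumptions: `ConstraintStatement`, `SimpleFormatScrutinizer`, and the `"format"` and `"regex"` ids are made-up, simplified names, not the actual OpenRefine code.

```java
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical, simplified constraint statement: a constraint type plus raw qualifier values.
class ConstraintStatement {
    final String constraintType;
    final Map<String, List<String>> qualifiers;

    ConstraintStatement(String constraintType, Map<String, List<String>> qualifiers) {
        this.constraintType = constraintType;
        this.qualifiers = qualifiers;
    }
}

// Hypothetical scrutinizer that checks string values against a format (regex) constraint.
class SimpleFormatScrutinizer {
    static final String FORMAT_CONSTRAINT = "format"; // illustrative ids
    static final String REGEX_QUALIFIER = "regex";

    // Inner class: the parsing of the constraint's parameters lives next to the check itself.
    class FormatConstraint {
        final Pattern pattern; // null when the statement carries no regex

        FormatConstraint(ConstraintStatement statement) {
            List<String> regexes =
                    statement.qualifiers.getOrDefault(REGEX_QUALIFIER, List.of());
            this.pattern = regexes.isEmpty() ? null : Pattern.compile(regexes.get(0));
        }
    }

    private final List<ConstraintStatement> statementsOnProperty;

    SimpleFormatScrutinizer(List<ConstraintStatement> statementsOnProperty) {
        this.statementsOnProperty = statementsOnProperty;
    }

    // A value violates the constraint when it does not fully match the regex.
    boolean isViolation(String value) {
        for (ConstraintStatement statement : statementsOnProperty) {
            if (!FORMAT_CONSTRAINT.equals(statement.constraintType)) {
                continue; // some other constraint; not this scrutinizer's job
            }
            FormatConstraint constraint = new FormatConstraint(statement);
            if (constraint.pattern != null && !constraint.pattern.matcher(value).matches()) {
                return true;
            }
        }
        return false;
    }
}

class SimpleFormatScrutinizerDemo {
    public static void main(String[] args) {
        ConstraintStatement fourDigits =
                new ConstraintStatement("format", Map.of("regex", List.of("[0-9]{4}")));
        SimpleFormatScrutinizer scrutinizer = new SimpleFormatScrutinizer(List.of(fourDigits));
        System.out.println("2020 violates: " + scrutinizer.isViolation("2020"));
        System.out.println("20x0 violates: " + scrutinizer.isViolation("20x0"));
    }
}
```

Keeping the parsing in the inner class means the scrutinizer itself only has to ask "does this value satisfy the constraint?", and a test can hand it any statement it likes.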

So, this is how we shifted the goal of our original project in order to improve its end results. With the refactoring of the architecture done, we are now back on track. I have started implementing and adding support for the new Wikidata constraint checks again, and the new architecture is making the workflow much easier and smoother for me.

Progress so far: https://github.com/OpenRefine/OpenRefine/pulls/darecoder1999

Also, I have passed the 1st phase evaluation successfully and received the initial phase payment. I am done with the midpoint feedback too and am waiting for its results. That was an overview of my journey so far, halfway through, as an Outreachy intern. There is more learning and development yet to come. :)

Ekta Mishra

Software Engineer @PhonePe | Former @RedHat'21 & Outreachy’20 intern @OpenRefine | Google Code-In’19 Mentor @JBoss | Teaching Assistant @Coding Blocks