Monthly Learnings in Google Summer of Code 2020

Mapping my Journey through GSoC’20 — Pilot.

My First Month as a Student Software Developer!

Aryan Gupta
Jul 20, 2020

Mission Support System — Python Software Foundation

The point of this blog — I am already writing weekly blogs here. But they are supposed to be professional, concise and related to work. There is so much going on — GSoC, pandemic, personal… a lot! And I wish to jot it all down.

For context, I am Aryan Gupta, developing the MSS Software under Python Software Foundation through Google Summer of Code, 2020. To know more about my organisation and project, read here.

Originally this was going to be a weekly blog, but time passes so fast nowadays. Listlessly. Mindlessly.

It’s a good time to recount my last 2 months.

Community Bonding Period : (May 4 to June 1)

In this period I got to know my 3 mentors: Joern Ungermann, Reimar Bauer and Christian Rolf. They are really cool scientists!

MSS came into existence mainly because of their work. These scientists use MSS on experimental campaigns to determine where and how to fly on the German HALO Aircraft.

The KML feature of the map was ‘hacked’ up when they had to mark locations that they needed to avoid in flight. But it is still an experimental feature. Lots of things for me to do!

I was in complete awe when they were sharing their work with me. Amazing people, amazing lives.

‘Aryan, can you tell us briefly something about you?’
Umm… I’d rather not, haha

‘What’s your coding experience?’

I have been working with Python for about 6 months now. I have also done some normal programs.. in C, C++, Java.. but I have kinda forgotten about it..

Joern : No one, except Stroustrup, knows all of C++.

Which made me wonder if he really knew Stroustrup. Looking at his stature, he probably did.

I was very confused about my project, really nervous.

See, I had written my proposal in the same way we criticise each other:
I tried out Google Earth, the KML features it had, and compared it with the MSS Software. Then I filled up my proposal with “This is missing, that is missing”, completely missing that I would have to be the one to incorporate those features with my barely par coding and logical skills!

I compared KML features in freaking Google Earth (which I use to look at my home from ‘above’) with MSS Software, designed specifically for environmental scientists doing experiments on a plane! How very smart of me.

My project focuses on designing a new UI with better features to support KML Overlaying in MSS Software. I have no experience in designing; heck, I don’t even have much of a background in programming either!

Doubt, confusion, hesitation, procrastination.. all symptoms of imposter syndrome. Uff. ‘It’s a fluke, you just got lucky, there were better deserving candidates…’ on and on in my mind.

The best thing to do is accept. Accept the situation. And then get even; get even with the odds.

For sure, it’s luck. I am lucky to be one of the 1200-odd students selected all over the world. Maybe they made a mistake in selecting me. Maybe I might let them down. But I have got this opportunity, and I am not giving up without a fight.

With newfound vigour and motivation, I started reading documentation and going through the codebase (not really getting it, but .. trying).

Test Coverage is an essential component of software development.

“All code is guilty, until proven innocent” — Anonymous

Basically, we cannot take at face value that the code is working. Newer code might not be compatible with the older code, and without tests you have no way of checking that. You cannot possibly remember each line, the reason for writing it, and what it is supposed to do.

Sometimes, it takes one feature, but for me it usually takes one line of code to send my screen blinking red with errors!

That’s why we have tests. Tests automate the cases for which the code was written, and are run through frameworks like unittest or pytest.

So, the process is: you read and understand the code already written, you write your code, check if it’s functioning correctly (manually), you add comments and some documentation, and write tests to automatically verify that the code is working.

Some say it’s good practice to write tests first, and then code later, so you know exactly what the code has to do.
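To make this concrete, here is a minimal sketch of what such a test can look like with pytest. The helper `parse_coordinates` is a hypothetical function written only for this illustration, not something from the MSS codebase. pytest collects functions whose names start with `test_` and runs their plain `assert` statements.

```python
def parse_coordinates(text):
    """Turn a KML-style coordinate string into a list of (lon, lat) tuples.

    Hypothetical helper for illustration only; not part of MSS.
    """
    points = []
    for chunk in text.split():
        lon, lat = chunk.split(",")[:2]  # ignore the optional altitude
        points.append((float(lon), float(lat)))
    return points


def test_parse_single_point():
    assert parse_coordinates("8.5,47.3,0") == [(8.5, 47.3)]


def test_parse_several_points():
    text = "8.5,47.3,0 9.0,48.0,0"
    assert parse_coordinates(text) == [(8.5, 47.3), (9.0, 48.0)]


def test_empty_string_gives_no_points():
    assert parse_coordinates("") == []
```

Saved in a file such as `test_coordinates.py`, running `pytest` in that folder would pick up and run all three tests. Writing the tests first would simply mean writing the three `test_` functions before the helper exists.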

The Side Project

Joern discussed an interesting side project with me: ‘Setup some tests for the proper display of KML, which we currently do not have (e.g. based on fuzzy image comparisons)’.

For context, tests are usually based on the deterministic internal state of software. You are testing the features and functions that you have written.

The output you are receiving, you believe it to be the right answer.
But, is it?

In my case, code has already been written to plot out KML Files. You get beautiful images on the maps on inputting a KML File!

But how sure are you that these are the actual images that need to be plotted? Is something else being plotted as well? Are there misconfigurations?
How sure are you that the right colours are being displayed (since different machines run different operating systems, which use different backends for drawing)?

Truth is, you aren’t sure.
You are hoping that you are right, and praying that even if you are wrong, you haven’t messed up by a long mark.

Joern’s idea is to come up with a test which can save the current image as a .png file or so, and compare it fuzzily to a reference, and check for sufficient similarity. The check can either be manual or otherwise.
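A rough sketch of the "fuzzy" part, under some loud assumptions: images are represented here as flat lists of pixel values (0 to 255) instead of real .png files, the similarity measure is a root-mean-square difference, and the tolerance of 10 is an arbitrary number picked for the example. None of this is the actual MSS test; a real version would load the images with something like OpenCV or Matplotlib first.

```python
import math


def rms_difference(image_a, image_b):
    """Root-mean-square difference between two images.

    Each image is a flat list of pixel values in the range 0-255.
    """
    if len(image_a) != len(image_b):
        raise ValueError("images must have the same size")
    if not image_a:
        return 0.0
    squared = sum((a - b) ** 2 for a, b in zip(image_a, image_b))
    return math.sqrt(squared / len(image_a))


def images_similar(image_a, image_b, tolerance=10.0):
    """Fuzzy comparison: True if the images differ by at most `tolerance`."""
    return rms_difference(image_a, image_b) <= tolerance


reference = [0, 0, 255, 255, 128, 128]
slightly_off = [2, 0, 250, 255, 130, 126]   # e.g. small rendering noise
clearly_wrong = [255, 255, 0, 0, 128, 128]  # colours swapped

print(images_similar(reference, slightly_off))   # True
print(images_similar(reference, clearly_wrong))  # False
```

The point of the tolerance is that tiny differences between drawing backends should pass, while a plot with the wrong colours or shapes should fail.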

I love this idea! And it sounds like an exciting project for me to get back into OpenCV!

I spent the rest of the community period reading, researching, making notes on the following:

  • KML and OGC KML Specifications
  • Python libraries implementing KML (lxml, pykml & fastkml)
  • MSS Documentation, my Proposal
  • For my own sake, I went through some Python refreshers, pytest and more!

P.S. At the end of the community bonding period, I was given a branch to work on, a branch with my name!

(I’m not a nerd, I just love when I find my name attached to something good!)

Week 1 : ( June 1 to June 7 ) — Despair

Getting Started is always the toughest part.
For me, it proved quite easy. Or so I thought.

For the first week, I had two tasks:

  1. Adding Fastkml Support
  2. Solve Fastkml security issues caught by Bandit

I started with the 2nd task. It looked quite easy. Bandit is a security linter for Python, i.e. it finds common security issues in code. I did a Bandit run on Fastkml, which threw up 6 security issues, and I was able to solve 4 of them. The other 2 would require switching the whole fastkml library from its lxml dependency to defusedxml, and the required functions were not available in that library. Still, it took me two days, since I get very wary of newer code, or .. code in general.

(Fastkml is hosted on Github, and MSS is hosted on Bitbucket, so I confirmed with the mentors that my work with Bandit will still be approved as GSoC work.)

I took Wednesday off. I had completed half of my work of the first week. And I had already seen and estimated that the rest of the work would be quite easy.

From my research, I could see that I just had to set the KML parser through fastkml library; maybe 2 lines of code to be written. And voila! I would have ‘added’ fastkml support.

I was so wrong.

Backdrop: The original KML parsing was done through the pykml library, and the plotting on the map through Matplotlib. However, pykml hadn’t been updated for Python 3, and so they had stripped it down to the bare essentials: lxml.

lxml is a general-purpose library for processing XML and HTML. Since KML is an XML-based format, lxml can read it too. It parses the data and creates a tree structure of nodes containing the data.
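To show what "a tree structure of nodes" means, here is a small sketch. It uses Python's built-in `xml.etree.ElementTree` as a stand-in so that it runs without installing anything; `lxml.etree` offers a largely compatible interface. The KML snippet and the helper names `local_name` and `outline` are made up for this example.

```python
import xml.etree.ElementTree as ET

KML = """<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Airport</name>
      <Point><coordinates>8.5,47.3,0</coordinates></Point>
    </Placemark>
  </Document>
</kml>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that the parser puts on each tag."""
    return tag.rsplit("}", 1)[-1]


def outline(element, depth=0):
    """Return one indented line per node, depth first."""
    lines = ["  " * depth + local_name(element.tag)]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


# Bytes are passed in because the document declares its own encoding.
root = ET.fromstring(KML.encode("utf-8"))
print("\n".join(outline(root)))
```

Every element (kml, Document, Placemark, and so on) becomes a node, and the nesting in the file becomes the parent and child relationship in the tree.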

My plan was to switch it up with Fastkml. One library, One solution.

Why Fastkml? Fastkml, as advertised is supposed to be ‘fast’.

‘Fast refers to the time you spend to write and read KML files as well as the time you spend to get acquainted to the library or to create KML objects.’ — Fastkml.

That was one of the main reasons for selecting it. SimpleKML couldn’t parse data, pykml hadn’t been maintained and well.. fastkml says it’s fast.

Fastkml is built upon lxml. It uses lxml to create methods specifically related to KML, such as Placemarks, Geometry and Styles. I’ll be writing in detail about using Fastkml in a later post.

Roadblocks :

  • I had to deal with 3 libraries (lxml, defusedxml, fastkml), all interdependent, which was a sure hassle.
  • Switching from lxml to fastkml was difficult. I had to understand fastkml’s codebase. There weren’t many examples to take inspiration from. I had to experiment through hours of trial and error to make one line of progress.
  • I faced some difficulty working with Fastkml because of an encoding (Unicode vs. raw bytes) issue, which took me a lot of time to figure out, but I did.
  • My previous notion that I only had to switch 2 lines of code proved unfounded: the lxml API returns an object on which you call different methods, whereas Fastkml does a lot of stuff in abstraction and does not return any object. I had to make lots of tweaks, which made me feel like a fraud.

But, I was able to get through it, by giving the code enough time and thought. Weekends are supposed to be off, but I worked.

KML File plot in MSS Software, before I touched the software.

Before I started coding, everything was working. And then it was not. And the reason was me.

Week 2 : ( June 8 to June 15 ) — Hope

I didn’t talk to the mentors for 5 days straight.

What would I have said? That I have completely wrecked the code? I have no idea what I’m doing? Nope. First priority — damage control.

(Just so you know: no one can wreck the code at any time. The code in production is different, it’s protected. I had only messed up a small feature of a big piece of software, in the development branch made specifically for me. No big deal. But in times of crisis, it doesn’t feel like that, does it?)

Mentors were very empathetic.

‘It would be good, if you could upload your work regularly into a pull request, which we can review.’

‘…I would love to send a PR. But the fastkml support is causing me to change the whole kmloverlay file, and it’s not working completely.’

‘It is not necessary for the PR to be finished. We expect Work-in-progress, a WIP PR. See that more as a progress-report and an opportunity for us to give early feedback and tips and tricks to you.’

I sent the PR, and explained my difficulties.

I was feeling very low, simply because I had a perfectly working system in lxml , which I had changed on a whim to fastkml, and I could not get it to work. The second week had started and I wasn’t close to finishing my first week’s tasks.

Joern talked me through. He said he would have been surprised if I had completed the Migration in a single week. Told me that the work isn’t time rigid, just keep on doing. That was quite helpful, at the time.

‘I hope you are solving instead of introducing new issues with the migration. Prioritize implementing the improved UI with support for multiple KML Files rather than get stuck with the new library.’

I took the advice. Instead of getting lost in the labyrinth of features in Fastkml, I focused on getting the parsed KML to be displayed.

This is what I came up with, at the end of two weeks — And I was so proud of myself!

ididthis.png

I was able to plot a single point on the Map, using Fastkml implementation.

(The plotting is not wrong. The KML File coordinates really point to that Airport in water. Don’t ask why, I do not know.)

In Week #1, I had added Fastkml support to kml overlay. However, not all dependencies were working. I spent the week solving all the issues making sure that all KML data was being parsed and the right info was being displayed.

KML focuses on mainly 3 things :

  • Object (Document, Folder, Placemark)
  • Style
  • Geometry

and it was very tough to get them all working in sync. I ran through numerous KML Test samples to check the code, and made changes accordingly.
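Here is a sketch of how those three pieces relate to each other in a file. It again uses the standard library's `xml.etree.ElementTree` as a stand-in and not the Fastkml code from my branch; the sample KML and the `summarise` function are invented for illustration. For each Placemark (object) it finds the geometry type and looks up the line colour through the `styleUrl` reference (style).

```python
import xml.etree.ElementTree as ET

NS = {"kml": "http://www.opengis.net/kml/2.2"}
GEOMETRIES = ("Point", "LineString", "Polygon")

SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Style id="red"><LineStyle><color>ff0000ff</color></LineStyle></Style>
    <Folder>
      <Placemark>
        <name>Track</name>
        <styleUrl>#red</styleUrl>
        <LineString><coordinates>8,47,0 9,48,0</coordinates></LineString>
      </Placemark>
      <Placemark>
        <name>Base</name>
        <Point><coordinates>8.5,47.3,0</coordinates></Point>
      </Placemark>
    </Folder>
  </Document>
</kml>
"""


def summarise(kml_bytes):
    """List (name, geometry type, line colour) for every Placemark."""
    root = ET.fromstring(kml_bytes)
    # Map "#id" -> colour so that a Placemark's styleUrl can be resolved.
    colours = {}
    for style in root.iterfind(".//kml:Style", NS):
        colour = style.findtext("kml:LineStyle/kml:color", namespaces=NS)
        colours["#" + style.get("id", "")] = colour

    rows = []
    for placemark in root.iterfind(".//kml:Placemark", NS):
        name = placemark.findtext("kml:name", default="", namespaces=NS)
        geometry = next(
            (g for g in GEOMETRIES if placemark.find("kml:" + g, NS) is not None),
            None,
        )
        style_url = placemark.findtext("kml:styleUrl", default="", namespaces=NS)
        rows.append((name, geometry, colours.get(style_url)))
    return rows


for row in summarise(SAMPLE):
    print(row)
```

The second Placemark has no `styleUrl`, so its colour comes back as `None`. Cases like that (missing styles, nested folders, different geometry types) are exactly where getting everything "in sync" becomes fiddly.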

By June 15, I had brought the implementation using Fastkml to par with the current one.

P.S. I was surprised to get a mail from Christian Ledermann, creator of Fastkml Library, congratulating me on GSoC. He even added me as a collaborator! Beats me how he knew, though. Google Alerts, I guess :))

Discussions :

In our weekly chats, I discussed a lot of technical stuff, and along with it the following:

  • Knowing the Target Audience
    There are numerous features in KML for displaying elements, e.g. in Styles you can display Labels, Balloons, and additional features like bgcolor, text, description etc.
    From a Fastkml perspective, I can add all of them through predefined functions. But for software used by scientists, design matters less than ease of use. So, what should be my focus while working on my project?
  • Timeline & Flexibility
    With respect to my proposal, I am running a bit behind on my weekly progress. It is only because I am being quite careful in my implementation. I am regularly testing each feature. My coding speed is slow, but I am confident that I will catch up soon.

After a long discussion, it was decided that my timeline should be altered and converted to 5 goals. And the new timeline made sense. The mentors were very blunt in the discussion and gave me a lot to think about!

‘Most plans do not survive the contact with reality. Which isn’t an excuse for not planning at all, but one shouldn’t believe too much about projections into the future.’ — Joern

Week 3 : ( June 16 — June 22) — The Bug.

Discussions with my mentors are very fruitful! After a long talk, we decided that the most suitable next step is to add the feature of “Displaying Multiple KML Files” together.

I was glad to start early on this part, since it’s the major feature of my project. I was supposed to start working on the UI for adding Multiple KML Files.

And then Christian messaged.

‘Hi Aryan, I just tested your implementation with a KML file from the flightpath of our research aircraft and I got an error:

self.patches.append(self.map.plot(x, y, "-", zorder=10, **kwargs))
TypeError: plot() argument after ** must be a mapping, not list’

The KML Files that I had tested my implementation on averaged from 10 to 20 lines of KML, consisting of a few placemarks, present as points, a line, or a polygon.

Christian had tested my code with a KML File spanning over 7000 lines of coordinates.

When I saw the file, my first thought was “ Gee, no wonder my code broke” :(

I sat down, saw where the code was breaking. Turned out to be a data structure issue. Now I could clearly see that my code was of use on a larger scale. Not 5–10 coordinates, but 1000s!
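For anyone who hasn't met this error before, here is a tiny reproduction of the same kind of failure. The `plot` function below is a stand-in written for this example and is neither the MSS nor the Matplotlib function, and the variable names are made up. The idea is only that `**` unpacks keyword arguments and therefore needs a dictionary-like object, not a list.

```python
def plot(x, y, fmt="-", **kwargs):
    """Stand-in for a plotting call; just echoes the keyword arguments."""
    return kwargs


style_as_list = ["color", "red", "linewidth", 2]    # wrong shape for **
style_as_dict = {"color": "red", "linewidth": 2}    # what ** expects

try:
    plot([0, 1], [0, 1], "-", zorder=10, **style_as_list)
except TypeError as error:
    print("failed:", error)

print(plot([0, 1], [0, 1], "-", zorder=10, **style_as_dict))
```

With small test files a wrong data structure like this can go unnoticed if that code path is never reached, which is why testing with a big real-world file was so valuable.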

I rechecked my code, made more changes. Read the earlier code, made more changes. I went through it till I was sure the code would work correctly. Then I went for the test drive.

I ran Christian’s file through my code.

The orange- whitish plot
Same plot magnified 3 times

My perspective about my seemingly ‘small’ contribution changed. Significantly.

Working on the UI

I designed the new UI for the kml dock widget using Qt Designer (with PyQt). It makes GUI items quite easy to design and incorporate into code.

Original UI
New UI : Added elements, previous elements are kept for functionality of code.

My design focused less on looks and more on ease of access. I decided to keep the items ‘spacy’ and minimalistic.

Thankfully, my 2 PRs have been accepted, and I am working hard to finish my next WIP pull request :))

Week 4 : ( June 23 — June 29 ) — Making Progress

Sometimes, I get hyper. Not usually, but yeah sometimes.

I was researching PyQt so much online, and having so much fun, that I redesigned the UI again.

Clicking on the ‘Add Multiple KML Files’ would open a dialog box.

My thought process behind this change was that if someone wanted to add multiple KML Files, they could simply click the button. For users with single file usage, there should be no change in existing functionality.

Clicking on ‘Add Multiple KML Files’ would open a fully designed UI Dialog box with easy to use features!

The UI that wasn’t.

When I showed the design to my mentors, well … they wanted a single User Interface. I did not feel good at the time, but the mentors are experienced, and are themselves users of the software.

I didn’t lose heart. Removed the extra UI, integrated into the original UI. I also added a ListWidget to store the list of KML Files! There were still problems with the logic, but I worked at it, and well.. take a look :))

Displaying 5 different KML Files simultaneously!

My first month ended on this high! Finally, from scratch to current implementation to the proposed project. I had come far. And I had learnt a lot.

Still a long way to go!

(Currently, I am progressing with my project in the second month of GSoC. I will write about the next part of my journey soon!)

Thank you for Reading!

Aryan Gupta

Individuality must prevail, if not Originality. (Twitter: @thodakaafi)