CSR Tales

CSR Tale #7: The importance of following your curiosity

CSR Tale #7 comes from Prof. Yehia Elkhatib, an Assistant Professor at Lancaster University and a visiting professor at École de Technologie Supérieure, Montreal. Yehia works on distributed systems and cloud computing, and is the creator and chair of the international Cross-Cloud workshop series. In this tale, Yehia talks about the importance of following your curiosity, even if it means you have to spend time doing side projects.

I say this to my team members, and I think it is worth repeating every now and again: “8 of my top 10 cited papers are from side projects. Find what excites you, and do it well. I will support you.”

This is the story of one of my own side projects, which I started working on in the summer of 2013. By that time, I was a post-doc with a couple of years' experience under my belt, and I had worked on European and national projects. I might’ve been unlucky, but I found the projects that I worked on to be largely corporate-like enterprises with lots of politics and little exciting research. I was really eager to do something different, something more related to real technologies people use and not just thought experiments that would only ever see the inside of a few meeting rooms and then be shelved in an electronic bottomless pit.

I started to dabble with a number of ideas. I am fortunate enough to have in Gareth Tyson a former colleague who remained a close collaborator and friend even after leaving Lancaster for the big smog. We regularly catch up, and chat about different research ideas. For me, this is one of the first hurdles that helps me identify if an idea has something in it or not. My trusty sounding board, Gareth, is not just very knowledgeable, but is also brave enough to entertain unorthodox ideas and would tell me if something doesn’t hold water.

At that time, one of the tech articles on Hacker News caught my eye. It was about SPDY, a new protocol that Google was proposing to replace HTTP/1.1. I knew very little about how SPDY worked, but the article caught my attention because it reported an unexpectedly high improvement in page retrieval times. Those numbers stayed in my head for days, and motivated me to find out more. I read all the online reports on SPDY that I could find, but was not convinced, as no one could really recreate or challenge Google’s numbers. Eventually, I came across a blog piece that resonated with me. The author, like me, was not convinced and did his own investigation.

This is when I decided to see if I could recreate these numbers. I was in the middle of a busy project for which I was the technical lead. I discussed with my boss whether I could allocate some time to this side project, and he was happy for me to do so. We agreed that I would spend no more than a working day a week on it, which in the beginning seemed like a lot. However, as the research progressed I quickly started spending more time on the project. I would go home after work, spend the evening with my family, then go back to the office for a couple of hours or more into the late night. My body would probably not allow me to do that now!

SPDY was gradually being deployed by companies other than Google. My initial in-the-wild tests showed nothing like the results reported by Google. This encouraged me to dig deeper, and try to isolate different factors that are out of my control (as a client) such as the number of page elements, server setup parameters, and, more importantly, the prevalent web admin practice of domain sharding; i.e. distributing page resources across multiple servers.

I created a small testbed in my office using unused and borrowed equipment. I needed to write a lot of scripts to automate experiments and to extract the required data. At times this was tricky, but overall enjoyable. I also needed to dive deep into web page retrieval, an area where I had relatively little experience. I found the metrics commonly used by others to be unsuitable, as they are geared towards evaluating the speed of loading a webpage into the browser. This goes a little beyond what SPDY is, i.e. a network protocol. As such, I needed to come up with a new metric (Time on Wire) for measuring the speed of getting page contents to the browser, excluding the time needed by the browser to render these into what the user views and interacts with.
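As a rough illustration of the Time on Wire idea, the sketch below computes it from a simplified packet trace. It assumes the metric is the interval between the first request leaving the client and the last response packet arriving, which is one plausible reading of the description above rather than the exact definition used in the study. The `Packet` type and `time_on_wire` function are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Packet:
    """One captured packet (hypothetical, simplified trace record)."""
    timestamp: float  # seconds since the start of the capture
    outbound: bool    # True for client -> server, False for server -> client


def time_on_wire(packets: Iterable[Packet]) -> float:
    """Return the time between the first outbound packet and the last
    inbound packet. Browser rendering time never enters the calculation,
    because only network timestamps are used."""
    packets = list(packets)
    requests = [p.timestamp for p in packets if p.outbound]
    responses = [p.timestamp for p in packets if not p.outbound]
    if not requests or not responses:
        raise ValueError("trace needs at least one request and one response packet")
    first_request = min(requests)
    last_response = max(responses)
    if last_response < first_request:
        raise ValueError("no response packet arrives after the first request")
    return last_response - first_request


# Example trace: two requests and their responses for a single page load.
trace = [
    Packet(0.10, outbound=True),
    Packet(0.15, outbound=False),
    Packet(0.40, outbound=True),
    Packet(0.90, outbound=False),
]
print(f"Time on Wire: {time_on_wire(trace):.2f} s")
```

In practice the trace would come from a packet capture of a scripted page load, but how the capture is taken and parsed is left out of this sketch.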

Using this setup, I was able to control all web deployment factors and to determine the exact effect of delay, throughput, and packet loss on SPDY's performance. The results were truly interesting. SPDY was able to significantly reduce delay, as it makes multiple round trips unnecessary. However, SPDY falters in high bandwidth environments, as multiple HTTP pipes easily outperform it in such cases. The most fascinating result, which no one else had reported on, is driven by the high sensitivity SPDY has to packet loss. As it multiplexes all communications over one TCP connection, all transfers are affected by any loss that closes down the TCP window.
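The packet loss effect can be illustrated with a deliberately simple toy model. It assumes classic TCP behaviour in which a connection halves its congestion window on a loss, that all connections start with the same window, and that a single loss hits exactly one connection. These are simplifying assumptions made for this example, not the model or measurements from the study, and the function names are hypothetical.

```python
def aggregate_window_after_loss(connections: int, window_per_connection: float) -> float:
    """Total congestion window across all connections immediately after one
    loss event that halves the window of exactly one connection."""
    if connections < 1:
        raise ValueError("need at least one connection")
    if window_per_connection <= 0:
        raise ValueError("window must be positive")
    untouched = (connections - 1) * window_per_connection
    halved = window_per_connection / 2
    return untouched + halved


def fraction_retained(connections: int) -> float:
    """Share of the total sending capacity that survives a single loss."""
    window = 1.0  # arbitrary unit; the ratio does not depend on it
    total_before = connections * window
    return aggregate_window_after_loss(connections, window) / total_before


# One multiplexed connection versus several parallel ones
# (six is used purely as an illustrative number).
for n in (1, 6):
    print(f"{n} connection(s): {fraction_retained(n):.1%} of capacity retained")
```

Under these assumptions, a single multiplexed connection loses half of its sending capacity on one loss, while the same loss spread across several parallel connections costs only a small share of the total. This is the intuition behind the sensitivity described above, nothing more.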

I discussed the results with my co-authors, Gareth and Michael Welzl, a TCP expert I had collaborated with in the past. We wrote up the paper in no time, and submitted it to INFOCOM. It turns out that the rumours I had heard about INFOCOM valuing theoretical analysis above all else seemed to ring true. One reviewer summarised their decision as “The paper lacks theoretical analysis”, despite the study being an empirical one carried out both in the wild and in a controlled testbed. We also got the usual reviewer who clearly did not read the paper: “This study only extensively tests three web pages which might not be representative”, when we had tested the top 8 Alexa websites with SPDY deployments and also tested this to death in our testbed. This is the nature of peer review, and happens to all of us, or so I hear :D

After the typical frustrating day or two, we decided to submit to IFIP Networking, a conference I had never been to but was familiar with from papers I read during my PhD work. We received the wonderful “We are delighted to inform you…” email, which contained 3 accepts and 1 reject. The paper was well received, and fairly well cited, especially for a side project done during borrowed time using borrowed equipment. However, the INFOCOM rejection was not just a source of the usual disappointment associated with rejections, but also resulted in us being scooped by a great team who did very well-executed work on the same topic, and published it in NSDI. Their findings were in agreement with ours, although they used a different methodology.

All in all, I learned a few valuable lessons from this experience. First of all, stepping out of your wheelhouse is quite rewarding. I followed my nose, and I ended up really enjoying the experience of looking into a timely subject, investigating how folks from industry and academia approached it, and verifying such results myself through rigorous testing. In particular, I relished building my knowledge around a fairly unfamiliar area; it reminded me of the early days of my PhD studies (in a good way!). This, of course, did not come for free. I dedicated much of my spare time to getting this done, and this peaked towards submission deadlines. In hindsight, the time and effort it cost me were definitely worthwhile. Second, it is extremely valuable to have a reliable sounding board. Vocalising your ideas is already very beneficial, but having someone to pick your ideas apart and brainstorm through potential approaches has incalculable benefit. Finally, I learned to upload my manuscripts to arXiv upon their submission. This allows you to plant your flag in the sand, and also gives your work early exposure. It has also encouraged me to keep a keen eye on similar papers being uploaded to arXiv. By doing this, in fact, I later found a very strong and fruitful collaborator with whom I wrote joint papers, and found good post-docs to hire.

I would like to highlight a few things in Yehia’s excellent tale. In my opinion, one of the things that separates great researchers from good researchers is taste: the idea that certain projects are more exciting than others, and that you should be working on this rather than that. There is an infinite sea of ideas, and it is possible to spend your time working on a project that, even if fully successful, does not lead to exciting results. So following your nose and developing this research taste is very important, and is one of the key things to learn as a graduate student. Second, Yehia bemoans the state of reviewing, and this is something I’ve seen myself plenty of times. There is no accountability for reviewers in the system, so they get away with writing horrible reviews. This needs to change in the future if good science is to be encouraged.
