CSR Tale #4: Yak Shaving with Michael Ekstrand
Our final CSRTale for 2017 comes from Michael Ekstrand, an assistant professor in the Department of Computer Science at Boise State University, where he co-leads the People and Information Research Team (PIReT) with Dr. Sole Pera. His research focusses on human-computer interaction and recommender systems.
In Michael’s words:
Yak shaving [1,2] is an established concept in programming mythology. You’re at the zoo, shaving a yak, because through a chain of disturbing causality it’s a prerequisite task to fixing the hole in your living room wall.
But sometimes, your yak razor is dull. Fixing that involves another five steps, so you’re at your cousin’s farm shaving an alpaca so you can shave the yak so you can fix the wall. (Overanalyzing this metaphor is likely to lead to pain.)
My paper with Michael Ludwig, Dependency Injection with Static Analysis and Context-Aware Policy, is the result of an alpaca shave. We didn’t want to write a dependency injector. My adviser didn’t want us to write a dependency injector. We tried very hard not to write a dependency injector. But at the end of the day, we wrote a new dependency injection container for Java so we could do better HCI and machine learning research. And we published it in the academic literature.
The origins of this project started with my first conference trip as a Ph.D student. One of the presenters was comparing their new recommender algorithm to a well-established approach on a standard data set. My adviser pointed out to me that we could tune the established approach to produce better accuracy than the authors were claiming for their new one. I thought this situation was not good, and with my adviser came up with something to do about it: develop state-of-the-art, open-source reference implementations of classical recommender algorithms and evaluation techniques, with documented tunings for common data sets, to reduce the difficulty of comparing against best-practices baseline algorithms so researchers have less excuse for comparing against poorly-tuned implementations of unknown quality.
Thus LensKit, our recommender toolkit and the large yak-shave that ended up driving my dissertation, was born. We published a conference paper (in RecSys 2011) on the toolkit itself, and it drove most of my later papers, my dissertation, and quite a few other papers and theses from our research group and others. I’ve continued to work on it, now with my students, and it makes a lot of our current research possible. Quite a few of its features and design decisions have been driven by the need to make something possible for some new research paper.
But that’s just the first layer. We wanted LensKit to be thoroughly reconfigurable: not only to implement the state of the art, but to allow the algorithms to be configured to reproduce a wide range of prior work. Early on, we settled on dependency injection, combined with aggressive decoupling of individual algorithm components, as the most feasible means enabling this in Java. But we also wanted to be able to automate some of the difficult work of integrating machine learning models into web applications: allow developers to annotate components such as statistical models as ‘pre-buildable’, and have the system automatically pre-build all possible components to share between different web request handlers.
No existing software made this easy. We could not ask Guice or PicoContainer to build us a plan, let us modify it to extract the pre-buildable components, and then instantiate that plan. We tried both of them, and got LensKit mostly working, but it was full of ugly kludges and didn’t support some of the configuration possibilities we required.
So I went to the markerboard with Michael Ludwig, an undergrad research assistant in our group and one of the best software engineers I’ve had the privilege of working with, to brainstorm what we needed to make LensKit do everything we wanted it to do. We realized that we could describe our requirements in terms of graph manipulations, and that a graph was exactly the structure we needed to analyze which components were precomputable.
We defined the core configuration requirements in terms of these graph operations, Michael wrote acceptance tests and a first implementation in a couple of weeks, and LensKit was soon working better than ever. Several revisions later, we have an injection container that passes the JSR 330 compatibility tests, exposes an object graph for analysis and visualization, has a configuration mechanism that is strictly more flexible than that provided by other common containers at the time, and does all of this in 6K lines of well-tested code. So that we can study human interaction with recommender systems and promote reproducible recommender systems research. Yakception.
We thought that our graph-based formalization of dependency injection might be an interesting research contribution, so we set to work trying to write up a paper, and consulted with Eric Van Wyk about where it might be publishable. Dependency injection, it turns out, is not studied very much in the academic literature on object-oriented programming. There are several pragmatic treatments in the commercial technical press, but little research on how to support it effectively or the tradeoffs of different approaches. Our formalization seemed like a promising starting point for such research, and we developed a proof that our object configuration system was strictly more expressive than the competition.
We sent the first version off to OOPSLA/SPLASH, and were rejected. It was a just reject; we didn’t understand one of the competing technologies as well as we thought, and this was our first attempt publishing in programming languages & OOP, so we didn’t know the language or key references as well as a native would. The reviewers also had several suggestions for restructuring the argument.
We revised and restructured for ECOOP, and were rejected again with good criticisms. It felt like we were getting closer, so we revised again for another SPLASH attempt, and again no dice. At this point we started getting more frustrated, because it felt like addressing reviewer comments wasn’t making a difference: we would fix something, and have the same thing critiqued again in the next round of reviews. The paper also had a very different structure and set of emphases from its original version, and we didn’t think that communicated our vision as well. I also needed to finish up my dissertation and do a job search, so we let it rest for a bit.
A year or so later, we decided to try again, but to give up on conferences and try for a journal. Journal of Object Technology looked like a perfect fit for this kind of work, so we revised and extended the paper. The journal format gave us space to present a more thorough treatment of both the graph model and our proof of expressiveness. We received a Major Revision decision, which included the most helpful negative review I have ever received. R1 saw the promise of what we were doing and gave us extensive suggestions to improve the paper. Ironically, or perhaps vindicatingly, this reviewer’s suggestions brought us back to roughly the same outline as we had in the original OOPSLA attempt, but with a more thorough treatment.
Lessons and Advice
- Try everything you can before building everything yourself.
- Test early and test often. Even for research code, writing good unit tests and acceptance tests saves a lot of time and stress worrying about whether your code is really right (or embarrassment later when you find it isn’t). It also let us know when the code was ready for use in the main project: the acceptance tests checked for correct execution of use cases most important to LensKit.
- When publishing outside your core area, expect early rejects. Also consider going for a journal unless you really want to visit the conference — Major Revision cycles are easier to navigate than retry-and-try-again.
- Ask people for help. Eric Van Wyk at UMN helped us think about publication venue possibilities.