Doing Code Review Solo & Writing Good Code — Tips from #SRCCON 2017

Jennifer A. Stark
6 min read · Oct 18, 2017

Cyanide and Happiness

A typical week for me can involve collecting and wrangling data, running statistics, creating visualizations, and writing articles or blog posts. I’m continually learning to code (mostly Python) with a combination of formal and informal approaches, and when I worked in the Computational Journalism Lab at the University of Maryland, I was the only coder in the office who was not the boss.

After two very different experiences where coding errors led to a lot of hassle, I got to thinking code review would be a good idea! The first, an error in syntax, involved a warning message about column:row selection. I had encountered the same warning in previous projects, corrected it, and found that the output was the same as before, so I assumed it was not important this time. My boss, Nick Diakopoulos, saw it by chance, asked me to fix it, and lo! The output was different! Thus followed a very stressful week where I had to work triple time to redo everything for a deadline. Ugh!

Another experience, this time an error in logic, involved code written by Nick the year before for a similar project. I just had to use his code for data collection. I read the code to understand what it did, but there was a section I didn’t understand, and I put that down to my inexperience: “I’m sure it’s fine”. After publication it was brought to my attention that the data was a little odd. Neither of us could figure it out, so since I still didn’t understand how that one section worked, I took the opportunity to ask him about it. And that was the error: a piece of code that worked perfectly for the study it was originally written for, but that was not appropriate for asynchronous data collection. Then followed collection of a fresh dataset, analysis, and a correction of the published article. Thankfully, the error had only introduced noise, so after fixing it the statistics revealed an even stronger effect.

Implementing regular code review would have mitigated these undesirable situations. Many of my software developer friends do code review as a matter of course, but it might not be the case in journalism and academia. So what does code review look like?

Types of Peer Code Review

  • Full-on review — slow: You go through the code with someone line by line. They will ask questions and offer suggestions to improve the readability or modularity of the code, make it run faster, or fix mistakes, etc. This can take hours.
  • Sanity check / “smell test” — quick: You go through the code with someone focusing on “does it work?” and “is it doing what you think it is doing?” According to Daniel Trielli, the Associated Press in DC does this kind of review which can take about 30 minutes. I know that the data visualisation team at The Economist (London office) also does this kind of review.

This is great, but code review may not be available to everyone all the time.

Barriers to Getting Code Reviewed

There can be many barriers to getting your code reviewed, such as:

  • You are the only coder
  • Code review is not “a thing” in your newsroom or lab
  • The news cycle does not permit you and others the time for code review

If this is you, you might be asking:

  • How can I check my code for errors?
  • Why should I need to? I already know how to science and how to code!

Let’s address the latter first: “Why should I need to?” A good coder should be able to write good working code all the time, right? Maybe. But I like to think of coding the same way we think about professional writing: whether you are a writer of articles, like a scientist or journalist, or a writer of fiction — or writer of something else — you will have an editor. Sometimes our overfamiliarity with the code before us prevents us from seeing errors in logic or leads us to ignore recurring warning messages. Other times we don’t ask for help or clarification for fear of looking silly, and cos, you know, “I’m sure it’s fine”. We should all review our code. Always. But if you code alone, how can you? To find out, I submitted a proposal to SRCCON for a session to discuss this very thing.

SRCCON!

The SRCCON session was great, with a mix of intermediate and well-weathered coders. We got into groups to share barriers, experiences, and strategies. Important discussions that I had not anticipated included how to deal with conflict during peer code review, and methods to facilitate code review (peer or solo) by writing code with a standardized style.

Code Review Strategies for Everyone

The following are approaches that were recommended during the SRCCON session that you can do by yourself.

  • “Rubber Ducking” or “Debug Ducking”: Have you ever grabbed someone to help you solve a problem, and while explaining the problem to them you figure it out? Well, this method replaces that person who just sat there listening to you with a rubber duck, which saves everyone time.
  • Write with someone in mind: This is deceptively simple. It gets you outside of your head and better able to see some of your blind spots if you write code in a way that person X can follow and use later. This will also encourage you to properly comment your code. Even if no-one else will read it, future you will appreciate it. I, for one, have come across errors while organising and commenting my code.
  • Read good source code: Reading good source code, such as the D3 source code (as opposed to code examples), can help you understand how the code actually works, and allow you to see quality syntax and layout (hopefully). Just as reading quality prose can help you write better prose, reading quality code can help you write better code.
  • Write code defensively to justify what you did: Though similar to writing with someone in mind, here you might imagine them questioning your decisions. “Why did you use that statistical test?”, “Why didn’t you control for X?”, “Why didn’t you atomise that function?”, “Why is that warning message there?” You might be surprised by how many things you do out of habit without critically thinking about it.
  • Linters & style guides: Linters highlight syntax errors, and style guides help you write standardised code blocks so it’s easier for you and others to review. Here are linters (not specific to Atom code editor) for a ton of languages. Airbnb also has a JavaScript Style Guide available. And here is an r/Python subreddit asking “which python linters to use.” Some packages do both style and syntax, so be sure to look around for what works best for you. If you have a favorite, let us know in the comments!
  • Writing tests as a way of understanding code: I’m not experienced in this, so if this is relevant to you, here is a detailed post on the subject that discusses how to write code that is testable, and why that is important.
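
Two of the strategies above, questioning a warning message rather than ignoring it and writing tests as a way of understanding code, can be sketched together in a few lines of Python. This is only an illustration under stated assumptions: the function `share_of_total` and both test names are hypothetical and do not come from the session. The only library used is Python's standard `warnings` module.

```python
import warnings


def share_of_total(counts):
    """Return each count as a fraction of the total (hypothetical example)."""
    total = sum(counts)
    if total == 0:
        # Warn instead of failing silently, so the odd input is visible.
        warnings.warn("total is zero; returning zeros", RuntimeWarning)
        return [0.0 for _ in counts]
    return [c / total for c in counts]


def test_shares_sum_to_one():
    # Writing down what we expect forces us to understand the function.
    shares = share_of_total([2, 3, 5])
    assert shares == [0.2, 0.3, 0.5]
    assert abs(sum(shares) - 1.0) < 1e-9


def test_zero_total_is_not_silent():
    # Promote warnings to errors inside this block so a warning
    # cannot be shrugged off with "I'm sure it's fine".
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        try:
            share_of_total([0, 0])
        except RuntimeWarning:
            return
    raise AssertionError("expected a RuntimeWarning")


if __name__ == "__main__":
    test_shares_sum_to_one()
    test_zero_total_is_not_silent()
    print("all checks passed")
```

Running the file executes both checks. Turning warnings into errors during testing is one possible design choice, not the only one; the point is that a recurring warning gets a deliberate answer instead of being ignored.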

Automated code review?

While I was reading around for additional tips, I came across automated code review:

I also came across Code Review Stack Exchange! (They really do have something for everything.) There may be more of these manual/automated options out there. If anyone has used any of these, and has opinions about them, let me know what you think!

Lastly, OpenNews and SRCCON!

OpenNews is the most welcoming organisation I have ever interacted with, and SRCCON, their tech conference, was no different. If any tech companies or conferences want to know how to encourage inclusivity, may I recommend OpenNews’ concise Code of Conduct.

I would also like to thank the NYT for my scholarship to attend. I could not have done this without their help! Thank you!

Jennifer A. Stark

Data science, data visualisations and data storytelling. Sometimes making tools. Information Visualization MPS, Neuroscience Ph.D.