Doing data from scratch — working with questionnaires

My main pastime these days is to hang around journalists and simply talk with them. What they offer is the way they perceive information, how they mentally collate it, and how they arrive at the questions that will move the story forward. For my part, I try to see how their ideas can be tracked down as data and what queries can be formulated that will either confirm our guesses or give us new insights.

It was, therefore, an unexpected turn of events when that process reversed itself last week. While we all share some conception of the trends in certain aspects of the median Greek citizen’s thinking, we decided that if we are to produce stories about it, we should have some verifiable information to base our analysis on. The answer was obvious: create our own questionnaire, have someone do the field research for us and bring back the results, and then process them ourselves.

Designing the questions was the most challenging part so far. My thinking on such research has been greatly shaped by the Pew Research Center’s Political Typology report. That helped me look at the issue from (what I consider to be) a data perspective: since we’ll have all that data, it’d be interesting to see what clusters form naturally, try to superimpose them on the current political landscape, and see what conclusions we can draw.

Doing that means we need to formulate the questions properly, so that they not only give us interesting data individually, but also allow statistical analysis to show correlations and produce clusterings that we can evaluate and write stories about. Again, Pew’s questionnaire proved helpful. We also took liberties with the wording of the questions so that they trigger more personal responses, since our main purpose is to understand the conflict between political correctness and actual belief.
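To make the idea of correlations and clusterings concrete, here is a minimal sketch in Python of the kind of analysis I have in mind. Everything in it is an assumption for illustration: the answers are invented, the 1-to-5 agreement scale is hypothetical, and the clustering is a plain k-means with a farthest-point starting guess, not necessarily the method we will end up using on the real results.

```python
from statistics import mean

# Invented data: one row per respondent, one column per question,
# answers on a hypothetical 1 (disagree) to 5 (agree) scale.
responses = [
    [5, 4, 1, 2],
    [4, 5, 2, 1],
    [5, 5, 1, 1],
    [1, 2, 5, 4],
    [2, 1, 4, 5],
    [1, 1, 5, 5],
]


def pearson(xs, ys):
    """Pearson correlation between the answers to two questions."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def squared_distance(a, b):
    """Squared Euclidean distance between two answer sheets."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, iterations=20):
    """Group respondents into k clusters; returns one label per row."""
    # Farthest-point start: begin with the first row, then repeatedly add
    # the row that is farthest from every centroid chosen so far.
    centroids = [list(rows[0])]
    while len(centroids) < k:
        farthest = max(
            rows,
            key=lambda r: min(squared_distance(r, c) for c in centroids),
        )
        centroids.append(list(farthest))

    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assign every respondent to the nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(row, centroids[c]))
            for row in rows
        ]
        # Move each centroid to the average answer sheet of its members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [mean(col) for col in zip(*members)]
    return labels


columns = list(zip(*responses))  # one tuple of answers per question
print("Q1 vs Q3 correlation:", pearson(columns[0], columns[2]))
print("cluster labels:", kmeans(responses, k=2))
```

With real results, the interesting part starts after this step: looking at what the respondents in each cluster have in common, and whether those groups resemble the existing political landscape at all.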

I think of this activity as “from scratch” data journalism. While finding the data after the idea pops up is the normal course of things, I haven’t seen many people describe workflows that center on producing (as opposed to consuming) data sources. Crowd-sourcing is of course similar, but in our case we are not merely enlisting the public’s help to clean up or produce a dataset; instead, we are extracting one from the public. In addition, our intention is to write stories based on the trends we discover, and not simply generate a report. To my mind, that puts us on the journalistic side of things.

I am curious to figure out how I can integrate techniques for extracting requirements into my workflow, in the same way that they are used in software development projects. I think they would be useful on two fronts: in this particular example of creating questionnaires, and in working with my colleagues to understand their needs and eventually discover and properly query data sources that can give us insights into a story.

When the field research is done and we get back the results, I’ll provide an update on how the whole scheme worked. But I think my next subject will be about agile in our newsroom.

