Writing about data analytics

A number of the big questions we face are in some way connected to data and data analytics. Not all, but many. We have seen some of this come to the surface over recent weeks. Little of what we do and few organisational practices go untouched by data. Data analytics now reach right into the structures in which we live. It struck me that in such a data-intensive environment increasing power rests in the hands of those in a position to mediate and manipulate the circulations of data. These new types of knowledge need critical scrutiny. Lots of work in critical data studies actively does this, but there seemed to me to be an opportunity to focus a little more on those who mediate the circulations of our data.

Starting from this observation, over the last couple of years I’ve been working on a book about data analytics. The recently completed book, The Data Gaze: Capitalism, Power and Perception, attempts to open up data analytics to understand how data-led approaches have spread out across the social world. Like so many aspects of our complex media environment, it has proven a tricky thing to grasp. The label ‘data analytics’ is a big umbrella term that incorporates lots of different types of companies and practices. The problem was where to start and how to build a picture of these powerful mediators. To do this, the book looks at how data are envisioned, and how these visions connect into data infrastructures and practices.

I wanted to give a sense of the broader shifts around data as well as the particular details. To begin I simply imagined what I might do if I was part of an organisation that was seeking to expand its use of data. Using a few search terms I created a sample of data analytics providers of different types. I started to build up an archive of information about the industry, with one finding then leading to new lines of inquiry. By starting with the marketing materials I was able to build an impression of the sector. There were some surprises, which I explain in the book, but what I found was that it was possible to create a picture of how the power of data and data analytics was being imagined. This approach allowed me to see exactly what type of services or ‘solutions’ were available. This exercise also made clear that there exists an extensive archive of materials about this industry and about data analytics. With that realisation, what started as a small side project mutated into something bigger.

Because I was looking at an archive that captured the changing relations between knowledge and analytical spaces, Michel Foucault’s The Birth of the Clinic became a source of inspiration. The way forward seemed to be to explore the analytical spaces and practices of data analysis. A small pot of funding was granted by my department to produce an archive of material on the Hadoop software project. Many of the analytics companies I’d been looking at used or provided variations on Hadoop, so it seemed a good place to start. This focal point produced a significant set of materials on what I have ended up calling the codified clinic (which is, perhaps unsurprisingly, very different to the clinical spaces Foucault described). I ended up with materials ranging from user guides, software update schedules and technical profiles through to user reflections, coding by-laws and even a merchandise shop selling branded vests. The above photo shows my attempt at categorising some of the documents gathered about the infrastructures: I tried to group the materials to see if I could build some sort of narrative out of these disparate bits. I did this bit on paper, as it seemed easier to look across this quite messy set of resources and plan the discussions in this material way. This archive is also populated with documents that might change or disappear, so the printouts gave me a more stable and permanent snapshot.

Alongside this, I also turned to some older sources on data mining and data warehousing. Together these sources gave some historical, technical and contextual details. They also led me to the role and practices of data analysts. These old texts gave all sorts of visual depictions of the roles that could be compared with the common infographic-type visuals that are more frequently used in recent descriptions. I then compared these with various contemporary accounts of the work of the data analyst (as well as the autobiographical accounts, I also found some of the training materials, career guides and professional programmes to be revealing sources). I discovered quite a bit of variety in this role that we might label the data analyst, so I began exploring the division of labour amongst data workers and how these roles were being demarcated. There was an interesting history to this, with the emergent roles being carved up as they became institutionalised. I found quite a few roles splintering in these new working relations, which Helen Kennedy has described as the ‘new data relations’, but in the book I explore this demarcation or grid of activities whilst focusing directly upon the data analyst and data engineer.

Having followed the materials, picking up and pursuing what I discovered, I ended up covering the visions of data analytics, the analytical spaces and infrastructures and, finally, the roles and practices involved. The book explores how these three things interweave in the data gaze. Given the different types of sources I used to explore this, a key issue I had was understanding the technical jargon and insider terminology that I encountered. This created some difficulties as I attempted to decipher the various accounts, descriptions and outlines that I covered. Some of the images and videos took similar translational work (take the above diagram for instance). I found it an interesting archive to work in, with some revealing dynamics. Some of this turned into findings about the formation of knowledge and the defence of expertise. One thing I explore in the book is the emergence of the idea that anyone, with the support of the software, can become a data analyst and what this then means for the status of the expert analyst. This tension became something I explored in more detail.

The materials I was working with told me about the technical side of data analytics whilst also revealing quite a bit about the way that knowledge forms and data are understood. Foucault gave me a starting point here, but I’ve ended up discovering a very different type of gaze to the one Foucault discussed. In the book I locate these differences in a changing clinical space, the rise of automation and the spread of the gaze beyond the qualified expert, among other shifts. It is the hyper-surveillant nature of a gaze that seeks an ever greater horizon, scope and depth of vision that I really try to explore. It seems to me that if we want to understand things like data-informed targeting or the way that data are used to shape our lives and behaviours, then the forms of knowledge behind those moves need to be examined quite closely.

The Data Gaze: Capitalism, Power and Perception is now available to pre-order (in paperback) either direct from the publisher or from Amazon or Wordery.