Decisions in Big Data Analytics

Elaborating on the Davenport Q&A in Strategy+Business

“Looking Outward with Big Data: A Q&A with Tom Davenport” (March 31, 2014)

To understand Tom Davenport’s point about “looking outward” (beyond analytics itself) more deeply, we need to address the nested time scales and decisions involved in business analytics driven by Big Data (many terabytes to many petabytes).

1. Decisions about how to prepare data for analysis through constant checking, filtering, and improvement of data quality with respect to precision and accuracy (i.e., addressing sources of noise and contamination): These decisions occur on the shortest time scale for scientific decision making, and business experts typically won’t be involved in them. They can be automated to some degree, but human judgment will be required, if nothing else, to develop new algorithms and to improve them continuously based on learning about new data sources and new problems for which the data are relevant.

2. The relatively rapid decisions required to “create thousands of models per week,” which involve the hypothetico-deductive reasoning of quotidian science: This time scale is not as rapid as that of ensuring data quality and integrity, and these decisions depend on that quality and integrity. Scientists will lead this effort, but some help from business experts will be required from time to time. This can be an area of innovation to the extent that intelligent agents can take over the workload of mini-experiments (on comparable models) conducted at a very fast pace, though much slower than the rate of data flow (e.g., as in exploration of “unfalsified control” in adaptive control systems).

3. Decisions that constitute actual “use” of the data insofar as they translate the analyses into action: The rate-limiting factors in these decisions are organizational processes for collective intelligence and decision making in a value network (e.g., “business sphere”) rather than constraints of Big Data per se. The computational power that enables Big Data, however, may increase the number and range of people involved in decisions about the use of data. Presumably this is the case with P&G’s “decision cockpits.”

4. Decisions required to prepare unstructured data for analysis: These decisions evolve over the longest time scales. They are required to find meaning in unstructured data and to represent it in structured formats (“rows and columns”) for repeatable and reliable analyses. These decisions are supported by work that is arduous and often dialectically contentious. This work may require a novel kind of collaboration between scientists and business experts.

All these different kinds of decisions, and the associated time scales, must be addressed for business analytics (Big Data or not) to become scientific. The best science involves deep theory (if not meta-theory and dialectical tension) as well as hypothesis testing over time, including experimentation, in a diverse community of practice.

The source of theory (or at least initial theorizing) for business analytics may actually have to come from people who are not themselves scientists: people who have expertise about the relevant business offerings, the markets they intend to serve, and the individuals to whom they provide value. That would be an interesting development in the scientific community, if not in scientific epistemology itself.

Gary E. Riccio, Ph.D. (March 31, 2014)