Part II: What Would You Say… Ya Do Here?
In the last post, we looked at frequency distributions to get a sense of how many different activities were logged more than once, and of how rarely I was actually responding to pings.
Now let’s lay out some statistical theory that will help us determine whether these ping data are useful.
In statistics, a population is a complete set of items that share at least one property in common that is the subject of a statistical analysis. https://en.wikipedia.org/wiki/Statistical_population
As I mentioned in the first article in this series, I am looking to establish a baseline of what tasks I am doing throughout the day, week, month, etc., as a basis for improvement. So, mapping to the definition above, moments in time are the items, and the property they share is that an activity is taking place.
…[A] data sample is a set of data collected and/or selected from a statistical population by a defined procedure. https://en.wikipedia.org/wiki/Sample_(statistics)
The TagTime pings are the sample in this case. The selection procedure for the “ping time” is discussed by the authors of the TagTime introductory article, but suffice it to say that this is a random sample, and therefore unbiased. Except for…
Non-response bias occurs in statistical surveys if the answers of respondents differ from the potential answers of those who did not answer. https://en.wikipedia.org/wiki/Non-response_bias
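A small simulation can make this definition concrete. What follows is a minimal sketch in R under made-up assumptions: the two activity labels, the 40/60 split, and the response probabilities are all hypothetical and are not taken from my TagTime logs. It shows that when the chance of answering a ping depends on the activity, the share observed among answered pings drifts away from the true share.

```r
# Hypothetical illustration: every name and number here is invented,
# not drawn from the actual TagTime data.
set.seed(42)

n <- 10000
# True activity at each ping: 40% "work", 60% "other".
activity <- sample(c("work", "other"), n, replace = TRUE, prob = c(0.4, 0.6))

# Assumed response behavior: a ping is answered 50% of the time while
# working, but only 20% of the time otherwise.
p_respond <- ifelse(activity == "work", 0.5, 0.2)
responded <- runif(n) < p_respond

# Share of "work" across all pings vs. across answered pings only.
true_share     <- mean(activity == "work")
observed_share <- mean(activity[responded] == "work")

cat(sprintf("true share: %.3f  share among responses: %.3f\n",
            true_share, observed_share))
```

Under these assumptions the answered pings overstate "work": the expected share among responses is 0.2 / (0.2 + 0.12) = 0.625, against a true share of 0.4.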
As we saw in the last post, I have quite a few “non-responses.” In fact, 70% of my pings have gone unanswered (and that proportion has grown since I took this last snapshot of my TagTime logs, which might be the topic of yet another post):
> sum(pings$Activity == "nonresponse")   # number of unanswered pings
> mean(pings$Activity == "nonresponse")  # proportion of all pings that went unanswered
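To get a feel for how little a 70% non-response rate pins down, here is a rough sketch of worst-case bounds on an activity's true share of all pings. The function name `share_bounds` and the 50% share among answered pings are hypothetical, chosen only for illustration. The bounds make no assumption about the unanswered pings: the lower bound supposes none of them were the activity, the upper bound supposes all of them were.

```r
# Worst-case bounds on an activity's true share of all pings, given only
# its share among answered pings. Inputs below are hypothetical.
share_bounds <- function(share_among_responses, nonresponse_rate) {
  response_rate <- 1 - nonresponse_rate
  # Lower bound: no unanswered ping was this activity.
  lower <- share_among_responses * response_rate
  # Upper bound: every unanswered ping was this activity.
  upper <- lower + nonresponse_rate
  c(lower = lower, upper = upper)
}

share_bounds(share_among_responses = 0.5, nonresponse_rate = 0.7)
```

With these inputs, an activity that makes up half of the answered pings could account for anywhere between 15% and 85% of all pings, which is too wide a range to serve as a baseline.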
After talking with a former colleague and statistical Tyrannosaurus, I confirmed that even a much lower non-response rate would be grounds for dismissing the entire data set.
Still, it is useful to know when data cannot be used to draw conclusions; too often, what amounts to anecdotal evidence is touted as a valid sample (e.g., basically any online opinion poll).
I have some ideas for increasing my response rate, which I will attempt to add to the TagTime application(s) in the coming weeks.
So, please stay tuned!
Originally published at hoffmanc.com.