SPRITE Case Study #4: The Case of the Quantum Taters

My whole life, I trusted potatoes. Until now.

In keeping with my Hiberno-Australian heritage, two cultures which hold the fried potato in a great deal of esteem, I have traditionally regarded chips as both delicious and somewhat mundane.

Inspires no fear.

So, it pains me somewhat to have to point out a case where potatoes seem to [a] alter their mass depending on their state of observation, [b] remove and add live human beings to the space-time continuum, and [c] provoke extraordinary and unexplained behaviour in those who come in contact with them.

Untrustworthy tubers. Under my nose this whole time, and playing these malicious Star Trek pranks on us. The person who invented mashed potatoes was probably trying to get some kind of psychic revenge.

The paper in question is called “Red potato chips: segmentation cues can substantially decrease food intake”, by Geier, Wansink, and Rozin. It is very popular, with 75 citations in the 5 years since its publication.

APA website.

PubMed link.

I don’t know what these authors have done to annoy the potato. The theory actually seems quite pro-potato — in this experiment, they compare how many potato chips people eat when you dye them a different colour at regular intervals (it gives you an implicit record of how many you ate). You would think this was promoting less potato consumption, and hence quite a pro-tuber stance.

Maybe it was the red dye. Perhaps potatoes are virulent anti-communists, and the idea of every 7th one of them being made bright red in rigid ordered fashion was a step too far.

In any case, these starch-bastards proceed to maliciously warp reality.

Their rebellion starts with their description.

Potato chips simply do not weigh 11 grams. Most have a calorie density of about 5 calories per gram, which makes a 2g chip.

We can also demonstrate this with SPRITE: if these chips are 11 grams each, we can approximate the weight of chips that the hungriest people in the control group (who were given a normal packet of chips and left alone to rip into them) would have eaten.

This group: mean = 45.25 chips eaten; SD = 14.02; n=19. 100 iterations gives:

Have you ever sat down and eaten three quarters of a kilo of potato chips?

No, me neither. Not unless I’m eating quantum potatoes, that is.

Confusion sets in immediately afterwards, as the quantum potatoes start to do their thing.

Apparently, there are both n=19 and n=20 participants in the control group. This is unlikely to be due to exclusions, because the overall N (59 participants; n=19 control, n=21 and n=19 intervention) is already presented.

Again, SPRITE lets us dig further — at the bottom of the above, the text mentions the maximum eaten (49) by anyone in the 7-divider and 14-divider conditions. These groups have means of 20.25 (9.26) and 23.68 (7.79).

A simple modification to the SPRITE procedure lets us have a peek at potential values here. There are two ways to do this, but the difference is trivial:

  • Remove the effect of the outlier on the mean, propose a new variance, and jiggle until you find a workable solution; or
  • preserve the value of 49 in the potential sample, and run SPRITE as normal — basically amounts to ‘make me a sample according to the mean and SD, but include one value of 49 in it’. Slightly better solution as it allows you to change the shuffling method as required.
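The second option can be sketched in a few lines of Python. This is a rough illustration of the idea, not the SPRITE implementation used for the figures here. The function names, the 0 to 49 bounds, the seed, and the pairing of n=19 with the 23.68 (7.79) group are all assumptions made for the example.

```python
import math
import random

def sample_sd(values):
    """Sample SD (n - 1 denominator) of a list of integers."""
    n = len(values)
    total = sum(values)
    ss = sum(v * v for v in values) - total * total / n
    return math.sqrt(max(ss, 0.0) / (n - 1))

def sprite_with_fixed(mean, sd, n, fixed, lo, hi, max_steps=100_000, seed=1):
    """Look for n integers in [lo, hi] that contain every value in `fixed`
    and reproduce the reported mean and SD to two decimals.
    Returns a sorted list, or None if nothing is found."""
    rng = random.Random(seed)
    total = round(mean * n)
    if round(total / n, 2) != round(mean, 2):
        return None  # no whole-number total reproduces this mean
    k = n - len(fixed)
    # Start with the free values as equal as possible (smallest spread).
    base, extra = divmod(total - sum(fixed), k)
    free = [base + 1] * extra + [base] * (k - extra)
    if base < lo or base + (1 if extra else 0) > hi:
        return None
    for _ in range(max_steps):
        current = sample_sd(free + fixed)
        if round(current, 2) == round(sd, 2):
            return sorted(free + fixed)
        i, j = rng.sample(range(k), 2)
        if free[i] > free[j]:
            i, j = j, i  # now free[i] <= free[j]
        if current < sd:
            # SD too small: push the pair apart (sum is unchanged).
            if free[i] > lo and free[j] < hi:
                free[i] -= 1
                free[j] += 1
        elif free[j] - free[i] >= 2:
            # SD too large: pull the pair together (sum is unchanged).
            free[i] += 1
            free[j] -= 1
    return None

# Hypothetical call: one pinned value of 49, everything else free.
print(sprite_with_fixed(23.68, 7.79, 19, fixed=[49], lo=0, hi=49))
```

Because the pinned value never enters the shuffle, the shuffling rule can be swapped out without touching the constraint, which is the advantage mentioned above. Changing the seed gives a different candidate sample.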

Either way, here are two fairly unusual potential distributions which satisfy the sample parameters:

In other words, including that one hungry 49-chip-eating person in either group results in a truly strange pair of possible options for the intervention groups. Someone who didn’t have steak and eggs for breakfast?

The removal of this person (which might be totally justifiable, as we can get Grubbs’ test to flag it in some of these solutions) results in something truly ludicrous — for instance, in the example on the right above, including “possible group 2” in a one-way ANOVA with the other two groups returns a p-value of 10^-10.

(I hope I calculated that correctly… all the handy-dandy p-value calculators don’t work below 10^-4!)
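For anyone wanting to check a p-value this small without an online calculator: with exactly three groups the numerator of the F-test has 2 degrees of freedom, and the upper tail of an F(2, v) distribution has the closed form (1 + 2F/v)^(-v/2). The sketch below applies that to summary statistics. It uses the reported means and SDs from the text, not the outlier-free "possible group 2", and the assignment of n=21 and n=19 to the two intervention groups is an assumption. The function name is made up for the example.

```python
def anova_three_groups(groups):
    """One-way ANOVA from summary statistics for exactly three groups.
    groups: list of (mean, sd, n) tuples. Returns (F, p)."""
    if len(groups) != 3:
        raise ValueError("the closed-form p-value below needs exactly 3 groups")
    big_n = sum(n for _, _, n in groups)
    grand_mean = sum(m * n for m, _, n in groups) / big_n
    ss_between = sum(n * (m - grand_mean) ** 2 for m, _, n in groups)
    ss_within = sum((n - 1) * s ** 2 for _, s, n in groups)
    df_within = big_n - 3
    f_stat = (ss_between / 2) / (ss_within / df_within)
    # Upper tail of F(2, df_within); only valid for numerator df = 2.
    p_value = (1 + 2 * f_stat / df_within) ** (-df_within / 2)
    return f_stat, p_value

# Reported summary statistics; the n for each intervention group is assumed.
reported = [(45.25, 14.02, 19), (20.25, 9.26, 21), (23.68, 7.79, 19)]
f_stat, p_value = anova_three_groups(reported)
print(f"F = {f_stat:.2f}, p = {p_value:.1e}")
```

Swapping in the summary statistics of any candidate SPRITE sample gives the corresponding F and p without worrying about a calculator bottoming out.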

Usually, you only see p-values that small in…

…you guessed it…

Quantum physics.

Anyway, these malicious arch-capitalist quantum potatoes really get their druthers in the second part of the experiment, and start blinking people in and out of reality like fireflies.

I can’t screencap this, so I’ll have to lay the values out:

  • The overall N is given as 39.
  • The DFs of the ANOVA are compatible with N=38.
  • There are three groups reported with the following cell sizes: control (n=13), high-segmentation (n=14) and low-segmentation (n=12)
  • One person is excluded.
  • Later the high-segmentation group is given as n=11
  • … and the low-segmentation group is given as n=14
  • … and the sum of the high- and low-segmentation groups is n=23

In other words, the n of the two intervention groups is 26, 25 or 23. Even with total flexibility concerning where we put our excluded value, our potatoes are remixing reality like a bad club DJ.
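The bookkeeping is simple enough to lay out as arithmetic. The snippet below only restates the numbers listed above; the variable names are made up for the example.

```python
# Cell sizes as first reported.
control, high_seg, low_seg = 13, 14, 12

overall_n = control + high_seg + low_seg   # 39, matches the stated N
after_exclusion = overall_n - 1            # 38, matches the ANOVA DFs

# Three different totals for the two intervention groups.
first_report = high_seg + low_seg          # 14 + 12 = 26
later_report = 11 + 14                     # 25
stated_sum = 23                            # given directly in the paper

print(overall_n, after_exclusion)
print(sorted({first_report, later_report, stated_sum}, reverse=True))
```

Dropping one excluded participant can account for a difference of 1 between two of these totals, but not for three different values.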

SPRITE, again, allows us a closer look at this prospective data. The paper also states the overall consumption range was 6 to 51 chips. And all of it appears to be fine… the only problem is, at this point we don’t know the cell sizes.

(This is also why I can’t report GRIM errors here, as in: do any of the samples report means or SDs impossible for their sample sizes? The answer is maybe, because, as above, if the potatoes keep moving the participants around, we don’t know what cell sizes to match the means to.)
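Once a cell size is pinned down, the GRIM check itself is mechanical: a mean of whole-number chip counts must equal some integer total divided by n, after rounding. A minimal sketch follows, assuming integer data, means reported to two decimals, and Python's built-in rounding. The function name is made up, and the example loops over the two control-group sizes mentioned earlier for the first experiment.

```python
import math

def grim_consistent(mean, n, decimals=2):
    """True if some integer total, divided by n, rounds to the reported mean."""
    target = round(mean, decimals)
    nearest = math.floor(mean * n)
    # Check the few integer totals closest to mean * n.
    return any(round(total / n, decimals) == target
               for total in range(nearest - 1, nearest + 3))

# Try each candidate cell size against a reported mean.
for n in (19, 20):
    print(n, grim_consistent(45.25, n))
```

With the cell sizes in flux, the same loop can be run over every n a group has been given, which at least narrows down which combinations are arithmetically possible.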

There are some other concerning non-SPRITEy points to make here.

One is the magnitude of the effect sizes reported in general. They’re all stunningly large, the kind which give us serious pause for thought — normally, we only see this kind of difference in much more controlled and mechanistic experiments. Social scientists often start paying close attention to whether or not something is amiss when Cohen’s d is 1 to 1.5... these values are about twice that.

Also, those effect sizes can’t exist. Cohen’s d doesn’t apply to ANOVA (unless you’re unaccountably using it on two groups)… are these eta-squared values instead, perhaps? Or, say, presented as a klugey kind of contrast measure between the control group and the averaged intervention groups?

(Full disclosure: I missed this completely on first reading, and Nick pointed it out immediately.)
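To make the distinction concrete, here is a rough sketch of both measures computed from summary statistics: Cohen's d for one pair of groups, and eta-squared for the whole design. The numbers are the first experiment's reported means and SDs, and the pairing of n=21 and n=19 with the two intervention groups is an assumption. The function names are made up for the example.

```python
import math

def cohens_d(m1, s1, n1, m2, s2, n2):
    """Cohen's d for two groups, using the pooled SD."""
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var)

def eta_squared(groups):
    """Eta-squared (SS_between / SS_total) from (mean, sd, n) tuples."""
    big_n = sum(n for _, _, n in groups)
    grand_mean = sum(m * n for m, _, n in groups) / big_n
    ss_between = sum(n * (m - grand_mean) ** 2 for m, _, n in groups)
    ss_within = sum((n - 1) * s ** 2 for _, s, n in groups)
    return ss_between / (ss_between + ss_within)

control = (45.25, 14.02, 19)
seg_a = (20.25, 9.26, 21)   # n assumed
seg_b = (23.68, 7.79, 19)   # n assumed

print(f"d (control vs seg_a) = {cohens_d(*control, *seg_a):.2f}")
print(f"eta squared          = {eta_squared([control, seg_a, seg_b]):.2f}")
```

The two live on different scales (d is unbounded, eta-squared sits between 0 and 1), so comparing the reported figures against both is a quick way to guess which one was actually printed.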

All the F-values are off by a bit. Not much, but a little.

Some people were also initially concerned about how you wet-dye a potato chip and keep it intact/appetizing, but — if you have a minute — Richard Morey actually had a go at that one. Although I think in the end it was just an excuse to eat a whole packet of chips.


Verdict: at absolute minimum, very confusing. Far more questions than answers. Again.

And, finally: why go through all this? Because we were asked. This study is heavily cited and frequently taught, very much a current part of a public discussion around visual cues / food / behavioural modification etc.

But, it seems, quantum taters have their own ideas.

Tweedledees: Me, Nick, Jordan