Great insights — I agree with them all, more or less.
A friend and I built www.presay.io as a simpler alternative, to help companies avoid picking the wrong kind of test.
In my experience, there is great value in limiting options and possibilities, both for askers and for respondents. We work with only two options per test (a preference test) and a fixed question from the asker (“Which do you prefer?”).
This way, tests are quick and easy to set up, and hopefully we can reduce some of the bias and confusion around choosing between different types of test.
We are aiming at copywriters and bloggers right now, but want to move more toward designers/UX by enabling tests of images, illustrations, logos, fonts and so forth.
Your point on whether it’s the editor's responsibility to decide on headlines etc. is interesting. Wouldn’t it still be useful to get some data on what the majority prefers? Like a mainstream thermometer to inform a final decision?
Thanks again for the awesome observations. This is one of those articles I keep returning to, to remind myself of the issues and benefits around IX testing.
Have a good day.