It’s hard to buy an argument based on familiarity, as figuring out what button to press isn’t that hard, or at least not prohibitively difficult.
We’re all increasingly touch-interface literate and any learning curve is short – touch-based interfaces offer immediate feedback about the success or failure of an action being attempted.
Additionally, if you want to toe the familiarity line, I would argue that voice is extremely unfamiliar. The constrained method of voice input – that you have to say things in a certain way – coupled with the lag in success/failure feedback makes voice-based interfaces fundamentally difficult to learn.
If familiarity in this argument is linked to the simple action of speaking versus pressing a button, I would argue that this has it the wrong way around. Speaking is so familiar, so natural, that getting a computer to comprehend the many forms the spoken word can take actually forces speech (familiar) to become unnatural (unfamiliar) when interfacing with a computer.
Like I said upfront, specific applications of voice can work. But paradigm shift? Unlikely.