I’m not sure if I get you right, but you seem to be assuming a convergence of teleology, a common end, only because you can impose an end on the visitor.
That can only happen a) if there is an already existing congruence of values, or b) if you successfully impose end-specific values.
A) is the ideal pre-condition, as it allows attention to be directed by a Gestalt effect. The values and their corresponding anticipations are already there; you only have to hint at the outline, and the rest will be filled in by the visitor’s attention, drawn towards the center of pre-ordained gravity, the vanishing point of the narrative.
But this is the ideal situation; it might not be as common as we assume, or might not even be desirable.
B) You won’t easily impose the narrative rules and narrative/story-specific values. There’s the risk of dragging your story (or narrative progression) from one exposition to another (and consequently making it boring/didactic/predictable).
I’m assuming that you will have to rely on natural perceptual cues. This is what the blink of an eye, the natural cut, does. It is, after all, a kind of blink of the mind.