I have no knowledge of Instagram doing this, but one way it might be done is to listen to the audio, make a fingerprint of it (the same way Shazam does), and upload the fingerprint to the cloud. By “fingerprint” I mean running something like a Fast Fourier Transform on the data and uploading only summary information, not the raw audio. I think it is highly unlikely the app on your phone would be loaded up with English / French / Spanish phonetic models (too big / CPU intensive).
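To make the “fingerprint” idea concrete, here is a minimal sketch in Python. It is not Instagram's or Shazam's actual method (real fingerprinting is considerably more elaborate). It just shows the general shape: split the audio into frames, run an FFT on each, keep only the loudest frequency bin per frame, and hash that short summary. The names `fft`, `fingerprint`, `tone`, the frame size of 256, and the 8000 Hz sample rate are all assumptions made up for this illustration.

```python
import cmath
import hashlib
import math


def fft(samples):
    """Recursive radix-2 FFT. len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fingerprint(samples, frame_size=256):
    """Return (peak bin per frame, SHA-1 hex digest of those peaks).

    frame_size is an illustrative choice and must be a power of two.
    """
    peaks = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        spectrum = fft(samples[start:start + frame_size])
        # Ignore bin 0 (DC) and the mirrored upper half of the spectrum.
        magnitudes = [abs(c) for c in spectrum[1:frame_size // 2]]
        peaks.append(1 + magnitudes.index(max(magnitudes)))
    summary = ",".join(str(p) for p in peaks).encode("ascii")
    return peaks, hashlib.sha1(summary).hexdigest()


def tone(freq_hz, n_samples, sample_rate=8000):
    """Hypothetical helper: a pure sine tone standing in for microphone input."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n_samples)]


# 250 Hz at 8000 Hz sampling is exactly 8 cycles per 256-sample frame.
peaks, digest = fingerprint(tone(250, 1024))
print(peaks, digest)
```

The point is the size difference: 1024 samples go in, and only four small integers (or one 40-character hash) would need to be uploaded. Matching against a catalogue of known sounds would then happen server-side.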
I don’t think iOS apps can do audio processing in the background. They can record in the background, but you would see the red bar at the top of the screen while that is happening. Also, if you want to check for yourself, you could connect your phone to a Wi-Fi network that runs an SSL snooper (an intercepting proxy) and watch the traffic the app is sending to the cloud.
All of this is to say I think it is highly unlikely Instagram is doing this. It is much more likely, in my opinion, that other markers (like interest in other tech gadgets) triangulated the projector ad. That said, I wouldn’t be surprised to see this sort of thing happening now in other (more CPU-plentiful) environments like PC-based gaming.