The uncanny valley of video editing

Kevin Marks
3 min read · Jun 3, 2015

I’ve left my 10 years of family photos and video uploading to Google Photos over the last few days, as another backup is always a good idea.

I’d seen some of the automated edits before, as part of Google Plus, and been amused by glitches like the inverted snowscape

and assuming GPS from one device applies to photos from another camera,

but this added two new-to-me features. One was automatic image tagging and clustering, which decided punts were canoes and poodles were birds

(and that my sons’ prom photos looked like a wedding, which is more understandable).

The facial clustering is an attempt to be less creepy than before: previously Google Plus would encourage you to identify people in your pictures and then put your photos in their timelines. Now it just clusters by similar faces and doesn’t ask for names, which works pretty well, apart from the “glamorous and high cheekbones” cluster that grouped a statue of Akhenaten, a caricature of Jarvis Cocker, Conchita Wurst, Twiggy and Audrey Hepburn.

The thing I wasn’t expecting is what it did to the video files it uploaded. I should explain that as a proud father, one of the things I have done is video my sons’ piano and percussion recitals and roller hockey games. For the music recitals I clearly wanted to record their performances. For hockey, the motivation was more for Christopher to be able to review his play after each match so he could improve his technique.

If you give Google Photos a lot of video shot on the same day, it seems to make a little highlights reel with library music under it. This doesn’t really work for lots of hockey footage.

Now, an actual highlights reel that found the goals and good tackles would be a useful thing. I wonder if they could listen for cheering and roll back to get a goal:
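To make the cheering idea concrete, here is a rough sketch in Python. It is only my guess at one way this could work, not anything Google has described. It assumes the audio track has already been decoded into a list of mono samples. The function names, the "three times the median loudness" threshold and the ten-second rollback are all made-up illustrative choices.

```python
import math
from statistics import median


def window_rms(samples, sample_rate, window_s=1.0):
    """Return the RMS loudness of each consecutive window of audio."""
    size = max(1, int(sample_rate * window_s))
    levels = []
    for start in range(0, len(samples), size):
        chunk = samples[start:start + size]
        levels.append(math.sqrt(sum(x * x for x in chunk) / len(chunk)))
    return levels


def find_highlights(samples, sample_rate, window_s=1.0,
                    loudness_factor=3.0, rollback_s=10.0):
    """Guess highlight clips as (start_s, end_s) pairs.

    Assumption: a window counts as "cheering" when its RMS is more than
    loudness_factor times the median window RMS. Each clip starts
    rollback_s before the cheer begins, clamped to the start of the file,
    and ends when the loud stretch ends.
    """
    levels = window_rms(samples, sample_rate, window_s)
    if not levels:
        return []
    threshold = loudness_factor * median(levels)
    clips = []
    in_cheer = False
    for i, level in enumerate(levels):
        loud = level > threshold
        if loud and not in_cheer:
            # A new loud stretch: roll back to catch the play before it.
            cheer_start = i * window_s
            clips.append([max(0.0, cheer_start - rollback_s),
                          cheer_start + window_s])
        elif loud and in_cheer:
            # Still cheering: extend the current clip.
            clips[-1][1] = (i + 1) * window_s
        in_cheer = loud
    return [tuple(clip) for clip in clips]
```

Calling `find_highlights(samples, sample_rate)` on a game's audio would give a list of candidate clips to cut. A real rink is noisy in plenty of other ways (whistles, sticks, the puck hitting the boards), so loudness alone would probably need help from something that recognises what a cheer actually sounds like.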

The really odd thing was what it did to the music recitals. Here’s a piano playing montage:

and here’s one from percussion ensemble:

It feels like Google’s AI is mocking me.

Now I’m sure that there are lots of other parents who video their children’s public events (there’s usually a row of us with cameras and phones out). What if Google really could edit them well? What if they could combine footage from multiple people at the same event, or even combine the sound recordings with video recordings?

Perhaps that would be too intrusive, but by using audio as well as video I bet there is something more interesting to extract.
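As a sketch of how the audio could help line up two recordings of the same event, the Python below slides one loudness envelope against the other and picks the offset where they agree best (a brute-force cross-correlation). It assumes each recording has already been reduced to a list of loudness values, one per short time window. The function name and the `max_lag` parameter are my own inventions for illustration, and I have no idea whether Google does anything like this.

```python
def estimate_offset(envelope_a, envelope_b, max_lag):
    """Return the lag (in envelope frames) at which b best lines up with a.

    A positive result means recording b started that many frames after a;
    a negative result means b started first.
    """
    # Remove each envelope's mean so steady background noise does not
    # dominate the comparison.
    mean_a = sum(envelope_a) / len(envelope_a)
    mean_b = sum(envelope_b) / len(envelope_b)
    a = [x - mean_a for x in envelope_a]
    b = [x - mean_b for x in envelope_b]

    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        overlap = 0
        for i, value in enumerate(b):
            j = i + lag
            if 0 <= j < len(a):
                score += a[j] * value
                overlap += 1
        # Average over the overlapping frames so longer overlaps are not
        # favoured just for being longer.
        if overlap and score / overlap > best_score:
            best_lag, best_score = lag, score / overlap
    return best_lag
```

For example, if a second camera started recording three frames into the first one's recording, `estimate_offset(first, second, 5)` should come back as `3`, and the better-sounding audio track could then be shifted by that amount to sit under the other camera's picture.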
