Mixing movement and machine

A behind-the-scenes look at building A.I. experiments with dance legend Bill T. Jones

Bill and dancer Vinson Fraley experimenting with a prototype during a workshop session

Where we ended: Body, Movement, Language

The product of our collaboration is a collection of PoseNet and speech experiments titled Body, Movement, Language: A.I. Sketches with Bill T. Jones. They are all inspired by Bill’s work interweaving speech and movement in performance, and are the direct result of his and his company’s engagement with this nascent technology. We also captured the process of creating these experiments in a short film.

The behind-the-scenes film

Where we started: The evolution of a single experiment

For our first workshop, I created this collection of web experiments to showcase the wide range of interactions PoseNet can enable. At the time, I expected it to be a starting point from which our ideas would grow more complex and layered. Instead, over the course of four workshops, we watched as Bill systematically stripped down each experiment until it contained only what he needed to convey a concept.

The team discussing how to evolve the current prototype for the next workshop
Body Writer (speak to attach words to your body) and Audio Controller (manipulate sound with a single body point)
Text Trailer: use a body point to trail letters behind you (no person visible)
Christina Robson, dancer with Bill T. Jones/Arnie Zane company, testing the next iteration of Text Trailer
Vinson Fraley, dancer with Bill T. Jones/Arnie Zane company, improvising with Manifesto

Where you can start: PoseNet Sketchbook

An online collection of PoseNet experiments. Check out the repo on GitHub for installation and development instructions.
Two prototypes from the sketchbook: Basic and Movement Multiplier (grid mode)
  • PoseNet recognizes humans best in pedestrian positions; it struggles with unusual forms, like a dancer with her leg by her head.
  • It recognizes some points better than others: the nose, for example, tends to be far more consistent than an elbow or wrist, especially when the person is moving a lot.
  • Use multi-pose estimation whenever more than one person might be in the space. Single-pose estimation tries to make sense of multiple people as a single person.
  • PoseNet has no sense of depth, but you can approximate scale from the distance between two points. I like the distance between the eyes: because people tend to be watching the screen, the eyes are consistently recognized. As a person moves closer to the screen, the distance between their eyes grows in the image, so you can use that value to scale any element you want to respond to the person's proximity to the machine.
  • Smoothing is useful, but it only truly works when a single person is in the frame. Because PoseNet has no persistent knowledge of who is attached to each pose, it often returns poses in a different order from frame to frame.
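The eye-distance trick above can be sketched in a few lines. This is not code from the sketchbook, just a minimal illustration: the keypoint shape (`{ part, score, position: { x, y } }`) matches the TensorFlow.js PoseNet output, while the `minScore` threshold and the `baseline` distance are assumptions you would tune for your own setup.

```javascript
// Find a named keypoint in a PoseNet pose object.
function keypoint(pose, part) {
  return pose.keypoints.find((k) => k.part === part);
}

// Pixel distance between the eyes, or null if either eye was
// detected with low confidence (threshold is an assumption).
function eyeDistance(pose, minScore = 0.5) {
  const left = keypoint(pose, 'leftEye');
  const right = keypoint(pose, 'rightEye');
  if (!left || !right || left.score < minScore || right.score < minScore) {
    return null;
  }
  const dx = left.position.x - right.position.x;
  const dy = left.position.y - right.position.y;
  return Math.sqrt(dx * dx + dy * dy);
}

// Map eye distance to a scale factor relative to a baseline distance,
// e.g. one measured when the person stands at a comfortable distance.
// Falls back to 1 when the eyes are not reliably visible.
function proximityScale(pose, baseline = 60) {
  const d = eyeDistance(pose);
  return d === null ? 1 : d / baseline;
}
```

You might then multiply an on-screen element's size by `proximityScale(pose)` each frame, so it grows as the person approaches the camera.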
Bill on stage at New York Live Arts during our last workshop, warming up for his performance.



This publication showcases collaborations with artists, researchers, and engineers as part of Google’s Artists + Machine Intelligence program.
