How it’s made: Holobooth

A virtual photo booth experience showcasing Flutter and Machine Learning

Landing screen for the Flutter Forward Holobooth web app. On the left, Dash is taking a picture inside a photo booth decorated with purple and blue hues and the Flutter logo. On the right is a button to get started.
Try out the Flutter Forward Holobooth at holobooth.flutter.dev

The Holobooth builds on the first version of the Photo Booth app from Google I/O 2021. Instead of taking photos of you and Dash or Sparky, Holobooth uses machine learning to control animations of Dash or Sparky using your facial expressions.

We’ll break down how our team collaborated with Google to create a more immersive and futuristic photo booth experience by tapping into the power of Google tools. We used Flutter and Firebase to build the Holobooth app. Web ML in JavaScript allowed us to take the experience to the next level with virtual, interactive, 3D avatars. Let’s dive into how we built it!

Detecting faces with TensorFlow.js

A man with a grey shirt and glasses sitting in a chair. On his face are a bunch of red dots that map onto his features. There is a high concentration of red dots around his eyes and around his mouth.
Features detected with the MediaPipe FaceMesh model

As the user moves around the camera view, the MediaPipe FaceMesh model (available via the TensorFlow.js Face Landmarks Detection package) tracks the exact coordinates of the user's facial features so that we can mirror them on Dash or Sparky. Based on the position of each feature, we can determine whether the user is in frame, whether their eyes or mouth are open, and more. For more details, you can dig into the face_geometry.dart file. While there isn't an official Dart package for TensorFlow.js yet, the Dart JS package allowed us to import the JavaScript library into a Flutter web app (see the tensorflow_models package folder for more details).

  FaceGeometry({
    required tf.Face face,
    required tf.Size size,
  }) : this._(
          rotation: FaceRotation(keypoints: face.keypoints),
          leftEye: LeftEyeGeometry(
            keypoints: face.keypoints,
            boundingBox: face.boundingBox,
          ),
          rightEye: RightEyeGeometry(
            keypoints: face.keypoints,
            boundingBox: face.boundingBox,
          ),
          mouth: MouthGeometry(
            keypoints: face.keypoints,
            boundingBox: face.boundingBox,
          ),
          distance: FaceDistance(
            boundingBox: face.boundingBox,
            imageSize: size,
          ),
        );

  const FaceGeometry._({
    required this.rotation,
    required this.mouth,
    required this.leftEye,
    required this.rightEye,
    required this.distance,
  });
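
To give a rough idea of the interop side, here is a minimal sketch of the package:js binding pattern. The real bindings live in the repo's tensorflow_models package, so the names and configuration below are illustrative rather than the actual API surface.

@JS('faceLandmarksDetection')
library face_landmarks_detection;

import 'package:js/js.dart';
import 'package:js/js_util.dart' as js_util;

// Binds the global faceLandmarksDetection.createDetector JavaScript function,
// which returns a Promise that resolves to a face detector.
@JS('createDetector')
external Object createDetector(String model, Object config);

Future<Object> loadDetector() =>
    // promiseToFuture converts the JS Promise into a Dart Future.
    js_util.promiseToFuture<Object>(
      createDetector(
        'MediaPipeFaceMesh', // assumed model identifier for illustration
        js_util.jsify({'runtime': 'tfjs'}),
      ),
    );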

Animating backgrounds and avatars with Rive and TensorFlow.js

On the left is a face that moves to the left, then right, up, down, blinks, then opens its mouth. Dash mimics the same movements as the face on the left moves.
Move your face to see the Rive model mimic your behavior

The avatars use Rive State Machines that allow us to control how an avatar behaves and looks. In the Rive State Machine, designers specify all of the inputs. Inputs are values that are controlled by your app. You can think of them as the contract between design and engineering teams. Your product’s code can change the values of the inputs at any time, and the State Machine reacts to those changes.

For Holobooth, we used inputs to control things like how wide the avatar's mouth is open. Using the features detected by the FaceMesh model, we map the user's facial coordinates to the corresponding inputs on our avatar models. A StateMachineController then feeds those input values to the state machine to determine how the avatar appears on screen.

class CharacterStateMachineController extends StateMachineController {
  CharacterStateMachineController(Artboard artboard)
      : super(
          artboard.animations.whereType<StateMachine>().firstWhere(
            (stateMachine) => stateMachine.name == 'State Machine 1',
          ),
        );

  // The rest of the controller looks up the state machine inputs (mouth,
  // eyes, rotation, and so on) that the app updates on every frame.
}

For example, the avatar models have a property that measures the openness of the mouth, on a scale from 0 (fully closed) to 1 (fully open). If the user closes their mouth within the camera view, the app computes the corresponding value and feeds it into the avatar model, so the avatar's mouth closes on screen as well.
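
As a rough illustration of that wiring (not the production code), here is how a detected openness value could be pushed into a Rive number input using the rive package; the 'mouthOpen' input name is an assumption for this sketch.

import 'package:rive/rive.dart';

class AvatarMouthController {
  AvatarMouthController(Artboard artboard) {
    final controller =
        StateMachineController.fromArtboard(artboard, 'State Machine 1')!;
    artboard.addController(controller);
    // 'mouthOpen' is a hypothetical input name used here for illustration.
    _mouthOpen = controller.findInput<double>('mouthOpen');
  }

  SMIInput<double>? _mouthOpen;

  // Feeds the FaceMesh-derived openness (0 = closed, 1 = fully open) into the
  // Rive state machine so the avatar's mouth tracks the user's.
  void updateMouth(double openness) {
    _mouthOpen?.value = openness.clamp(0.0, 1.0).toDouble();
  }
}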

Capturing the dynamic photo with Firebase

User selects Dash as an avatar, then selects an animated background of outer space with planets and stars. A rocket moves diagonally up to the left behind Dash. The user selects a blue wizard hat with stars, a matching shirt, and a Flutter mug, then presses the camera button to record a dynamic photo. The final photo is displayed on a separate screen with buttons to share the photo, download it, or retake it.
Capturing the dynamic photo
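
Conceptually, the captured clip needs to end up somewhere the holocard page can load it from. Here is a minimal sketch of uploading it to Firebase Storage, assuming the clip has already been encoded to bytes on the client; the storage path, file name, and content type are assumptions for illustration.

import 'dart:typed_data';

import 'package:firebase_storage/firebase_storage.dart';

// Uploads the encoded clip and returns a URL the holocard page could use.
Future<String> uploadDynamicPhoto(Uint8List videoBytes) async {
  final ref = FirebaseStorage.instance
      .ref('videos/${DateTime.now().millisecondsSinceEpoch}.mp4');
  await ref.putData(
    videoBytes,
    SettableMetadata(contentType: 'video/mp4'),
  );
  return ref.getDownloadURL();
}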

To share your GIF directly to Twitter or Facebook, you can click the share button. You are then taken to the selected platform with a pre-populated post containing a photo of the first frame of your video. To see the full video, click on the link to your holocard — a web page that displays your video in full and a button directing visitors to try out Holobooth for themselves!
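
One common way to build a pre-populated post like that is a Twitter web intent URL carrying the message text and the holocard link. The sketch below (not necessarily the app's exact implementation) opens such an intent with url_launcher; the message text is a placeholder.

import 'package:url_launcher/url_launcher.dart';

// Opens Twitter's web intent with a pre-filled message linking to the holocard.
Future<void> shareToTwitter(Uri holocardUrl) {
  final intent = Uri.https('twitter.com', '/intent/tweet', {
    'text': 'Check out my Flutter holocard!',
    'url': holocardUrl.toString(),
  });
  return launchUrl(intent);
}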

Holocard page with the first frame of a user’s dynamic photo on the left. Dash is wearing an astronaut suit in front of a futuristic city. On the right is the Flutter Forward event logo with the text “Check out my Flutter holocard” and a button that says “Try now” where users can take their own photo in the Holobooth.
Example holocard

Challenges and how we addressed them

Working with TensorFlow.js was a first for us at Very Good Ventures. There are currently no official Flutter libraries for TensorFlow.js, so much of our early work on this project focused on experimenting with the available models to figure out which one fit our needs. Once we settled on the face landmarks detection model, we had to make sense of the data it outputs and map it onto the Rive animations. Here is an early exploration with face detection:

A man wearing a light blue shirt and red hoodie is moving his face around the screen. There is a blue box around his face and red dots mapping onto his face, with a high concentration of dots around his eyes and mouth. As he moves his face, red dots move along with his facial features.
Early exploration of face detection

The official Flutter camera plugin gave us a lot of functionality out of the box, but it currently doesn’t support streaming images on the web. For Holobooth, we forked the camera plugin to add this functionality. We hope that this is supported by the official plugin in the future.
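
To give a feel for what that fork enables, here is a minimal sketch using the standard camera API surface; on the web it only works with a build of the plugin that implements startImageStream, and the resolution choice is an assumption.

import 'package:camera/camera.dart';

// Streams frames from the first available camera so each one can be handed
// to the TensorFlow.js face detector.
Future<CameraController> streamFrames(
  void Function(CameraImage frame) onFrame,
) async {
  final cameras = await availableCameras();
  final controller = CameraController(
    cameras.first,
    ResolutionPreset.medium,
    enableAudio: false,
  );
  await controller.initialize();
  await controller.startImageStream(onFrame);
  return controller;
}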

Another challenge was optimizing for performance. Recording the screen can be an expensive operation because the app captures a lot of data, and we had to account for users accessing the app from different browsers and devices. We wanted to ensure the app stayed performant and provided a smooth experience no matter what device someone is using. When accessing Holobooth on desktop, video backgrounds are animated and reflect a landscape orientation. To optimize for mobile browsers, backgrounds are static and cropped to fit a portrait orientation. Since a mobile screen is smaller than a desktop one, we also reduced the resolution of image assets to cut the initial page load and the amount of data used by the device.
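
Here is a small sketch of that kind of graceful degradation, using a hypothetical widget and breakpoint rather than the app's actual implementation: on narrow layouts it falls back to a static, cropped image, while wider layouts get the animated background.

import 'package:flutter/material.dart';

class BoothBackground extends StatelessWidget {
  const BoothBackground({super.key, required this.animatedBackground});

  // The animated (video) background shown on wide, desktop-class layouts.
  final Widget animatedBackground;

  @override
  Widget build(BuildContext context) {
    return LayoutBuilder(
      builder: (context, constraints) {
        // 768px is an assumed breakpoint for illustration.
        final isMobileLayout = constraints.maxWidth < 768;
        return isMobileLayout
            // A static, portrait-cropped image keeps mobile browsers smooth.
            // The asset path is illustrative.
            ? Image.asset(
                'assets/backgrounds/space_portrait.png',
                fit: BoxFit.cover,
              )
            : animatedBackground;
      },
    );
  }
}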

For more details on how we addressed these challenges and more, you can check out the open source code. We hope that this can serve as inspiration for developers wanting to experiment with TensorFlow.js, Rive, and videos, or even those just looking to optimize their web apps.

Looking forward

Take a video in the Holobooth and share it with us on social media!


