Vibe Camera: AI creates an impression

Capturing vibes, not photons

Jenna Fizel
Apr 28, 2023

Part of a series on prototyping with generative AI
This prototype was inspired by an idea from Danny DeRuntz.

A few weeks ago, Danny was talking to me about his thoughts on generative AI images, and how they are really more of a vibe than a specific depiction of real things. Wouldn’t it be interesting if we could capture the world around us as a vibe, rather than as an accurate record of photons hitting a sensor? The ways we use photography are under threat from these image generators. But what if the whole definition and practice of capturing a moment, maybe especially a casual or personal one, could shift from striving for accuracy to something else?

I thought this was very interesting, even just as an idea. What if we captured our lives in ways that are closer to how our own memories work? I also remembered scrolling through the OpenAI API docs and noticing the DALL-E 2 endpoints, including an option to create variations of an image. So I wondered: how hard would it really be to make one of these cameras? I tried, and in fact it’s quite simple. I made one, and I find it delightful. I’ve been using this camera in my life: at conferences, at work, and at home when I got a tree for my birthday. The camera gives me feelings. I didn’t quite expect that. Prototypes like this one have only very recently become easy to build quickly, and even though I thought it was a good idea, I did not anticipate finding the experience of using it so moving.

[Image: comparison of a photo of a tree and a generated image of the same tree]
My new tree

[Image: comparison of a photo of a crochet creature and a generated image of the same creature]
A crochet mushroom creature

[Image: comparison of a photo of a Lego building and a generated image of the same Lego]
A fishing shack Lego

[Video demo of the camera app]

How it’s made

GitHub repo: https://github.com/jftesser/see-the-vibe

Obviously the most important part of the camera is actually translating a conventional image into a generated variation. This is done using the image variation function from OpenAI’s JavaScript library, called from within a Firebase Cloud Function that sits behind authentication. A little bit of infrastructure goes into making this setup secure and stable, including saving the returned base64-encoded image to a Storage bucket.

exports.getVariant = functions.https.onCall(async (data, context) => {
  // reject unauthenticated callers
  if (!context.auth) {
    throw new functions.https.HttpsError(
      "failed-precondition",
      "The function must be called while authenticated."
    );
  }
  // grab the base64-encoded blob from the frontend
  const base_blob = data;
  // get the variation from OpenAI
  const blob = await openai.getVariation(base_blob);
  if (blob) {
    // save the variation to a storage bucket and return a url
    const url = await uploadImage(blob, crypto.randomUUID());
    return url;
  }
  return null;
});
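The uploadImage helper isn’t shown above. Here’s a minimal sketch of what it could look like, assuming the Firebase Admin SDK and the project’s default Storage bucket — the details here are my guesses, not the repo’s actual code:

const { getStorage } = require("firebase-admin/storage");

// Sketch: decode the base64 blob, write it to the default bucket,
// and hand back a long-lived signed URL for the frontend to display.
async function uploadImage(b64, name) {
  const file = getStorage().bucket().file(`variations/${name}.png`);
  await file.save(Buffer.from(b64, "base64"), {
    metadata: { contentType: "image/png" },
  });
  const [url] = await file.getSignedUrl({
    action: "read",
    expires: "2100-01-01",
  });
  return url;
}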

It’s also important to properly format the image you’re sending, and if you’re taking that image from a webcam, you might need to do some processing. The frontend strips the metadata prefix off the base64-encoded image, and then in this function enough of it has to be added back, in a slightly different form, to satisfy the OpenAI client.

async getVariation(blob) {
  // convert the raw base64 string to a buffer
  const buff = Buffer.from(blob, "base64");
  // add enough metadata to make openai happy
  buff.name = "image.png";
  // request a single 512x512 variation, returned as base64 JSON
  const response = await openai.createImageVariation(
    buff,
    1,
    "512x512",
    "b64_json"
  );
  // pass back the new blob
  const image_url = response.data.data[0].b64_json;
  return image_url;
}
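On the frontend, the matching step is pulling that prefix off before calling the function. It might look something like this sketch, assuming the modular Firebase client SDK and a ref to the webcam component — takePhoto and webcamRef are my names for illustration, not the repo’s:

import { getFunctions, httpsCallable } from "firebase/functions";

// Sketch: grab a frame, strip the "data:image/png;base64," prefix,
// and send the raw base64 to the getVariant function above.
async function takePhoto(webcamRef) {
  const dataUrl = webcamRef.current.getScreenshot();
  const base64 = dataUrl.split(",")[1];
  const getVariant = httpsCallable(getFunctions(), "getVariant");
  const { data: url } = await getVariant(base64);
  return url; // URL of the generated variation, or null
}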

To capture the initial image, there are pretty simple ways to open and select webcams. I’m using the mediaDevices browser API, and I’ve combined that with an off-the-shelf React webcam component.

useEffect(() => {
  // list the available cameras so the user can pick one
  navigator.mediaDevices.enumerateDevices().then((devices) => {
    const videoDevices = devices.filter(
      (device) => device.kind === "videoinput"
    );
    setDeviceList(videoDevices);
  });
}, []); // run once on mount
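Wired together, the capture view might look roughly like this — a sketch, where react-webcam is my assumption for the off-the-shelf component:

import { useEffect, useRef, useState } from "react";
import Webcam from "react-webcam";

function CameraFeed() {
  const webcamRef = useRef(null);
  const [deviceList, setDeviceList] = useState([]);
  const [deviceId, setDeviceId] = useState(undefined);

  // ...device enumeration effect from above goes here...

  return (
    <div>
      {/* let the user pick among the enumerated cameras */}
      <select onChange={(e) => setDeviceId(e.target.value)}>
        {deviceList.map((d) => (
          <option key={d.deviceId} value={d.deviceId}>
            {d.label || "camera"}
          </option>
        ))}
      </select>
      <Webcam
        ref={webcamRef}
        className="camera-feed"
        screenshotFormat="image/png"
        videoConstraints={{ deviceId, width: 512, height: 512 }}
      />
    </div>
  );
}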

Finally, I don’t want people to think about the realistic capture from their phone’s camera. That means manipulating the camera view in some way to obscure detail, so here I’m using some CSS filters. At first I wanted to use SVG filters; those work fine on desktop, but fortunately/unfortunately, iOS will just turn them off because they’re not performant enough. So I’ve limited myself to this primarily blurry view.

.camera-feed {
  filter: blur(calc($sz * 0.05)) contrast(2.43) grayscale(0.72) hue-rotate(219deg);
}

I have to be honest: I’m not fully satisfied with this digital prototype as the end result. I think this camera really, really wants to be a physical object. And I am lucky enough to have coworkers who are really good at creating physical objects. Watch this space for updates…

Or, build on it yourself:

https://github.com/jftesser/see-the-vibe
