Where there’s a Skill there’s a way: Piloting Box’s new AI feature in San Jose

In a world where our content is increasingly digital, it feels like searching for an exact piece of information should be simple. But when was the last time you scrolled through your phone’s photos looking for a specific moment and had trouble finding it? While content is becoming more accessible, its sheer volume has made searching harder.

This past October, cloud storage company Box announced a new feature called Skills, which uses artificial intelligence to make content like photos and video “smarter.” When we saw this, we thought it had the potential to help residents and city staff more efficiently search the vast video archives of our city meetings.

We explain more below, but if you want to jump straight to the experience, here are our Box Skills-enabled videos: San Jose Smart Cities & Service Improvements Committee meetings.

Box Skills

Box Skills promo video

Based in Redwood City, Box is a cloud file storage and sharing company that went public in 2015. Their newly announced Skills feature brings machine learning and artificial intelligence technology from companies like IBM, Google, and Microsoft to content stored on Box.

There are three initial skills: image intelligence, video intelligence, and audio intelligence. For example, image intelligence might look at a photo from a summer picnic and identify the presence of people, a basket, and trees. Audio intelligence might listen to a recording of a customer service call and automatically produce a transcript.

The Pilot

The City of San Jose holds nearly 20 city council and committee meetings a month, ranging from 2 hours to nearly a day long. These meetings determine policy and give city staff direction, so what is discussed in them is vital to the operation of the city. However, finding exactly what was said is not always easy.

In our pilot, we save videos from our Smart Cities & Service Improvements committee meetings onto a city Box account. Box Skills then automatically processes the video to recognize who is speaking when, pick out topics that are discussed in the videos, and produce a transcript.

In 2016, the San Jose City Council unanimously passed our Smart City Vision. One of its key pillars is to be a user-friendly city, and we hope this pilot helps us understand how to make our City Council archives more accessible and usable. Another pillar is to be a demonstration city: “a laboratory and platform for the most impactful, transformative technologies.” Box Skills is still a pre-launch feature and not yet available to the public, so we’re excited to work with the Box team to pilot this new technology.

Note that the Box Skills experience is currently only available on desktop.

A quick guide to Box Skills

Here’s a quick guide to using Box Skills. When you open a Skills-enabled video, you will find a panel on the right side with a few different sections.

In the “Faces” section, you can hover over each face to see who the speaker is. Clicking on a speaker reveals a series of dots, each marking a point in the video when that person is speaking. Clicking on a dot jumps directly to that point in the video.

The “Faces” section of a video with Box Skills enabled

If you scroll past the “Faces” section (or if you collapse the section by clicking on the header), you’ll find the “Topics” section. These are topics picked out by the artificial intelligence that Box Skills employs.

Clicking on a topic brings up a bar similar to the one in the “Faces” section, marking each point in the video where the topic is discussed. Clicking on a dot brings you directly to that point in the video.

The “Topics” section of a video with Box Skills enabled

Finally, at the bottom is the “Transcript” section. This transcript is automatically generated, and clicking on a block of text also brings you to that point in the video.

The “Transcript” section of a video with Box Skills enabled
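For readers curious about how a timestamped transcript makes a video searchable, here is a minimal sketch in Python. It is not Box's implementation or API. The `TranscriptBlock` structure, the function names, and the sample data are all hypothetical and made up for illustration. The sketch assumes only that each block of transcript text carries the time at which it starts in the video, which is what lets a click jump to the right moment.

```python
from dataclasses import dataclass


@dataclass
class TranscriptBlock:
    """Hypothetical structure: one block of automatically generated text."""
    start_seconds: float  # where this block begins in the video
    text: str             # the transcript text for this block


def find_mentions(blocks, term):
    """Return the start time of every block whose text mentions `term`."""
    needle = term.lower()
    return [b.start_seconds for b in blocks if needle in b.text.lower()]


def format_timestamp(seconds):
    """Format a time in seconds as H:MM:SS for display."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


# Made-up sample data, not taken from a real meeting.
blocks = [
    TranscriptBlock(12.0, "Welcome to the committee meeting."),
    TranscriptBlock(95.5, "Next is the broadband strategy update."),
    TranscriptBlock(4210.0, "Returning to broadband, staff recommends a pilot."),
]

# Print the points in the video where "broadband" comes up.
for start in find_mentions(blocks, "broadband"):
    print(format_timestamp(start))
```

With the sample data above, the loop prints two timestamps, one for each block that mentions the search term. The dots in the “Faces” and “Topics” sections can be thought of the same way: a list of times attached to a person or a topic.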

Next steps

We’re looking forward to hearing from you. Check out our Box Skills-enabled Smart Cities & Service Improvements Committee meeting videos and then share your feedback with this 1-minute survey.

We also have lots of questions to answer. We’re looking forward to evaluating the cost of scaling, compliance with city record-keeping policies, and whether we can meaningfully improve the experience of interacting with our video content. Stay tuned!


Thanks to the Box team (especially Jeff Wilfong, Audrie Plant, Sonny Hashmi, Leslie Higgins, and Katie Lee) and the city team that made this pilot possible: San Jose Director of Communications Rosario Neaves, Craig Justson, and Astra Kredel from the City Manager’s Office; City Clerk Toni Taber; Chief Information Officer Rob Lloyd and Michael Dunn from IT; and Chief Innovation Officer Shireen Santosham from the Mayor’s Office.