What’s next for automated transcription?
We talked to Trint founder Jeff Kofman about where the company stands two years after winning Startups for News 2016 with its nifty automated transcription tool for journalists.
Trint recently teamed up with the Associated Press and they’ve rolled out a new extension for captions on Adobe Premiere Pro CC. They’re also currently working on a Translation Project, which has secured Google DNI funding. In our conversation, Kofman told us about his career change from reporter to startup CEO, how he manages people, and how to integrate translation into the workflow.
The interview was lightly edited for clarity and brevity.
GEN: How did you manage the transition from being a full-time journalist to becoming a startup CEO?
Jeff Kofman: I wish I could say it was something I planned carefully, but it wasn’t. It just happened. After thirty years as a broadcast journalist, I decided to leave ABC News where I was London Correspondent. I wasn’t sure what I was going to do, but I wanted a new challenge. Through a chance meeting while researching a university journalism course I was building, I met some developers who had done an interesting experiment with automated transcription. I have always hated transcribing my own interviews. It was one of those ‘light bulb’ moments you see in a cartoon. I saw the potential to collaborate to build something really exciting and really disruptive. I had no idea how hard it would be.
How do you deal with the management of people?
Good question. As a reporter and foreign correspondent I had never managed people. It was completely new to me. In the early months I found myself struggling to understand how to empower and motivate our small team without stepping on toes. Through University College London, I took an intensive three-day course in leadership and management. It was a complete revelation. It gave me a vocabulary to understand how to set goals and delegate. It completely changed the way I operated.
What advice would you give to someone who was considering making the same career leap?
Choose a project that you really believe in. This is as hard as anything you’ll ever do, so you’ve got to care about it and be excited by it to keep going. And force yourself to ask some tough questions: Is this a product the world really needs? Does it give people enough value that they will actually pay for it?
What does your typical day look like?
The three words I live in fear of most in an email are ‘let’s have coffee’. I love meeting people but I have to be really ruthless about how I allocate time. We’re growing really fast. We’ve officially transitioned from startup to scaleup. We have just opened our North American HQ in Toronto. We just hired employee #41 and we are recruiting for another nine positions. We should be a team of 50 by Christmas, which is incredible — we were a team of just ten in the spring of 2017. My days are crammed with meetings and really difficult decisions. I have zero business background so I spend a lot of time learning. Fortunately I am surrounded by a really strong management team, so I no longer feel like I am rowing alone.
How does automated audio transcription differ from automated video transcription? What are the different tools required?
We don’t differentiate between audio and video transcription. They both achieve the same end. The challenge is that as good as automated speech-to-text is, it makes mistakes. As journalists we can’t live with unreliable data. Trint set out to push automation as far as it will go and then give users a very simple platform to polish the machine-generated transcript to perfection. People tell us our solution is magic: we married a text editor to an audio/video player. We stitch the source audio to the machine-generated words on the screen. That way you can follow the sound like karaoke: you can search it, verify it, and correct it if necessary. Users tell us we save 75%-90% of the time they would spend on manual transcription.
What does your partnership with AP look like? How do you work together? And how did the partnership come about?
The Associated Press is a really visionary news organisation. They recognise that they can’t possibly incubate the cutting edge innovations that will allow journalism to realise efficiencies and exploit the potential of technology. They carefully select startups to collaborate with, so that they can help shape the solutions. They understand that innovation is risky, takes time, and sometimes requires trial and error. They have the patience and foresight to accept these things even though they are a huge organisation. They are very demanding but incredibly good to work with. It reflects a very forward-thinking leadership that you don’t always find in legacy news organisations. We are now rolling out Trint to 600 AP journalists around the world in the first year. We are already integrated into their software, so the workflow is seamless.
Are we seeing significant advances in speech recognition and in what ways do you think it has improved? How has this had an impact on your product?
The underlying code for speech recognition goes back several decades. What’s changed in the last few years is the speed of computing and the storage capacity. That’s why we are seeing such amazing accuracy. It’s only going to get better. The challenge now is dealing with regional dialects and foreign accents, but that too will get better with time.
From your experience, is there a language that’s particularly difficult to work with?
Some languages are easier than others. Spanish is completely phonetic: every letter is pronounced. In French and English you have a lot of silent letters and sounds that are similar but spelled very differently (through, though, row, thou) so grammatical context is key. Did the boy do a summer salt? Or a somersault?
You recently won a Google DNI grant for the Trint Translation Project. Can you tell us a little more about it?
The Trint Translation Project (TTP) will add AI-powered translation to Trint’s AI transcription software, offering translation of content for fast, affordable, global distribution of news into 100+ languages. In both transcription and translation Trint gives users a first draft; Trint’s innovation is its workflow: our software makes it easy to search, verify, edit, share and collaborate with the machine-generated content.
There are 24 languages in the European Union and more than 7,000 languages across the world; yet 85% of the world’s population speaks the top 100 languages. TTP will create a single, unified platform to access translation for global distribution of multilingual content in the world’s major languages.
At present, translation is a time-consuming, expensive, fragmented process. A lot of steps are involved and a lot of resources are required, many of which are siloed and don’t integrate into existing workflows. Because of this, most people simply don’t bother translating their content. Huge amounts of content never get disseminated simply because translation is an unfriendly process. As a result, millions of people around the world miss out on this diverse content and remain closed off from anyone who doesn’t speak the same language.
That’s why we want to build TTP. Lots of content is limited by language, but TTP will provide a fast, streamlined, collaborative, efficient and affordable way to both access and share content in multiple languages while exponentially increasing reach.
And finally, was winning Startups for News 2016 helpful for you?
Early validation is really important for startups. When we won Startups for News we were still in beta and we were in very early revenue. The recognition helped set us apart and gave us an extra boost of profile and legitimacy. That’s incredibly valuable when you are struggling to get off the launch pad.