Sensei Stories: Tim Converse on Sustaining a Long Career in Machine Learning and Implementing ML Solutions that Empower Digital Creatives
AI and machine learning have developed at a rapid pace; now, these technologies are at the core of workflows, helping creators and business leaders automate mundane tasks and gain better control of processes at work. Few in the business have seen this progress quite so clearly as Tim Converse. He’s a senior director in the Sensei & Search team, managing the Applied Science and Machine Learning team. His team helps get innovative ML-based features into Adobe products through a mix of novel ML modeling and tech transfer. It’s a position he’s been well prepared for.
Since 2004, Tim has worked in machine learning at companies like Yahoo, leaving at times to join or even found startups in the space. His move to Adobe was special because it presented a new way of working with ML technology. “Although I’ve worked in machine learning for a long time, it was mostly targeted at social, search, and e-commerce applications, using ‘shallow’ ML techniques on behavioral data,” he said.
“One of the pleasures of my job at Adobe is the sheer diversity of products and domains we get to work on with deep learning techniques: images, video, text, design, and soon, audio and music.” Read on for more insights from Tim into what it takes to forge a long and successful career in ML, and for his thoughts on what’s next for a field that, despite its unparalleled growth, still shows immense potential to further revolutionize the world around us.
What kind of hurdles have you had to overcome in getting ML technologies into digital products, at Adobe and elsewhere?
It often takes a mindset shift for product and engineering people to embrace ML solutions. They are used to being able to specify every aspect of how a software product works, and they can apply a very crisp standard to whether it’s working or not. When software isn’t working correctly, it’s a bug, and someone should get in there and fix it. That’s not always applicable to ML-based solutions.
This was most obvious in previous jobs, when I was working on web search and e-commerce search. Someone in a product role might say, ‘Hey, I did this query, and result number four isn’t very good.’ The appropriate response is something like, ‘OK, thanks for the feedback, we’ll look at it.’ The likely response to *that* is, ‘OK, so when will it be fixed? Will you get back to me when result #4 is fixed?’
And although it will make you unpopular, what you really should say is, ‘You know, we don’t fix individual search queries.’ That’s because there’s a nearly infinite number of possible queries and results. It’s in the nature of ML that, when it’s working well, it will get things right nearly all the time, but never all the time. Sure, we can patch a particular query/result if there’s a huge PR or legal issue with that particular one. In general, though, we can spend our time either fixing queries one by one, or doing ML work that will improve millions of queries.
You still want to look at the bad examples, because they might indicate some systematic bug, but having some level of error is just intrinsic to the ML process. And, if you embrace it thoroughly, then you have to operate a step removed from the actual program behavior — not at the level of exactly what should happen, but at the indirect level of models and training data. What you get in return for this comparative loss of control is solutions that scale massively, both to many users and many situations.
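As a rough illustration of judging a ranker "a step removed," here is a minimal Python sketch that scores a search system by an aggregate metric over many judged queries rather than by any single result. The queries, the 0/1 relevance labels, and the function names are all hypothetical, and precision at k is just one common choice of metric, not necessarily the one used by any team mentioned here.

```python
from statistics import mean


def precision_at_k(relevance, k=5):
    """Fraction of the top-k results judged relevant (1) rather than not (0)."""
    top = relevance[:k]
    return sum(top) / len(top) if top else 0.0


def mean_precision_at_k(judgments, k=5):
    """Average precision@k across every judged query."""
    return mean(precision_at_k(labels, k) for labels in judgments.values())


# Hypothetical judgments: query -> 0/1 relevance labels for the ranked results.
baseline = {
    "red shoes": [1, 1, 1, 0, 1],   # result #4 is the one someone complained about
    "laptop bag": [1, 0, 1, 1, 0],
    "phone case": [0, 1, 1, 0, 0],
}

# Option A: hand-patch result #4 for one query; the other queries are untouched.
patched = dict(baseline)
patched["red shoes"] = [1, 1, 1, 1, 1]

# Option B: a (hypothetical) model change that helps several queries a little,
# while still leaving some errors, including the original complaint.
retrained = {
    "red shoes": [1, 1, 1, 0, 1],
    "laptop bag": [1, 1, 1, 1, 0],
    "phone case": [1, 1, 1, 0, 0],
}

for name, judgments in [("baseline", baseline), ("patched", patched), ("retrained", retrained)]:
    print(f"{name}: mean precision@5 = {mean_precision_at_k(judgments):.3f}")
```

With only three toy queries the hand patch still moves the average noticeably; on real traffic a single patched query would be one among millions, while a model change applies to all of them.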
How is developing ML solutions for creatives and creative workflows different from other types of digital products?
A lot of the work I’ve done before coming to Adobe is in ranking or recommendation. The division of labor between user and machine is really clear. The user’s not expecting that they will have to do the ranking, and the search engine is not doing anything that’s the user’s job. With creative tools, you want to create ML features that make appropriate suggestions. You may also want features that point out problems, offer critique, and otherwise assist the digital creative. And where the next step is obvious, you want a feature that helps the user complete their task as quickly as possible.
We’re focused on taking the drudgery out of creative work. Imagine you have to look at every photo and change the colors on each to be brand appropriate. Tasks like these, which you have to do over and over but wouldn’t expect a computer program to do for you, are such a sweet spot. Features that automate the uninteresting aspects of creative work hopefully free the user to do the things that are harder and require more human creativity.
What is challenging is determining the boundary of what’s in a creative’s domain: what you can do creatively in software without overstepping or misinterpreting user intent. I think it’s a much more delicate dance than with services where the contract is much better understood.
Often, walking this line means collaboration across multiple teams: the people working on the AI models, the people working on search engineering, those designing UX, and researchers extending fundamental capabilities. It comes with a lot of back and forth, as we all collaborate on the visual and UX expression of the underlying machine learning work.
What’s your advice for others who want a long and healthy career in machine learning?
If you’re starting off now, there’s no shortage of ways to learn the basics of ML. This is particularly true for people who are already software engineers. Online courses, books, video lectures, explainer articles, and sample code in GitHub projects are all entry points. Learn Python, and then one of the modeling frameworks like TensorFlow or PyTorch, and try building models. Kaggle competitions are a good thing to try because they give you non-trivial datasets to work with. This can give you a flavor of what the work is like.
This ease of education and entry can easily be overstated, though. In this world of MOOCs, incubators, and bootcamps, you can get the false idea that there’s a fast lane that’s a replacement for having spent 10 years studying for a B.Sc. and then a Ph.D. That’s just not true. People who have done that have a strong command of all the math and theory that underlies these models, so there are aspects of that kind of work that a brief online course just can’t prepare you to do. But ML is cool in that there are ways to contribute all the way up and down the stack, and at all levels of difficulty.
The most important question to ask is whether you would enjoy the work. If, all of a sudden, crossword-puzzle-solving skills became in demand and lucrative, you might have people asking, ‘Hey, how can I get one of these crossword-puzzle-solving jobs?’ The first question to ask is, ‘Well, do you actually *like* solving crossword puzzles?’ Because if you get one of those jobs, that’s what you’ll be doing all day long. It had better be something you like, or you won’t be good at it anyway. The main activity in industrial machine learning is ‘futzing around’: fiddling with data, tweaking programs, and writing not-very-interesting supporting code that makes it possible to get to the point of having an ML model at all. Some people find this work very boring and others find it incredibly absorbing, so the first question you’d want to answer is: which kind of person are you?
What’s next for machine learning?
There’s been this huge explosion in capability from deep ML models, which has made it possible to automate a lot of tasks, particularly those that start with low-level inputs like image pixels or words and letters in text, and draw high-level conclusions from them (just like people do routinely). It’s pretty effortless for humans to look at a photograph and name the types of objects that are in it and, for the first time, we have computer programs that can do the same thing pretty reliably.
At the same time, the really hard problems in AI (judgment, commonsense reasoning, world knowledge, practical planning, linking different sources of knowledge together) seem almost as far away as they’ve ever been, despite the successes of deep learning models. It’s really interesting that deep-learned models can understand so much about the structure of images or the structure of texts, without any real knowledge of the physical or social worlds that produced them. A deep-learned model can easily identify all the possible dog breeds in images, for example, without having any idea that dogs breathe or run or eat or make good pets. There’s still a huge space left to explore: convincing and useful deep-learned capabilities that don’t assume these ‘AI complete’ problems have been solved.
All of these advances in computer vision are going to make it much easier to find things in images and video, and multi-modal fusion with language models is going to make it easier to talk to the computer about what’s in images and videos (maybe at the same time that you’re making them).
Dialog systems are still pretty brittle but getting better very quickly, and the range of things you can have an extended ‘conversation’ about with a computer is growing. Image and video understanding will help automate away a lot of the drudgery of the routine parts of creative work, leaving the creator free to pursue their vision and/or get their work done faster. Language understanding will make it possible to ask questions of huge document repositories and get back real answers, regardless of how the questions and answers are phrased. It will be fascinating to see what the global societal impact will be when machine translation gets to be really good, cheap, and ubiquitous; when you can understand news video from faraway places as though you were hearing it in your own language, for example.
Why are you so passionate about your work in machine learning?
It’s fun to see machines do something really hard, that you wouldn’t expect a machine to be able to do.
But it’s also because ML is such a force multiplier. In the early days of Apple, Steve Jobs would motivate people to work on improving slow boot-up times by multiplying the waiting time by the number of users and comparing that to human lifetimes. He would say things like: ‘You just saved ten lives!’ It’s similar when you ship an ML-based feature or improvement to millions of users — even if it makes someone’s day only slightly better or easier, the aggregate effect can be huge. It’s very satisfying.
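The arithmetic behind that kind of claim is simple to sketch in Python. Every number below (seconds saved, uses per year, user count, and a 75-year lifetime) is a made-up assumption for illustration, not a figure from Apple or Adobe.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def lifetimes_saved(seconds_saved_per_use, uses_per_user, users, lifetime_years=75):
    """Total time saved across all users, expressed in human lifetimes."""
    total_seconds = seconds_saved_per_use * uses_per_user * users
    return total_seconds / (lifetime_years * SECONDS_PER_YEAR)


# Hypothetical: a feature saves 10 seconds per use, each user hits it once a day
# for a year, and it ships to 5 million users.
saved = lifetimes_saved(seconds_saved_per_use=10, uses_per_user=365, users=5_000_000)
print(f"Roughly {saved:.1f} lifetimes of waiting saved per year")
```

Under these assumptions a ten-second improvement adds up to several lifetimes per year, which is the force-multiplier effect described above.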
For more on how Adobe is using cutting-edge AI and machine learning technology to revolutionize creative workflows, head over to the Adobe Sensei hub on our Tech Blog and check out Adobe Sensei on Twitter for the latest news and updates.