Adobe Sensei Stories: Meet Kate Sousa, Data Scientist Constantly Improving Machine Learning Technology at Adobe
When working in machine learning, there are few things as important as data. Quality, quantity, and variety of data form the building blocks of a smarter way of working, powered by AI and ML technologies like Adobe Sensei. “It is often the practice to get data without knowing exactly where it comes from, how it’s curated, and other information that is key. This information is vital to completing an accurate evaluation of machine learning models,” says Kate Sousa.
Kate is a machine learning evaluation data scientist on the Sensei & Search team, which means she spends most of her time on evaluation. She gathers data sets on machine learning models and works with teams to evaluate how effective those models are before and after they’re deployed. In short, she is responsible for the constant iteration of Adobe’s machine learning efforts, always making them smarter and more effective for end users. We asked Kate to give us a peek into the world of a data scientist on the Sensei & Search team, and how she harnessed her background in health sciences to create an exciting career in AI and ML.
What does your work in machine learning at Adobe entail?
Part of my job is creating high-quality, labeled training data using crowd-sourced workers. I also create data sets for models developed both internally and externally that I use to complete evaluations. To do this, I meet with product teams, engineers, and other stakeholders to get a better understanding of what success and failure should look like when the model is deployed to different products. I then reimagine these requirements in a way that allows me to rate the overall performance of a system by getting category labels from crowd-sourced workers. Based on the results from these data, I present my findings to stakeholders.
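To make the idea of rating a system with crowd-sourced category labels concrete, here is a minimal sketch in Python. It is an illustration only, not a description of Adobe's actual pipeline: the function names, the sample asset IDs, and the choice of simple majority voting (with ties skipped) are all assumptions made for this example.

```python
from collections import Counter


def majority_label(votes):
    """Return the most common label among worker votes.

    Returns None when there are no votes or when the top two labels tie,
    so ambiguous items can be left out of the score.
    """
    if not votes:
        return None
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def evaluate(predictions, worker_votes):
    """Compare model predictions with majority-vote crowd labels.

    Returns (accuracy, number_of_items_scored). Items without a clear
    consensus label are skipped rather than counted as errors.
    """
    correct = 0
    scored = 0
    for item_id, predicted in predictions.items():
        consensus = majority_label(worker_votes.get(item_id, []))
        if consensus is None:
            continue
        scored += 1
        if predicted == consensus:
            correct += 1
    accuracy = correct / scored if scored else 0.0
    return accuracy, scored


# Hypothetical sample data: model output and three workers' labels per asset.
predictions = {"img1": "portrait", "img2": "landscape", "img3": "portrait"}
worker_votes = {
    "img1": ["portrait", "portrait", "landscape"],
    "img2": ["portrait", "portrait", "landscape"],
    "img3": ["portrait", "landscape"],  # tie, so this asset is skipped
}

accuracy, scored = evaluate(predictions, worker_votes)
print(f"accuracy={accuracy:.2f} over {scored} scored assets")
```

Skipping tied items keeps ambiguous assets from being counted as model errors. In practice, an evaluator might instead send those items to additional workers for more labels.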
You didn’t originally start in data science. How did you end up working in machine learning?
I actually did not know that this is where I would end up. I created my own degree at UC Berkeley in Interdisciplinary Studies with an emphasis in health policy, human rights, and international development. I wrote my thesis on pandemic influenza in Mexico and pharmaceutical patent legislation.
I did not have internet access until right before college. Working in tech was not something I had even considered as a possibility. However, my undergraduate training had prepared me for the qualitative component of the work I do. My first job out of college was with an organization I was a member of at Berkeley called the Biology Scholars Program. It was at this research think tank that I refined my experience in research and also learned more about my burgeoning passion for data and data science. Because of this, I decided to attend the University of Washington Seattle Information School and earn my MS in Information Management with an emphasis in data science, analytics, and business intelligence.
I interned at Cisco Systems as an IT Analyst in the Big Data, Analytics, and Insights team within Marketing IT, which was my first experience in tech. I accepted a job with this team after my Master’s program and started working on developing evaluation metrics through interactive visualizations using big data. I ended up being the product lead for an internal tool used across the organization for reporting.
While thrilled that I was able to work on these projects, I missed being closer to engineering teams. The role I have right now suits me perfectly — I have the opportunity to learn more about the ways our researchers and engineers are using cutting-edge technology and techniques that are then adapted for our products and used by customers. In other words, I feel like I make more of an impact here than I have previously. It’s a truly exciting and engaging position to be in at any company, especially at Adobe.
Why is it so important to keep iterating and improving on machine learning models in this way?
As with anything that we try to accomplish, it is incredibly hard to get something right on the first try. We acknowledge that the goal for the first pass at something isn’t perfection, but instead to try to create something novel, challenging, and helpful for our customers.
If you constantly had the pressure of being perfect, nothing would get done. With that said, we can take what we learned in the first iteration of creating the model and try to make it work better. For instance, in some cases we notice that there are very specific failure cases that, if tackled, would definitely improve performance. It might be the case that we didn’t account for something in the first training set or that we didn’t think of a product need that we now know exists and could be addressed by adding another class or supplementing our training data.
At the end of the day, it’s about making a model better — there might be only two misclassified assets that a customer sees instead of eight — but I would argue that makes customers trust us more and get better use out of our products. That drives a lot of my work.
Does this also come with unique challenges in a fairly new field?
As with most problems, we tackle them through our own lenses of experience and knowledge. I am fortunate in that I am surrounded by incredibly diverse, knowledgeable colleagues who help address these problems. When thinking through problems like ‘what kind of human presence can be captured in a photograph?’ or ‘how many important ways can you characterize light in an image?’, I can take a first pass, but it usually helps to have a few people in the room also tackling these questions.
For me, this involves working with at least one product-specific representative and my manager, Alex Filipkowski, who is a champion of this type of work at Adobe. His experience in tackling problems in this way has been invaluable to me.
Lastly, I think that encountering challenges in a new and burgeoning field can be really intimidating because there are no obviously right answers. However, this also presents a huge opportunity because many people in the field are trying to tackle these issues to get closer to a “right” answer. For the time being, that gives people doing this work a lot of freedom to refine how they want to tackle and solve these problems.
You come from a health background, originally. How has this helped your work as a machine learning data scientist?
Health is definitely part of my background, but I would instead claim to come from an interdisciplinary background. I have had to dive in and learn about other fields really quickly in order to glean the lessons that I needed for whatever I was working on. This skill has become invaluable to me and my work as a machine learning evaluation data scientist.
The Applied Science team works on quite a few different problems from different fields. I mostly encounter evaluation tasks for computer vision projects, but sometimes I will be asked to work on a natural language processing project or something that requires me to get up to speed rapidly on a field that I have not studied in depth. As with my undergraduate training and early career experience, knowing how to learn this way and where to find the information I need to do it is essential and, luckily, part of the skill set that I developed over time.
Understanding what I don’t know and what I need to know has a lot to do with innate curiosity and the desire to learn. I strongly believe that these two traits are super important to workers in the knowledge economy. Colleagues and other professionals that I admire generally have refined these two traits; not only do they know their field incredibly well, but these skills have enhanced their ability to engage with others in a rich and inquisitive way that often adds to their own work.
Why are you so passionate about machine learning?
Prior to and over the course of my Master’s program, I started teaching myself more about the field and the current work being done in various companies and academic institutions. What struck me the most was not only that many of the ways we interact with the digital world had already started leveraging machine learning, but that the physical world was also going to be dramatically changed by this field that has actually been around for quite a while.
There haven’t been that many technological revolutions over the course of human history. To be an active participant in one, even at a small scale, seems hard to pass up. Because this revolution is happening so quickly, there are a lot of unknowns and opportunities to shape not only the technology that’s being built, but the ways that people interact with these built experiences.
There is a lot of power in this and, simultaneously, a lot of responsibility. This presents a whole host of challenges that we could not have foreseen, but also a lot of opportunities. With all of this in mind, it’s hard to not be passionate about something that solves problems in novel ways.
For more on how Adobe is using cutting-edge AI and machine learning technology to revolutionize creative workflows, head over to the Adobe Sensei hub on our Tech Blog and check out Adobe Sensei on Twitter for the latest news and updates.