Last week, Lightspeed Ventures and Fiddler Labs came together to organize the first Explainable AI Summit in the Bay Area. We had an excellent group of business and technology leaders from across industry and academia, and a panel of experts who drove the discussion, moderated by Lightspeed partner Jay Madheswaran:
- Anima Anandkumar, Director of ML Research at NVIDIA and Bren Professor at Caltech
- Nitesh Kumar, Head of Data Science, Affirm
- Krishnaram Kenthapadi, Tech Lead (Fairness, Transparency, Explainability), LinkedIn
- Peter Skomoroch, Head of AI Automation, Workday
- Krishna Gade, Founder/CEO, Fiddler Labs
The panel discussed a set of very interesting topics covering Explainability, Fairness, Data Privacy, and Model Performance.
One of the main things the panel agreed upon was that Explainability is one of the biggest blockers for enterprise AI adoption. The reasons a company might need Explainability for its ML models vary:
- For some companies, it is a debugging tool to resolve customer complaints.
- For others, it serves as a way to detect bias in their models.
- And for still others, it is a way to analyze the performance of their ML models.
Over the years, ML models have become increasingly complex: companies now use neural networks with many layers and ensembles of different models, from decision trees to random forests. In many companies, it is not a single ML model at work in making a decision for the end user, whether that is approving a loan, showing a news recommendation, or matching someone with a job opportunity.
Hence there is a dire need for tools that explain these complex models. However, the tools built so far only do well for simpler cases, for example, explaining why an ML model predicts an image to be a dog. Oftentimes, where people actually need Explainability is in debugging and understanding the cases where the ML model does not perform well.
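To make the "simpler cases" concrete, here is a minimal sketch of one widely used, model-agnostic explanation technique, permutation importance: shuffle one feature at a time and measure how much the model's accuracy drops. The dataset and model below are illustrative stand-ins, not anything discussed at the summit.

```python
# Sketch: explaining a model via permutation importance (scikit-learn).
# Dataset and model are illustrative; any fitted estimator would work.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Shuffle each feature 5 times and record the mean drop in test accuracy.
result = permutation_importance(model, X_test, y_test,
                                n_repeats=5, random_state=0)

# The features whose shuffling hurts accuracy most "explain" the model best.
top = sorted(zip(X.columns, result.importances_mean),
             key=lambda t: -t[1])[:5]
for name, score in top:
    print(f"{name}: {score:.3f}")
```

Explanations like this are easy to produce for a single well-behaved classifier; they get much harder for the multi-model pipelines described above.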
Explainability, Fairness, Performance, and Privacy involve tradeoffs with one another. Fairness is typically evaluated with respect to protected groups of individuals defined by attributes such as gender or ethnicity, and companies measure it in terms of differences in predictions and in true and false positive rates across those protected groups. For example, if you're building a gender classifier, one measure of fairness could be: are the model's accuracy and coverage the same across genders and skin tones?
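The group metrics just described can be sketched in a few lines of NumPy. This is a toy illustration of the idea, not Fiddler's implementation; the data and group labels are made up.

```python
# Sketch: per-group accuracy, true positive rate (TPR), and false
# positive rate (FPR), the fairness signals described above.
import numpy as np

def group_rates(y_true, y_pred, groups):
    """Return {group: (accuracy, TPR, FPR)} for each protected group."""
    out = {}
    for g in np.unique(groups):
        m = groups == g
        t, p = y_true[m], y_pred[m]
        acc = (t == p).mean()
        tpr = p[t == 1].mean() if (t == 1).any() else float("nan")
        fpr = p[t == 0].mean() if (t == 0).any() else float("nan")
        out[g] = (acc, tpr, fpr)
    return out

# Toy labels and predictions for two hypothetical groups "a" and "b".
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])
groups = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])

rates = group_rates(y_true, y_pred, groups)
# A large gap in TPR or FPR between groups signals a potential fairness issue.
```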
As shown above, the model could be skewed towards one gender or skin tone. This can be due to the many things that can go wrong across the ML lifecycle: defining the ML task at hand, collecting training data, engineering features, choosing the modeling algorithm, testing/validating the model, and deploying it to production.
One can see that Fairness and Privacy are at odds with each other: to make ML models fair, one needs to know more about users' protected attributes such as age, gender, and race, which may conflict with the requirements of privacy regulations such as GDPR.
Similarly, one can argue there is a tradeoff between Accuracy and Fairness, because oftentimes a highly accurate model can be unfair to segments of the population for which not enough labeled data has been available.
While there seems to be a lot of interest in knowing “why is my ML Model doing that?” a simpler and yet more difficult question for Machine Learning teams today is “what is my ML Model doing?”.
A lot of times, an ML engineer or data scientist creates a machine learning model and ships it without a good idea of the model's performance impact. This is primarily because companies lack the tooling to do things like:
- Analyze performance of the models on various granularities of data.
- Run sensitivity analysis across the parameter space of the models.
- Manage versions of these models in production.
- Conduct A/B tests on several versions of these models quickly.
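As one illustration of the first bullet above, sliced performance analysis means evaluating a model on subsets of the data rather than trusting a single global number. Here is a minimal pandas sketch; the segment names, labels, and predictions are invented for illustration.

```python
# Sketch: model accuracy at two granularities of data, overall vs. per segment.
import pandas as pd

# Hypothetical evaluation data: a user segment, the true label, and the
# model's prediction for each example.
df = pd.DataFrame({
    "segment": ["new", "new", "returning", "returning", "returning", "new"],
    "label":   [1, 0, 1, 1, 0, 1],
    "pred":    [1, 1, 1, 0, 1, 1],
})

# A single global accuracy hides how the model behaves on each slice.
overall = (df.label == df.pred).mean()
by_slice = (df.assign(correct=df.label == df.pred)
              .groupby("segment")["correct"].mean())

print(f"overall accuracy: {overall:.2f}")
print(by_slice)
```

Here the overall accuracy looks mediocre but uniform, while the per-segment view reveals the model is much weaker on one slice, exactly the kind of insight the panel found missing from today's tooling.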
The panelists agreed that the kinds of tools people use today, from Jupyter notebooks to deployment systems for ML models, are very old school.
As ML models become first-class citizens, there is a great need for a new kind of developer stack.
One of the lingering questions facing companies planning to adopt AI today is Ethics in AI. They read news stories about Amazon's biased recruiting tool or racially biased face detection algorithms and grow concerned about adopting AI.
Ethics in AI is a very sensitive topic, and rightly so, because as a society we are less tolerant of machines making mistakes than of humans doing so. For example, many car accidents happen every day due to human error, yet when it comes to a self-driving car we expect a higher bar. Similarly, bias has existed in humans for centuries, yet machine bias is something we cannot easily tolerate.
However, unlike human bias, which has been hard to reason about, quantify, or eradicate, AI systems give us a way to examine bias much more objectively and to find ways to address or minimize it.
We had a great turnout of 40+ people at the event, and we would like to thank all the attendees for spending their valuable time and making our first Explainable AI Summit very engaging and informative. Special thanks to the staff at Lightspeed for providing the venue and staying late on a Thursday evening!
If you’re interested in learning more about Explainability and what Fiddler Labs is doing in this space, please send an email to firstname.lastname@example.org.