Responsible AI/ML by Microsoft
Microsoft Machine Learning Foundations

A Deeper Dive: Tools for Responsible AI from Microsoft Viewpoint

Damir Divkovic
9 min read · Sep 11, 2020

--

Machine Learning (ML) and Artificial Intelligence (AI) are becoming deeply integrated into our society, and this trend will continue in the coming decades as AI matures. Today, many processes, decisions, and forecasts are supported by AI technologies that are not always transparent. Over the past decade, the key players in the field of ML and AI have shown a growing commitment to providing frameworks that increase accountability for the outcomes of ML and AI applications. Responsible AI/ML encompasses the values and principles laid out by Microsoft to empower data scientists and developers to innovate in a sensible and trustworthy manner. At its core, the ML/AI platform, process, and models must be trustworthy. Responsible machine learning at Microsoft rests on the following foundations:

  • Understand
  • Protect
  • Control

The first pillar is understanding your ML models: explaining and interpreting their behavior, and assessing and mitigating unfairness.
The second pillar is protecting people and their data: using differential privacy to prevent data exposure and homomorphic encryption to work with encrypted data.
The third pillar is controlling the end-to-end machine learning process: using datasheets to document the ML lifecycle and keeping an audit trail.

Unintended consequences of ML and AI technologies should be anticipated and mitigated. Deployments of these technologies are increasingly held back by a lack of end-user trust in their transparency, accountability, and fairness. We are witnessing a growing number of incidents in which users experience real harm, so addressing possible undesirable outcomes is a must.

It is therefore critical to provide an understanding of model behavior, protect people and their data, and provide control over the development process. Let’s dive deeper into each of these foundations.

Understanding your ML models

The first pillar, understanding, answers the question: do we really understand why a particular model is predicting the way it is? The ability to understand model behavior includes interpretability and unfairness assessment. For this, Microsoft offers out-of-the-box visualizations that facilitate explanation of model behavior and help detect and mitigate unfairness. Model interpretability enables developers and data scientists to understand model behavior and provide model explanations to stakeholders and customers. Prioritizing fairness is a challenge, and leveraging specialized algorithms helps ensure fairer outcomes for everybody.

Protect people and their data

The second pillar is protecting people and their data by applying differential privacy techniques to protect sensitive data and prevent leaks. Confidentiality is maintained by encrypting data and by building models in a secure environment. These techniques help organizations build solutions while preserving data privacy and confidentiality. Differential privacy prevents disclosure of private information without significant accuracy loss. With homomorphic encryption, cloud operators never have unencrypted access to the data they store, and computations are performed directly on the encrypted data.

Control machine learning process

The third pillar is control of the development process. The process should be repeatable and reliable, and it should hold stakeholders accountable. Built-in lineage and audit trail capabilities enable a responsible process by documenting model metadata to meet regulatory requirements. Microsoft provides capabilities to automatically track the origin of ML assets and maintain an audit trail. Datasheets provide a standardized way to document ML information such as motivations and intended uses.
Details of ML assets are captured in a central registry, which allows organizations to meet various audit requirements. Datasheets are a way to document the machine learning assets used and created as part of the machine learning lifecycle. Model information may include architecture, training and evaluation data, performance metrics, and fairness issues.

The Six Principles of Responsible AI

Microsoft’s Responsible AI journey began with establishing six key principles to guide the development and use of AI, which are outlined in the book The Future Computed. These six ethical principles should guide the development and use of artificial intelligence. Microsoft further foresees the evolution of laws, the importance of training for new skills, and even labor market reforms.

The Six Principles of Responsible AI by Microsoft

Trustworthy AI design requires creating solutions that reflect ethical principles and timeless beliefs. These guiding AI design principles are critical to addressing the societal impacts of AI and building trust as the technology becomes more and more a part of the products and services that people use at work and at home every day. These core principles are:

  • Privacy and Security — AI systems should be secure and respect privacy
  • Accountability — People should be held accountable for AI systems
  • Reliability and Safety — AI systems should perform reliably and safely
  • Fairness — AI systems should treat all people fairly
  • Inclusiveness — AI systems should empower everyone and engage people
  • Transparency — AI systems should be understandable

Tools for Responsible AI

Along with the six foundational principles for responsible AI, Microsoft has developed or adopted tools and offered resources to identify and design against potentially harmful issues. The aim of this section is to explain the purpose of these resources and give some examples, so that developers and data scientists can deepen their research further.

This is a complex, fast-developing domain whose aim is to provide an environment for building responsible AI. What are the tools Microsoft views as fundamental in supporting its responsible AI principles?

A Deeper Dive

Each of the six principles of Responsible AI, according to Microsoft, is supported by some combination of the proposed resources.
The tools and resources discussed here are classified into the following types:

  • Guidance papers
  • Open source code
  • Technologies
  • Tools
  • Further training

With guidance papers, Microsoft aims to offer advice or information for resolving a problem or difficulty. Guidance is a reference usually compiled by someone in authority. Open source code is shared through GitHub and usually contains code and discussion of a certain approach. Technologies mostly refer to everything available in the Azure Machine Learning ecosystem. Frameworks and methodologies are referred to as tools. Further training consists of webinars, keynotes, and other resources that are added daily.

It’s important to make sure AI works well for everyone. To accomplish that, the use of guidelines, assessment tools, mitigation techniques, and continuous monitoring is recommended. Finally, tools for transparency and accountability help ensure that AI models adhere to the other ethical principles.

Let us dive deeper into the tools Microsoft views as fundamental in supporting its Responsible AI principles, organized into sections by resource type.

Type of resource: Guidance Papers

Guidance Papers

“Securing the Future of Artificial Intelligence and Machine Learning at Microsoft” offers guidance on how to protect algorithms, data, and services from security threats specific to AI and ML. The paper also shares potential remediations for these emerging engineering challenges.
“Fairness and Abstraction in Sociotechnical Systems” is a paper from the ACM Conference on Fairness, Accountability, and Transparency. Here you’ll find how to avoid the five most common pitfalls of fair-ML work.
“Counterfactual Fairness” is a paper from Cornell University that offers a framework for modeling fairness using causal inference. Causal inference is a statistical tool that enables AI and machine learning algorithms to reason about cause and effect rather than correlation alone.
The “Inclusive Design Toolkit” and “Inclusive Design Practices” are two references that help data scientists and developers learn how to understand and address unintentional issues in a product environment that could exclude people.
“Algorithmic Greenlining” is a Microsoft research paper offering an approach for developers and decision-makers to improve selection criteria, producing high-quality and diverse results in different contexts, e.g., mitigating gender bias.
“Datasheets for Datasets” is a paper that encourages people preparing training datasets to maintain a datasheet with key information. The content of a datasheet can include the motivation, structure, collection process, and recommended uses for the data and model. Datasheets can increase transparency and accountability, mitigate undesirable biases, and improve the reproducibility of results. They also help researchers and ML practitioners select more appropriate datasets.
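The datasheet idea can be made concrete as a documentation checklist in code. Below is a minimal sketch using a plain Python dictionary; the section names are paraphrased from the paper, not its exact schema:

```python
# An illustrative datasheet in the spirit of "Datasheets for Datasets".
# Field names below are paraphrased, not the paper's exact question list.
datasheet = {
    "motivation": "Why was the dataset created, and by whom?",
    "composition": "What do the instances represent? Are there sensitive fields?",
    "collection_process": "How and when was the data gathered? Was consent obtained?",
    "preprocessing": "What cleaning, labeling, or filtering was applied?",
    "recommended_uses": "Tasks the dataset suits, and known unsuitable uses.",
    "distribution_and_maintenance": "License, access, contact, update policy.",
}

def check_datasheet(sheet, required=("motivation", "composition", "recommended_uses")):
    """Fail fast if a dataset is published without its key documentation."""
    return [field for field in required if not sheet.get(field)]

print(check_datasheet(datasheet))  # [] -- all required sections present
```

A check like this can run in a CI pipeline so that a dataset cannot be registered without its datasheet.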
“About ML” is a multi-stakeholder initiative to develop, test, and promote best practices for machine learning documentation. Such documentation includes a system’s blueprint, its purposes, where the data came from and why it was chosen, and how the model was trained, tested, and adjusted. It also explains the purposes the system is not suitable for. The initiative is overseen by the Partnership on AI (PAI).

Type of resource: Open Source Code

Open Source Code

“Microsoft SEAL (Simple Encrypted Arithmetic Library)” is an implementation of a homomorphic encryption library. This special encryption technique allows users to compute on encrypted data without decrypting it. The results of the computations remain encrypted and can be revealed only by the owner of the decryption key.
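As a toy illustration of the homomorphic principle (this is not SEAL, which is a C++ library implementing far richer lattice-based schemes, and textbook RSA is not secure), unpadded RSA happens to be multiplicatively homomorphic: multiplying two ciphertexts yields a ciphertext of the product.

```python
# Toy demo: unpadded "textbook" RSA is multiplicatively homomorphic,
# i.e. Enc(a) * Enc(b) mod n decrypts to a * b. Illustration only.
p, q = 61, 53
n, e = p * q, 17                       # public key (n, e)
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent

def enc(m):
    return pow(m, e, n)

def dec(c):
    return pow(c, d, n)

a, b = 6, 7
c_prod = (enc(a) * enc(b)) % n   # computed entirely on ciphertexts
print(dec(c_prod))               # 42 == a * b, recovered only with the key
```

A cloud service holding only `enc(a)` and `enc(b)` could compute `c_prod` without ever seeing 6 or 7, which is exactly the property SEAL provides at production scale.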
“Fairlearn” is an approach shaped by Microsoft Research and Microsoft product teams. It can be used to assess the potential unfairness of ML systems that make decisions about opportunities, resources, or information. Fairness is a sociotechnical challenge, so its tools are appropriate only in limited circumstances. A Python implementation of this approach is available on GitHub.
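A core metric behind such assessments is demographic parity: do different groups receive positive decisions at similar rates? The sketch below computes it in plain Python on made-up loan-approval data, mirroring the kind of metric Fairlearn reports rather than calling its API:

```python
def selection_rates(y_pred, groups):
    """Fraction of positive predictions per sensitive group."""
    rates = {}
    for g in set(groups):
        preds = [p for p, gr in zip(y_pred, groups) if gr == g]
        rates[g] = sum(preds) / len(preds)
    return rates

def demographic_parity_difference(y_pred, groups):
    """Gap between the most- and least-favored group's selection rate."""
    rates = selection_rates(y_pred, groups)
    return max(rates.values()) - min(rates.values())

# Hypothetical loan approvals (1 = approved) by a sensitive attribute.
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(selection_rates(y_pred, groups))                 # A: 0.75, B: 0.25
print(demographic_parity_difference(y_pred, groups))   # 0.5
```

A difference of 0 would mean both groups are approved at the same rate; here the 0.5 gap would flag the model for investigation.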
“AirSim” is a valuable open-source Microsoft tool for improving simulated training environments. AirSim is a simulator built on Unreal Engine. It is cross-platform and supports software-in-the-loop simulation with popular flight controllers.
“InterpretML” is used for training interpretable models and explaining black-box systems. It is an open-source package created by Microsoft Research that implements a number of interpretable models. It also supports several methods for generating explanations of black-box model behavior and predictions.
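One generic black-box explanation technique of this kind is permutation importance: shuffle one feature at a time and measure how much the model's accuracy drops. A minimal sketch with a hypothetical model (not InterpretML's API):

```python
import random

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Black-box feature importance: shuffle one column at a time and
    measure the mean accuracy drop (bigger drop = more important)."""
    rng = random.Random(seed)
    base = sum(model(row) == label for row, label in zip(X, y)) / len(y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            Xp = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
            acc = sum(model(row) == label for row, label in zip(Xp, y)) / len(y)
            drops.append(base - acc)
        importances.append(sum(drops) / n_repeats)
    return importances

# A "black box" that in fact only looks at feature 0.
model = lambda row: int(row[0] > 0.5)
X = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.4], [0.1, 0.9], [0.8, 0.6], [0.3, 0.2]]
y = [model(row) for row in X]
imps = permutation_importance(model, X, y)
print(imps)  # feature 0 gets a positive score; feature 1 gets 0.0
```

The explainer needs only model inputs and outputs, which is why the same idea applies to any opaque system.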

Type of resource: Technologies

Technologies

The “Azure Confidential Computing” solution enables developers to take advantage of Trusted Execution Environments (TEEs) without having to rewrite their code. A consistent API is used for developing enclave-based computing applications. Code and data are protected against viewing and modification from outside the TEE. Multi-party computation (MPC) allows parties to share data while preserving input privacy and guaranteeing that no party sees the other members’ information.
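Additive secret sharing is one of the simplest MPC building blocks: each input is split into random shares, the parties compute on shares locally, and only the aggregate is ever reconstructed. A minimal sketch with made-up numbers (not Azure's implementation):

```python
import random

MOD = 2**61 - 1  # all arithmetic is done modulo a large prime

def share(secret, n_parties=3):
    """Split a secret into additive shares; any n-1 shares alone are
    uniformly random and reveal nothing about the secret."""
    shares = [random.randrange(MOD) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % MOD)
    return shares

def reconstruct(shares):
    return sum(shares) % MOD

# Two hospitals privately sum patient counts: each secret-shares its
# input, parties add shares locally, and only the total is revealed.
a_shares = share(120)
b_shares = share(85)
sum_shares = [(x + y) % MOD for x, y in zip(a_shares, b_shares)]
print(reconstruct(sum_shares))  # 205
```

No single party ever holds both inputs in the clear, yet the correct total is recovered, which is the essence of the input-privacy guarantee described above.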
“Differential Privacy” (formerly WhiteNoise) is a technology for training ML models using private data. It adds random noise to ensure that the model’s output does not noticeably change when any single record in the dataset changes.
The “Data Drift Monitoring” feature in Azure Machine Learning detects changes in data that may degrade prediction performance, enabling developers to maintain model accuracy over time.
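One common way to quantify such drift is the Population Stability Index (PSI) between a training-time baseline and a production sample of a feature. A plain-Python sketch (illustrative; not the Azure ML API):

```python
import math

def population_stability_index(expected, actual, n_bins=5):
    """PSI between a baseline sample and a production sample.
    Rule of thumb: < 0.1 stable, 0.1-0.25 moderate drift, > 0.25 major drift."""
    lo, hi = min(expected), max(expected)
    edges = [lo + (hi - lo) * i / n_bins for i in range(1, n_bins)]
    def bin_fracs(values):
        counts = [0] * n_bins
        for v in values:
            counts[sum(v > e for e in edges)] += 1
        # Small floor avoids log(0) on empty bins.
        return [max(c / len(values), 1e-4) for c in counts]
    e_frac, a_frac = bin_fracs(expected), bin_fracs(actual)
    return sum((a - e) * math.log(a / e) for a, e in zip(a_frac, e_frac))

baseline = [i / 100 for i in range(100)]           # uniform on [0, 1)
same     = [i / 100 + 0.001 for i in range(100)]   # essentially unchanged
shifted  = [i / 200 for i in range(100)]           # mass moved to low values
print(population_stability_index(baseline, same))     # ~0, no drift
print(population_stability_index(baseline, shifted))  # large, clear drift
```

Running such a check on each feature in production and alerting when PSI crosses a threshold is the essence of what a drift monitor automates.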
The “Model Interpretability” feature is used to explain why a model makes certain predictions. It can be used to debug the model, validate that its behavior matches objectives, and check for bias.
“Microsoft MLOps” is the DevOps feature set in Azure Machine Learning. MLOps makes it easier to track, reproduce, and share models and version histories. It offers centralized management throughout the entire model development process and helps teams observe model performance by collecting application and model telemetry.

Type of resource: Tools

Tools

The “PSI (Private data Sharing Interface) Tool” uses differential privacy to enable researchers to explore and share datasets that contain private information. The tool was developed by Harvard researchers.
The “Methodology for reducing bias in word embeddings” helps reduce gender bias by modifying embeddings to remove gender stereotypes. The methodology was created by Microsoft Research.
“Pandora” is a debugging framework designed to identify reliability and bias problems within machine learning models. Developed by Microsoft Research, it uses interpretable machine learning techniques to discover patterns and identify potential issues.
“AirSim”, as noted above, is a valuable open-source tool for improving simulated training environments, and it falls into both the tools and open-source resource categories.

Type of resource: Further Training

Further Training

The “Webinar on Machine Learning and Fairness” helps you understand the unique challenges regarding fairness in ML. You’ll learn how to detect and mitigate bias in your development and deployment of ML systems.
The “NIPS keynote” (2017) will give you more on how organizations should approach assessing the fairness of their AI models.
The “Responsible AI” resources are designed to help you use AI responsibly at every stage of innovation, from concept to development, deployment, and beyond.

Conclusion

Responsible AI/ML encompasses the values and principles laid out by Microsoft to empower data scientists and developers to innovate in a sensible and trustworthy manner. The three pillars of understanding, protection, and control are laid out to facilitate building trust in AI and ML technology. The guiding AI design principles of privacy and security, accountability, reliability and safety, fairness, inclusiveness, and transparency are critical for mitigating the negative societal impacts of AI. Machine learning is a dynamic domain, and new resources are added daily. To help data practitioners and developers build AI responsibly, Microsoft has offered its own resources and adopted valuable contributions from other groups and institutions. The classification proposed here can be used to maintain your own resource lists, not only for Microsoft’s responsible AI initiative but for other providers and key players as well. The emphasis on Microsoft’s views was intentional. The inspiration for this article came from attending the free Udacity foundation course “Introduction to Machine Learning on Azure,” with its emphasis on Microsoft technologies implemented within the Azure ML Studio ecosystem.


Damir Divkovic

Formal education in computer science and electrical engineering; an expert in Microsoft technologies and renewable energy. CTO of a small company.