AIML: Data Governance

Chao Zhang
2 min read · Oct 5, 2023


Data Strategy in the Age of AI

Data defines the upper bound of AI/ML model performance.

Data governance is the set of policies, standards, and practices that ensure the availability, accuracy, security, and usability of data.

Data quality

Data quality determines the validity, reliability, and effectiveness of any analysis or insight built on the data. Before using a dataset, validate that it is:

  • accurate
  • complete
  • consistent
  • relevant
  • timely.
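As a rough illustration, several of these properties can be checked automatically. The sketch below is a minimal example under stated assumptions, not a complete validation framework: the record schema (`user_id`, `age`, `country`, `updated_at`), the allowed country codes, and the 30-day freshness threshold are all hypothetical. Relevance is left out because it usually depends on the question being asked rather than on a mechanical rule.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical schema and thresholds, chosen only for illustration.
REQUIRED_FIELDS = ("user_id", "age", "country", "updated_at")
ALLOWED_COUNTRIES = {"US", "CA", "GB"}
MAX_AGE = timedelta(days=30)


def check_record(record, now):
    """Return a list of quality issues found in one record (empty if none)."""
    issues = []

    # Completeness: every required field is present and not None.
    missing = [f for f in REQUIRED_FIELDS if record.get(f) is None]
    if missing:
        issues.append("incomplete: missing " + ", ".join(missing))
        return issues  # the remaining checks need these fields

    # Accuracy (plausibility): the value falls within a sane range.
    if not 0 <= record["age"] <= 120:
        issues.append("inaccurate: age out of range")

    # Consistency: the value matches the agreed reference vocabulary.
    if record["country"] not in ALLOWED_COUNTRIES:
        issues.append("inconsistent: unknown country code")

    # Timeliness: the record was updated recently enough.
    if now - record["updated_at"] > MAX_AGE:
        issues.append("stale: last update older than MAX_AGE")

    return issues


now = datetime(2023, 10, 5, tzinfo=timezone.utc)
records = [
    {"user_id": 1, "age": 34, "country": "US", "updated_at": now - timedelta(days=2)},
    {"user_id": 2, "age": 150, "country": "us", "updated_at": now - timedelta(days=90)},
    {"user_id": 3, "age": None, "country": "CA", "updated_at": now},
]

# Map each user_id to the list of issues found for that record.
report = {r["user_id"]: check_record(r, now) for r in records}
for user_id, issues in report.items():
    print(user_id, issues or "ok")
```

Returning a list of issues per record, rather than a single pass/fail flag, makes it easier to see which quality dimension is failing most often across a dataset.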

Data security

Data security helps data scientists safeguard their data sources, platforms, and applications, and comply with data policies, standards, and regulations.

Data ethics

Data ethics goes beyond the legal and regulatory compliance of data governance and addresses the moral and social implications of data science.

Data ethics helps data scientists ensure that their data practices are fair, transparent, responsible, and beneficial for the stakeholders involved, and avoid or mitigate potential harms, biases, and risks.

Data democratization

Data democratization means making data accessible and understandable to a wider range of users, not just data experts.

Data privacy

User data is crucial: it is a valuable resource for model training and supervised fine-tuning, which helps the system understand users better and generate better responses in the future. It also lets us analyze past interactions and user behavior.

However, we have to comply with privacy laws including HIPAA, FCRA, FERPA, GLBA, ECPA, COPPA, and VPPA. These U.S. federal laws create rules for specific industries and institutions, and they protect against the misuse of certain categories of personal information.
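One common precaution before user interactions reach a training pipeline is to pseudonymize direct identifiers. The sketch below is only an illustration of that idea: the record fields (`user_id`, `text`), the `SECRET_KEY` value, and the simple email pattern are assumptions. Masking of this kind does not by itself establish compliance with any specific law.

```python
import hashlib
import hmac
import re

# Simple illustrative pattern; real-world email matching is more involved.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical key; in practice it would come from a secrets manager.
SECRET_KEY = b"replace-with-a-managed-secret"


def pseudonymize_id(user_id: str) -> str:
    """Map a user ID to a stable keyed hash so records can still be joined."""
    digest = hmac.new(SECRET_KEY, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def redact_emails(text: str) -> str:
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_RE.sub("[EMAIL]", text)


def prepare_for_training(record: dict) -> dict:
    """Return a copy of the record with direct identifiers masked."""
    return {
        "user": pseudonymize_id(record["user_id"]),
        "text": redact_emails(record["text"]),
    }


sample = {
    "user_id": "u-1001",
    "text": "Please reply to jane.doe@example.com about my order.",
}
cleaned = prepare_for_training(sample)
print(cleaned["text"])  # Please reply to [EMAIL] about my order.
```

A keyed hash (HMAC) is used instead of a plain hash so that someone without the key cannot recompute the mapping from known user IDs.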
