Labeling 2.0: Data-Driven Document Labeling Automation
Protecting sensitive data is a defining imperative for information security programs. However, you can only protect what you can see, and you can only protect what you find appropriately if you understand the data.
The stakes for data protection are getting higher: there’s more data, in more places, used for more business reasons, and at higher risk from motivated attackers. Mandates like the EU General Data Protection Regulation (GDPR) further complicate the picture by adding a privacy dimension to data protection strategies and programs.
The traditional model of putting enforcement points everywhere — the brawn in the equation — is the ‘how’ to protect the data. The question of ‘what’ data needs to be protected in order to balance data risk and reward at scale is where the brains element comes in.
Making existing security and privacy enforcement models smarter is where BigID is taking our data intelligence next: using the BigID ‘brains’ to make the enforcement ‘brawn’ more sophisticated.
Data Intelligence: A Force Multiplier
Extracting the business value of data while ensuring security and protecting privacy amplifies the need to make data protection nimbler, smarter and more effectively automated.
This is why BigID has set out to:
• Package our data intelligence and privacy insights so they are easy for enforcement tools to consume.
• Enable customers to seamlessly orchestrate their enforcement policies based on BigID’s data intelligence.
One example is providing data subject residency insight via labeling and tagging to address cross-border data flows that enforcement tools would otherwise be blind to.
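As a rough illustration of the residency example, the sketch below shows how an enforcement point might use residency tags to flag a cross-border transfer. It is a minimal sketch under stated assumptions: the `TaggedObject` structure, the `EU_RESIDENCIES` set, and the function name are hypothetical and do not represent BigID's API or any vendor's policy engine.

```python
from dataclasses import dataclass, field

# Hypothetical, deliberately incomplete set of EU country codes (illustration only).
EU_RESIDENCIES = frozenset({"DE", "FR", "IE", "NL"})


@dataclass(frozen=True)
class TaggedObject:
    """A data object carrying data subject residency tags (assumed structure)."""
    name: str
    subject_residencies: frozenset = field(default_factory=frozenset)


def eu_data_leaving_eu(obj: TaggedObject, destination_country: str) -> bool:
    """Return True when the object holds data on EU residents and the
    destination country is outside the (hypothetical) EU set."""
    has_eu_subjects = bool(obj.subject_residencies & EU_RESIDENCIES)
    return has_eu_subjects and destination_country not in EU_RESIDENCIES
```

Without the residency tags, the enforcement point would see only a file moving between two locations; with them, it can tell whether that movement is a cross-border flow of EU residents' data.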
A Modern Labeling Framework for Modern Data Stores
Data protection, like many other security toolsets, emerged in response to the threats created by new technologies.
• Broader adoption of structured data sources (and data breaches) catalyzed the emergence of DLP.
• The CASB category coalesced around content moving between the enterprise network and the SaaS provider.
• Unstructured data governance tools were developed in response to the proliferation of file shares.
The assumptions of data processing flows that held when these products were first introduced are a thing of the past.
A proactive data protection strategy entails coming to terms not just with expanding data volumes, but also with a diversity of data sources: cloud and semi-structured sources along with structured and unstructured ones.
BigID was built in anticipation of new data sources continually presenting themselves, and we designed both our core discovery and connectivity models on that basis. This translates into the ability to gain visibility and understanding across the data estate that other discovery tools cannot match.
Automating Labels & Tags Using Data Intelligence
To package our data intelligence for consumption, BigID has developed a set of policy labels and attribute tags for data objects that policy enforcement points can consume, such as:
• Information rights management vendors like Microsoft Azure Information Protection.
• Database audit and protection tools like IBM Guardium.
• Network DLP, cloud and email gateway enforcement points.
To extend the brain analogy, labeling and tagging are the connective tissue that allows the brawn to make specific decisions while relying on a centralized and consolidated source of data intelligence.
BigID’s cross-platform intelligence enables customers both to optimize for the specific context and capabilities of each enforcement point and to apply consistent policies across the data estate.
Discovery and indexing findings on data residency, personal information attributes, and data risk are easily integrated through these labeling and tagging ‘artifacts’ and a set of supporting APIs to enable policy orchestration and automation of policy enforcement.
These artifacts can be used to ‘fingerprint’ data or extend classification schemas to both improve the accuracy of existing policies and extend enforcement to privacy use cases.
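To make the idea of labeling and tagging artifacts more concrete, here is a minimal sketch of how discovery findings could be collapsed into one policy label plus a set of attribute tags for an enforcement point to read. Everything in it is an assumption for illustration: the `Finding` fields, the risk thresholds, the label names, and the `pi:`/`residency:` tag format are hypothetical and are not BigID's actual schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One discovery finding for a data object (assumed structure)."""
    attribute: str   # personal information attribute, e.g. "email"
    residency: str   # country code of the data subject
    risk: int        # hypothetical risk score, 0-100


def build_label(findings):
    """Collapse findings into a single policy label and sorted attribute tags."""
    if not findings:
        return {"label": "Unclassified", "tags": []}

    # Hypothetical thresholds: the highest-risk finding drives the label.
    max_risk = max(f.risk for f in findings)
    if max_risk >= 70:
        label = "Restricted"
    elif max_risk >= 30:
        label = "Confidential"
    else:
        label = "Internal"

    # Tags carry the privacy context that a single label cannot express.
    tags = {f"pi:{f.attribute}" for f in findings}
    tags |= {f"residency:{f.residency}" for f in findings}
    return {"label": label, "tags": sorted(tags)}
```

The split between one coarse label and many fine-grained tags is a design choice in this sketch: the label maps onto existing classification schemas, while the tags extend enforcement to privacy attributes such as residency.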
Labeling for Security & Privacy
Traditional approaches to discovery and classification have focused on finding where and how much highly identifiable data is stored in order to map PCI DSS or GLBA compliance requirements, with DLP especially requiring constant tuning and investment to maintain accuracy for a well-defined set of attributes.
By contrast, BigID is designed to discover all personal data through correlation and machine learning heuristics, generate insights into new data attributes that had no classification assigned and address the fundamental requirement for privacy protection — understanding data residency, data subject or new PI attributes.
When these insights are integrated into DLP enforcement policies, customers extend their existing investments and can take action based on privacy policies. Enforcement models can now integrate important context for privacy protection.
Security and privacy needs are converging. Being able to integrate brains and brawn into a tightly orchestrated model allows customers to extend their ability to take action based on better understanding.