We are in the midst of a tectonic shift to the cloud. An estimated 69% of enterprises are in the process of moving mission-critical applications and data to the public cloud, and that percentage is likely to grow. Cloud gives companies the flexibility to expand and contract infrastructure and compute as business demand ebbs and flows.

But enterprises have their concerns, as they should. Specifically, the same survey cited above also found that the top concerns among enterprises moving to the cloud are moving sensitive data (65%), security (59%), and compliance challenges (54%).

AWS continues to be the dominant player in the public cloud. Forbes estimated that AWS holds close to 50% of the $32 billion public cloud market. Since launching in 2006, AWS has grown from offering infrastructure and storage as a service to providing full-featured services including data management and analytics. …

So, we have yet another breach on our hands. A Seattle woman, Paige Thompson, was charged by the FBI last week with intentionally stealing data from Capital One. The data includes a large number of credit card applications, with information from about 106 million customers. By contrast, the Equifax breach exposed personal information of 147 million people.

What happened?

We still don’t have a complete picture of just how Thompson managed to breach Capital One’s data, but the FBI filing and dozens of blogs provide some information. We know that Thompson was able to gain access to servers in Amazon’s public cloud, AWS, being used by Capital One. She apparently took advantage of a misconfiguration in a web application firewall. Once inside, Thompson apparently was able to retrieve identity and access management (IAM) credentials and assume a role on the server which had broad permissions for certain S3 buckets containing the sensitive data. …
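The last step described above, a role with broad permissions on S3 buckets, is the kind of thing a simple policy review can surface. Below is a minimal sketch, assuming a made-up IAM policy document (`EXAMPLE_POLICY`) and a hypothetical helper (`broad_s3_statements`); neither comes from the FBI filing or from Capital One's actual configuration. It only flags `Allow` statements that grant S3 actions on a wildcard resource.

```python
import json

# Hypothetical IAM policy document, for illustration only; it is NOT the
# policy involved in the incident described above.
EXAMPLE_POLICY = json.loads("""
{
  "Version": "2012-10-17",
  "Statement": [
    {"Effect": "Allow", "Action": "s3:*", "Resource": "*"},
    {"Effect": "Allow", "Action": ["s3:GetObject"],
     "Resource": "arn:aws:s3:::example-app-logs/*"}
  ]
}
""")


def as_list(value):
    """IAM accepts a single value or a list in these fields; normalize to a list."""
    return value if isinstance(value, list) else [value]


def broad_s3_statements(policy):
    """Return Allow statements that grant S3 actions on a wildcard resource."""
    flagged = []
    for statement in as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        actions = as_list(statement.get("Action", []))
        resources = as_list(statement.get("Resource", []))
        # The statement touches S3 if it names an s3: action or allows everything.
        touches_s3 = any(a == "*" or a.lower().startswith("s3:") for a in actions)
        # A wildcard resource covers every bucket, not just the ones the app needs.
        wildcard_resource = any(r in ("*", "arn:aws:s3:::*") for r in resources)
        if touches_s3 and wildcard_resource:
            flagged.append(statement)
    return flagged


flagged = broad_s3_statements(EXAMPLE_POLICY)
print(len(flagged))
```

A check like this is deliberately narrow: it ignores conditions, `NotAction`, and permissions granted through other policies, so it is a starting point for review rather than a complete audit.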


We are excited to be back at the DataWorks Summit being held this week in Barcelona, Spain. DataWorks Summit brings together practitioners from big data across Europe for a workshop and a 2-day conference focused on analytics, security, governance and updates from the community.

Over the years, DataWorks conferences have been a place where the community comes together to share best practices and lessons learned from using big data. It is great to hear a broad range of views, in a way that only an open source community can offer.

This year, our co-founder, Don Bosco Durai, is presenting a talk on leveraging Apache Ranger to solve security and governance challenges across hybrid environments.

If you use big data in your company and you are contemplating the cloud, this session could be useful for you. Feel free to reach out to us if you would like to meet during the conference.


When I co-founded XA Secure in 2012 with Don Bosco Durai and Selvamohan Neethiraj, we had a vision to bring centralized security to the Hadoop ecosystem, which consisted of many open source projects with little or no security. XA Secure was acquired by Hortonworks in 2014 and the product was introduced to open source as Apache Ranger.

Fast forward to today: Apache Ranger is a thriving project with over a million lines of code, 26 committers and close to 50 contributors. Apache Ranger is used by a large number of enterprises globally to secure their big data deployments.

Privacera was founded with the vision to enable companies to secure and govern their data layer across any database, platform, on-premises or in the cloud. A core part of enabling security is ensuring data is protected at rest and users have access to data only when they need to. Apache Ranger is a core part of the Privacera story and we are excited to play a small part in enhancing the work the Apache Ranger community has already done. …


It has been a while since we posted a blog in this series. This is the last part of the managing risks in big data series. In the previous blog, we covered how companies can anonymize data during ingest. Once access control and anonymization methods are enabled, companies need to build tools and processes to get visibility into what users are doing with their data.

Why monitoring?

Consider these scenarios, which are typical in big data.

Let us consider a bank whose fraud analytics team leverages data to find consumer fraud or criminal activities such as money laundering. The fraud team typically has access to all the customer data. If the data is anonymized, the fraud team uses special functions to de-anonymize it, essentially getting the actual customer data and using that for their analytics. The challenge for the compliance team becomes tracking what internal users do with the customer data once they have access to it. As …


DataWorks Summit, San Jose 2018 is coming up next week. It is a premier event for companies using big data and vendors in this space. Since this is a local event for Privacera US team members, we are looking forward to spending quality time at the conference rather than worrying about travel.

Looking at the agenda, one can infer that the big data industry is maturing and moving to the next set of challenges around analytics, AI/machine learning and operating in hybrid data environments.

Privacera team members will be speaking at the event as well. Our CTO, Don Bosco Durai, will be speaking on leveraging Apache Ranger to protect data in hybrid environments. He will cover the work we have done in the Apache Ranger community to extend Ranger to support AWS S3. The model is extensible and we are looking to extend Apache Ranger to other non-Hadoop data…


We are excited to be back at the DataWorks Summit being held in Berlin between April 16–19, 2018. DataWorks Summit brings together practitioners from big data across Europe for a 2-day workshop and a 2-day conference focused on analytics, security, governance and updates from the community. I have always found it enriching to hear real-life case studies on how companies across industries are leveraging big data to bring business value, the challenges they faced along the way and what they learned from them. Technology is not the only variable in the big data journey; people and process play an equal part. …


In the last blog in this series, we covered how teams can control sensitive data through fine-grained access control policies. Beyond access control, one of the trusted methods to reduce sensitive data exposure is anonymization.

Data anonymization is the process of removing personally identifiable information from data. There are different methods for anonymization adopted over the years. For simplicity, we are going to cover the 2 main methods used in the industry currently.

  • Encryption: Regulations such as HIPAA and PCI suggest data be encrypted using a key and an algorithm. Encryption is simply a process of converting data into another form that cannot be read without the key. …


We are excited to co-host a webinar on managing sensitive data within big data environments, along with our partner Hortonworks. The webinar will cover 4 essential steps drawn from our experience working with best-of-breed companies running data lakes.

The webinar will take place at 11 a.m. Pacific Time on January 24th. You can register using this link.

We hope you can join us for an interactive discussion. If you have any specific questions that you would like us to address during the webinar, do send us a note or leave a comment here.


Balaji Ganesan

Co-Founder, CEO at Privacera
