Data Architecture on cloud with HIPAA Compliance

Published in Epsilon Engineering Blog, Feb 13, 2024

By Madhusudhana G K

Anyone working in healthcare understands the importance of HIPAA compliance. It is one of the most prominent regulatory requirements in today’s healthcare industry. Enterprises are cautious about shaping their IT landscape to meet compliance while enabling new business models that rely on consumer data. This challenges IT teams to define solutions that meet current and future scalability needs, consume historical data, support new business models, and safeguard data governance and data security.

Architecting and designing a large-scale healthcare application with HIPAA compliance is among the most complex efforts I have seen in recent times. For each service and piece of data in the business model, we must be mindful of how we receive and orchestrate the data and how the design meets the security requirements: accepting secured inbound data, persisting and storing it, securing it, and exporting it back (outbound) to the business and consumers. Many checks and balances need to be addressed across architecture, design, migration, and service components.

In a typical problem statement, source data comes from various systems (external, internal, and third party) in various data formats. The data may be structured, semi-structured, or unstructured. Most of the time, source data has relationships that span multiple source systems and external systems. It arrives in formats such as CSV, Excel, flat files, DB dumps, log files, and cloud storage files. File sizes vary from a few KB to 500+ GB. Data moves in and out of the system via FTP, data files shared through a temporary data store, pushed data files (zipped or individual), and so on.

Under HIPAA principles, the base rule for handling such data is to encrypt it in transit and at rest. Decryption is allowed only via customer-shared private keys. Data security is applied at various architecture layers, such as infrastructure, user access, endpoints, load balancers, ETL processes, firewalls, network, physical data storage, data volumes, data backup volumes, and DR.
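The encrypt-in-transit and encrypt-at-rest rule can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not a prescribed implementation: the TLS 1.2 minimum is an assumed policy choice, and the `DataStore` descriptor, its fields, and the `unencrypted_stores` helper are hypothetical names introduced here only to show how each storage layer could be checked.

```python
import ssl
from dataclasses import dataclass


def make_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context for data in transit.

    The default context verifies server certificates and hostnames.
    The TLS 1.2 floor is an assumed policy, not a value from the article.
    """
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


@dataclass(frozen=True)
class DataStore:
    """Hypothetical descriptor of one storage layer (volume, backup, DR copy)."""
    name: str
    encrypted_at_rest: bool
    key_source: str  # illustrative values: "customer" or "provider-default"


def unencrypted_stores(stores):
    """Return the names of storage layers that are not encrypted at rest."""
    return [s.name for s in stores if not s.encrypted_at_rest]


if __name__ == "__main__":
    stores = [
        DataStore("warehouse-volume", True, "customer"),
        DataStore("backup-volume", False, "provider-default"),
    ]
    print(unencrypted_stores(stores))
```

A check like this would typically run as part of a deployment review, so that a newly added backup or DR volume without encryption is flagged before it receives data.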

We have seen architects define complex on-prem architectures of this kind that are truly scalable, and a few enterprises still use them today. In our experience, this requires enormous implementation effort, increases infrastructure design and maintenance cost, and makes adapting to future scalability needs challenging. In contrast, today’s cloud solutions offer much more robust, scalable, and future-adaptable options.

The following are some of the key solution considerations for HIPAA-based data architectures. Each of these sub-topics leads to larger architectural and design considerations.

· Cloud services (e.g., AWS: Redshift, Glue, DataBrew, S3, EC2, Firewall, Security groups, Ports, IDC, Lambda, KMS …)

· Cloud Identity and Access management (Roles and Permissions)

· Role-based data access around the data warehouse (schemas and tables)

· Cloud account security, safeguarding against security incidents

· File input and output (Encryption and Decryption services)

· Data security (at rest and in motion)

· Data persistence and data transfer (Guarding PII and Non-PII data and access restrictions)

· Secured Data Migration (Source vs. Target cloud solutions)

· Data backup and retention policies

· Data encryption keys (Default vs. Customer provided)

· Data file integrity

· Data enclave services

· Infra systems and services vulnerability scan, report, and fix

· Managing security keys (Vaults)

· Securing log services

· Source code security (ETL, Migration, Custom design)
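As one illustration of the data file integrity item above, here is a minimal Python sketch, assuming the sender publishes a SHA-256 digest alongside each file. That convention, the 8 MB chunk size, and the function names `sha256_of_file` and `verify_file` are assumptions made for this example. The file is read in chunks so that very large inbound files (the 500+ GB case) never need to fit in memory.

```python
import hashlib
import hmac

# Assumed chunk size; tune for the storage and network in use.
CHUNK_SIZE = 8 * 1024 * 1024


def sha256_of_file(path, chunk_size=CHUNK_SIZE) -> str:
    """Compute the SHA-256 hex digest of a file by streaming it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path, expected_hex: str) -> bool:
    """Return True if the file's digest matches the digest supplied by the sender."""
    actual = sha256_of_file(path)
    # Constant-time comparison of the two hex strings.
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

In a pipeline, a check like `verify_file(inbound_path, published_digest)` would run before decryption or loading, and a failed check would route the file to quarantine rather than into the warehouse.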

Written by Epsilon India