Architecting Your Healthcare Application for HIPAA Compliance
In our last blog entry, we gave some background on the current state of healthcare IT in the United States and presented some examples of startups running on AWS that are creating real change in how patients are receiving their healthcare. As a prerequisite, we recommended that you start your learning about HIPAA architectures on AWS by taking a look at the following articles:
- Architecting for HIPAA Security and Compliance whitepaper
- AWS HIPAA Compliance FAQs for additional details on topics like the Business Associate Agreement (BAA) that AWS offers as well as which services can be used for storing and processing PHI
There is also a video from last year’s AWS re:Invent conference on Architecting for HIPAA Compliance on AWS that is worth a look. Be on the lookout for an upcoming 2015 update to this video.
AWS provides multiple services to deploy a highly available, scalable, secure application stack, which can serve a limitless variety of healthcare applications and use cases. In this blog, we will embark on a journey into HIPAA-eligible architectures by scoping the discussion to the following deployment diagram, which can be adopted as a starting point for building a HIPAA-eligible, web-facing application.
The underlying theme to this architecture is encryption everywhere! To understand it a little better, let’s dissect individual layers in the following sections.
0) Determine if Protected Health Information is Absolutely Necessary
As a preliminary step for building your architecture, you should stop and evaluate if Protected Health Information (PHI) is absolutely necessary for your healthcare application. In many use cases, especially those in analytics or research, often the same results can be obtained with de-identified health information.
The HIPAA Privacy Rule allows for certain types of disclosures of de-identified health information, and use of de-identified data, which is no longer considered PHI, does not require a BAA with AWS.
De-identified health information is defined by Health and Human Services (HHS) as information that “neither identifies nor provides a reasonable basis to identify an individual.” For more information, see the HHS site.
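As a rough illustration of the idea, the following Python sketch drops direct identifiers from a record before it is handed to an analytics job. The field names are hypothetical, and removing a handful of fields is not by itself sufficient for de-identification; a real schema has to be reviewed against the HHS guidance (for example, the full Safe Harbor identifier list or an Expert Determination).

```python
# Hypothetical field names for illustration only. A real schema must be
# mapped against the complete HHS de-identification guidance.
DIRECT_IDENTIFIER_FIELDS = {
    "name",
    "street_address",
    "phone",
    "email",
    "ssn",
    "medical_record_number",
}


def strip_direct_identifiers(record):
    """Return a copy of `record` without the fields listed above."""
    return {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIER_FIELDS
    }


patient = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "diagnosis_code": "E11.9",
    "birth_year": 1975,
}
research_row = strip_direct_identifiers(patient)
print(research_row)
```

The function returns a new dictionary rather than mutating its input, so the original record can stay inside the PHI boundary while only the reduced copy leaves it.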
1) Obtain a Business Associate Agreement with AWS
Once you have determined that storing, processing, or transmitting protected health information (PHI) is absolutely necessary, before moving any of this data to AWS infrastructure you must contact AWS and make sure you have all the necessary contracts and a Business Associate Agreement (BAA) in place. These contracts will serve to clarify and limit, as appropriate, the permissible uses and disclosures of protected health information.
2) Authentication and Authorization
The authentication and authorization mechanisms you define for your HIPAA-eligible system must be documented as part of a System Security Plan (SSP). The plan should document all roles and responsibilities in detail, along with a configuration control process that specifies initiation, approval, change, and acceptance processes for all change requests. Although the details of defining these processes won’t be discussed here, the AWS Identity and Access Management (AWS IAM) service does offer the granular policies required for achieving the necessary controls under HIPAA and HITECH.
As you develop your authentication and authorization procedures, be sure to review and follow the IAM Best Practices. The one rule worth repeating here is to enable multi-factor authentication (MFA) on your AWS root account and lock away the access keys (or better yet, don’t create a key for your root account at all). You should also be using MFA on any IAM account that has significant privileges in your AWS account.
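One way to back the MFA rule with policy is a deny statement keyed on the aws:MultiFactorAuthPresent condition key. The Python sketch below only builds the policy document; the list of sensitive actions is an assumption chosen for illustration, and you would attach the resulting JSON to your own groups or roles.

```python
import json


def deny_without_mfa(actions):
    """Build an IAM policy document that denies `actions` when MFA is absent."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyWithoutMFA",
                "Effect": "Deny",
                "Action": list(actions),
                "Resource": "*",
                # BoolIfExists also matches requests where the key is missing,
                # such as calls signed with long-term access keys.
                "Condition": {
                    "BoolIfExists": {"aws:MultiFactorAuthPresent": "false"}
                },
            }
        ],
    }


# Hypothetical selection of high-impact actions.
policy = deny_without_mfa(
    ["ec2:TerminateInstances", "rds:DeleteDBInstance", "s3:DeleteBucket"]
)
print(json.dumps(policy, indent=2))
```

Because an explicit deny overrides any allow, this statement can sit alongside whatever permissions the user already has.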
Keep in mind that IAM is primarily geared towards the authentication and authorization of your AWS resources and the underlying infrastructure. You will most likely need to establish additional controls for authentication and authorization of your healthcare application.
To help achieve this, you might also want to consider using federation to extend the security controls established in your existing OAuth, SAML 2.0, or Active Directory infrastructure.
3) Web and Application Layers
DNS resolution is relatively straightforward and can be achieved using Amazon Route 53. Just be sure not to use any PHI in the URLs.
Amazon Elastic Load Balancer Configuration
The primary entity that receives the request from Amazon Route 53 is an Internet-facing Elastic Load Balancer. There are multiple ways in which an ELB load balancer can be configured, as explained here. To protect PHI, you must enable only secure communication options, such as HTTPS-based or TCP/SSL-based end-to-end communication. Although you can use TCP/SSL pass-through mode on the ELB load balancer for your web tier requests, this option limits the use of some HTTP/HTTPS-specific features like sticky sessions and X-Forwarded-For headers. For this reason, many startups prefer to make use of HTTPS-based communication on ELB, as shown in the following screenshot.
As shown in the configuration, there’s a single listener configured that accepts HTTPS requests on port 443 and sends requests to back-end instances using HTTPS on port 443. Because HTTPS is used for the front-end connection, you must create the certificate as per your publicly accessible domain name, get the certificate signed by a CA (for an internal load balancer you can use a self-signed certificate as well), and then upload the certificate using AWS IAM, which manages your SSL certificates, as explained in the ELB documentation. This certificate is then utilized to decrypt the HTTPS-based encrypted requests that are received by the ELB load balancer.
To route the requests from the ELB load balancer to the back-end instances, you must use back-end server authentication so that the communication is encrypted throughout. You can enable this by creating a public key policy that uses a public key for authentication. You use this public key policy to create a back-end server authentication policy. Finally, you enable the back-end server authentication by setting the back-end server authentication policy with the back-end server port, which in this case would be 443 for an HTTPS protocol. For an example of how to set this up easily using OpenSSL, check out the ELB documentation and Apache Tomcat’s documentation on certificates.
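The three steps above can be sketched as request parameters for the Classic Load Balancer API operations CreateLoadBalancerPolicy (called twice) and SetLoadBalancerPoliciesForBackendServer. The Python below only assembles the parameters; the load balancer name, the policy names, and the key body are hypothetical placeholders, and the dictionaries would be passed to whichever SDK or CLI you use.

```python
def backend_auth_requests(load_balancer_name, public_key_body, instance_port=443):
    """Return the three request payloads for back-end server authentication."""
    # Step 1: a public key policy that holds the back-end certificate's key.
    public_key_policy = {
        "LoadBalancerName": load_balancer_name,
        "PolicyName": "backend-public-key",  # hypothetical name
        "PolicyTypeName": "PublicKeyPolicyType",
        "PolicyAttributes": [
            {"AttributeName": "PublicKey", "AttributeValue": public_key_body}
        ],
    }
    # Step 2: an authentication policy that refers to the public key policy.
    auth_policy = {
        "LoadBalancerName": load_balancer_name,
        "PolicyName": "backend-auth",  # hypothetical name
        "PolicyTypeName": "BackendServerAuthenticationPolicyType",
        "PolicyAttributes": [
            {
                "AttributeName": "PublicKeyPolicyName",
                "AttributeValue": public_key_policy["PolicyName"],
            }
        ],
    }
    # Step 3: bind the authentication policy to the back-end instance port.
    attach = {
        "LoadBalancerName": load_balancer_name,
        "InstancePort": instance_port,
        "PolicyNames": [auth_policy["PolicyName"]],
    }
    return public_key_policy, auth_policy, attach


requests = backend_auth_requests("example-web-elb", "<public key body>")
for request in requests:
    print(request)
```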
Many of our customers make use of an extra layer of security (like web application firewalls and intrusion detection/prevention solutions) in front of their web layer to avoid any potential malicious attacks to their sensitive applications. There are multiple options available in the AWS Marketplace to provision tools like WAF/IDS/IPS, etc. So you could start from there instead of setting it up from scratch on an EC2 instance.
The next layer is the web tier, which could be auto-scaled for high availability and placed behind an internal ELB load balancer with only an HTTPS listener configured. To further secure the access to web servers, you should open up your web server instances’ security group to accept requests only from the designated load balancer, as shown in the following diagram.
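A minimal sketch of such a rule, written as the parameters for the EC2 AuthorizeSecurityGroupIngress operation, follows. The security group IDs are hypothetical; the point is that the source is the load balancer's security group rather than a CIDR range.

```python
def allow_https_from_load_balancer(web_sg_id, elb_sg_id):
    """Ingress rule: HTTPS into the web tier only from the load balancer's group."""
    return {
        "GroupId": web_sg_id,
        "IpPermissions": [
            {
                "IpProtocol": "tcp",
                "FromPort": 443,
                "ToPort": 443,
                # Referencing a security group instead of an IP range means
                # only members of that group can reach the web servers.
                "UserIdGroupPairs": [{"GroupId": elb_sg_id}],
            }
        ],
    }


# Hypothetical security group IDs.
rule = allow_https_from_load_balancer("sg-0webtier", "sg-0loadbalancer")
print(rule)
```

The same shape can be reused between the application tier and its internal load balancer by swapping the two group IDs.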
Encryption of traffic between the web layer and app layer will look similar to the setup in the preceding diagram. Again, there will be an internal ELB load balancer with HTTPS listener configured. On the application servers, SSL certificates are set up to keep the communication channel encrypted end-to-end.
Both the app and web layers should also be in private subnets with auto-scaling enabled to ensure a highly responsive and stable healthcare application.
4) Database Layer
The easiest way to get started with database encryption is to make use of Amazon RDS (MySQL or Oracle engine). To protect your sensitive PHI data, you should consider the following best practices for Amazon RDS:
- You should have access to the database enabled only from the application tier (using appropriate security group/NACL rules).
- Any data that has the potential to contain PHI should always be encrypted by enabling the encryption option for your Amazon RDS DB instance, as shown in the following screenshot. Data that is encrypted at rest includes the underlying storage for a DB instance, its automated backups, read replicas, and snapshots.
- For encryption of data in-transit, MySQL provides a mechanism to communicate with the DB instance over an SSL channel, as described here. Likewise, for Oracle RDS you can configure Oracle Native Network Encryption to encrypt the data as it moves to and from a DB instance.
- For encryption of data at rest, you could also make use of Oracle’s Transparent Data Encryption (TDE) by adding the appropriate option to the option group associated with the RDS instance. With this, you can enable both TDE tablespace encryption (encrypts entire application tables) and TDE column encryption (encrypts individual data elements that contain sensitive data) to protect your PHI. You could also store the Amazon RDS Oracle TDE keys by leveraging AWS CloudHSM, a service that provides dedicated Hardware Security Module (HSM) appliances within the AWS cloud. More details on this integration are available here.
For additional discussion on Amazon RDS encryption mechanisms, please refer back to the whitepaper.
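The first two practices in the list above lend themselves to an automated check. The Python sketch below assumes you have already fetched instance descriptions in the shape returned by the RDS DescribeDBInstances operation (the StorageEncrypted and PubliclyAccessible fields) and flags any instance that is unencrypted or publicly reachable. The sample data is invented.

```python
def find_noncompliant_instances(db_instances):
    """Return (identifier, reasons) pairs for instances that fail either check."""
    findings = []
    for instance in db_instances:
        reasons = []
        if not instance.get("StorageEncrypted", False):
            reasons.append("storage is not encrypted at rest")
        if instance.get("PubliclyAccessible", False):
            reasons.append("instance is publicly accessible")
        if reasons:
            findings.append((instance["DBInstanceIdentifier"], reasons))
    return findings


# Invented sample descriptions for illustration.
sample = [
    {"DBInstanceIdentifier": "phi-db", "StorageEncrypted": True,
     "PubliclyAccessible": False},
    {"DBInstanceIdentifier": "legacy-db", "StorageEncrypted": False,
     "PubliclyAccessible": True},
]
findings = find_noncompliant_instances(sample)
for identifier, reasons in findings:
    print(identifier, "->", "; ".join(reasons))
```

A missing StorageEncrypted field is treated as unencrypted, so the check fails closed.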
5) Backup and Restore
To protect your patient data, you should be vigilant about your backup and restore processes. Most AWS services have mechanisms in place to perform backup so that you can revert to a last known stable state if any changes need to be backed out. For example, features like EC2 AMI creation or snapshotting (as in the Amazon EBS, Amazon RDS, and Amazon Redshift services) should be able to meet the majority of backup requirements.
You can also make use of third-party backup tools, which integrate with Amazon S3 and Amazon Glacier to manage secure, scalable, and durable copies of your data. When using Amazon S3, you have multiple ways to encrypt your data at rest and can leverage both client-side encryption and server-side encryption mechanisms. Details on these options are available in the Amazon S3 documentation.
PHI in S3 buckets should always be encrypted. You can also enforce the server-side encryption (SSE) option on any of the buckets by adding the following condition to your Amazon S3 bucket policy:
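A minimal sketch of such a policy, assuming SSE-S3 with the AES256 algorithm and a hypothetical bucket named example-phi-backups, is built below in Python. It denies any PutObject request whose x-amz-server-side-encryption header is missing or names a different algorithm.

```python
import json


def require_sse_policy(bucket_name, algorithm="AES256"):
    """Bucket policy that rejects uploads lacking the expected SSE header."""
    resource = f"arn:aws:s3:::{bucket_name}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyWrongEncryptionHeader",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": resource,
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": algorithm
                    }
                },
            },
            {
                "Sid": "DenyMissingEncryptionHeader",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": resource,
                # The Null operator matches requests with no header at all.
                "Condition": {
                    "Null": {"s3:x-amz-server-side-encryption": "true"}
                },
            },
        ],
    }


sse_policy = require_sse_policy("example-phi-backups")
print(json.dumps(sse_policy, indent=2))
```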
For security of data in transit, you should always use Secure Sockets Layer (SSL) enabled endpoints for all the services, including Amazon S3 for backups. If you are enabling backup of your data from the EC2 instances in a VPC to Amazon S3, then you could also make use of VPC endpoints for Amazon S3. This feature creates a private connection between your private VPC and Amazon S3 without requiring access over the Internet or a NAT/proxy device.
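For Amazon S3, the SSL requirement can also be enforced on the bucket itself rather than left to each client. The sketch below builds a deny statement keyed on the aws:SecureTransport condition key; the bucket name is again a hypothetical placeholder.

```python
import json


def deny_insecure_transport(bucket_name):
    """Bucket policy statement that rejects any request not made over SSL/TLS."""
    return {
        "Sid": "DenyInsecureTransport",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        # Cover both bucket-level and object-level operations.
        "Resource": [
            f"arn:aws:s3:::{bucket_name}",
            f"arn:aws:s3:::{bucket_name}/*",
        ],
        "Condition": {"Bool": {"aws:SecureTransport": "false"}},
    }


tls_statement = deny_insecure_transport("example-phi-backups")
print(json.dumps(tls_statement, indent=2))
```

This statement can be appended to the Statement list of an existing bucket policy.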
6) EC2 and EBS Requirements
Amazon EC2 is a scalable, user-configurable compute service that supports multiple methods for encrypting data at rest, ranging from application-level or field-level encryption of PHI as it is processed, to transparent data-encryption features of commercial databases, to the use of third-party tools. For a more complete discussion of the options, see the whitepaper.
In the next example, we show you a simple approach to architecting HIPAA-eligible web servers.
First, you must be sure that your EC2 instance is running on hardware that is dedicated to a single customer by using a dedicated instance. You can do this by setting the tenancy attribute to “dedicated” on either the Amazon VPC that the instance is launched in, the Auto-Scaling Launch Configuration, or on the instance itself, as shown in the following screenshot.
Because Amazon Elastic Block Store (Amazon EBS) storage encryption is consistent with HIPAA guidance at the time of this blog writing, the easiest way to fulfill the at-rest encryption requirement is to choose an EC2 instance type that supports Amazon EBS encryption, and then add the encrypted EBS volume to your instance. (See the EBS link for a list of instance types.)
You should keep all of your sensitive PHI data on the encrypted EBS volumes, and be sure never to place PHI on the unencrypted root volume.
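The tenancy and volume settings described above can be sketched as parameters for the EC2 RunInstances operation. The AMI ID, instance type, device name, and volume size in this Python example are hypothetical; check the EBS documentation for instance types that support encrypted volumes.

```python
def hipaa_web_server_launch_params(image_id, instance_type, data_volume_gib=100):
    """RunInstances parameters: dedicated tenancy plus an encrypted data volume."""
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        # Run on hardware dedicated to a single customer.
        "Placement": {"Tenancy": "dedicated"},
        "BlockDeviceMappings": [
            {
                # Separate data volume for PHI; the root volume is left alone.
                "DeviceName": "/dev/sdf",
                "Ebs": {
                    "VolumeSize": data_volume_gib,
                    "Encrypted": True,
                    "DeleteOnTermination": False,
                },
            }
        ],
    }


# Hypothetical AMI ID and instance type.
launch_params = hipaa_web_server_launch_params("ami-00000000", "m3.medium")
print(launch_params)
```

Setting DeleteOnTermination to False is a design choice in this sketch so that the encrypted data volume outlives an individual web server.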
You might want to take some additional precautions to ensure that the unencrypted volume does not get used for PHI. For example, you can consider a partner solution from the AWS Marketplace, which offers full-drive encryption to help you feel more at ease. This will help to ensure that if there ever is a program (such as a TCP core dump) that uses the root drive as temporary storage or scratch space without your knowledge, it will be encrypted. Other startups have developed their own techniques for securing the root volume by using Logical Volume Management (LVM) to repartition the volume into encrypted segments and to make other portions read-only.
7) Key Management
At every turn in this architecture, we have mentioned encryption. Ensuring end-to-end encryption of our PHI is an essential component of keeping our data secure. Encryption in flight protects you from eavesdroppers, and encryption at rest protects the data if someone gains access to the physical storage devices. However, at some point we do need to decrypt this PHI in order to use it in our application. This is where key management becomes a “key” piece of the implementation (pun intended).
AWS does not place limitations on how you choose to store or manage your keys. Essentially, there are four general approaches to key management on AWS:
- Do it yourself
- Partner solutions
- AWS CloudHSM
- AWS KMS
A full discussion (or even a good starting discussion) on key management far exceeds what we can provide in a single blog entry, so we will just provide some general advice about key management as it relates to HIPAA.
The first piece of advice is that you should strongly consider the built-in AWS options. All of the checkbox encryption methods (such as Amazon S3 server-side encryption, Amazon EBS encrypted volumes, Amazon Redshift encryption, and Amazon RDS encryption) make it very easy to keep your PHI encrypted, and you should explore these options to see if these tools meet your BAA requirements and HHS guidance. These methods automate or abstract many of the tasks necessary for good key maintenance, such as multifactor encryption and regular key rotation. AWS handles the heavy lifting and ensures that your encryption methods are using one of the strongest block ciphers available.
If you need to create a separation of duties between staff who maintain the keys and developers who work with the keys, or if you would simply like additional control of your keys and want to be able to easily create, control, rotate, and use your encryption keys, then you should look at using the AWS Key Management Service (AWS KMS). This service is integrated with the AWS SDKs and with other AWS services like AWS CloudTrail, which can provide auditable logs to help meet your HIPAA compliance requirements.
If you need additional controls beyond what is provided by AWS, you should be sure that you have proper security experts who can ensure the safe management of your encryption keys. Remember, a lost key could render your entire dataset useless, and AWS Support will have no way to recover it for you.
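To make the separation of duties concrete, the following Python sketch builds a KMS key policy with two statements: one for a key administrator role that can manage the key but not use it, and one for an application role that can use the key but not manage it. The account ID and role names are hypothetical, and the action lists are an illustrative split rather than a complete or recommended policy. In practice you must also make sure some principal keeps the ability to change the key policy.

```python
import json

# Hypothetical principals.
KEY_ADMIN_ARN = "arn:aws:iam::111122223333:role/KeyAdministrator"
APP_ROLE_ARN = "arn:aws:iam::111122223333:role/HealthcareApp"

ADMIN_ACTIONS = [
    "kms:Create*", "kms:Describe*", "kms:Enable*", "kms:List*",
    "kms:Put*", "kms:Update*", "kms:Revoke*", "kms:Disable*",
    "kms:Get*", "kms:Delete*",
]
USAGE_ACTIONS = [
    "kms:Encrypt", "kms:Decrypt", "kms:ReEncrypt*",
    "kms:GenerateDataKey*", "kms:DescribeKey",
]


def separated_key_policy(admin_arn, user_arn):
    """Key policy in which administration and cryptographic use are split."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowKeyAdministration",
                "Effect": "Allow",
                "Principal": {"AWS": admin_arn},
                "Action": ADMIN_ACTIONS,
                "Resource": "*",
            },
            {
                "Sid": "AllowKeyUse",
                "Effect": "Allow",
                "Principal": {"AWS": user_arn},
                "Action": USAGE_ACTIONS,
                "Resource": "*",
            },
        ],
    }


key_policy = separated_key_policy(KEY_ADMIN_ARN, APP_ROLE_ARN)
print(json.dumps(key_policy, indent=2))
```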
8) Logging and Monitoring
Logging and monitoring of system access will play a starring role in your HIPAA-eligible architecture. The goal is to put auditing in place so that security analysts can examine detailed activity logs or reports to see who had access, from which IP address, what data was accessed, and so on. The data should be tracked, logged, and stored in a central location for extended periods of time in case of an audit.
At the AWS account level, be sure to launch AWS CloudTrail and immediately start recording all AWS API calls. You should also launch AWS Config, which will provide you with an AWS resource inventory, configuration history, and configuration change notifications.
You will also need to monitor and maintain the logs of your AWS resources to keep a record of system access to PHI and to run analytics that could serve as part of your HIPAA Security Risk Assessment. One way to do this is with Amazon CloudWatch, a monitoring service that you can use to collect server logs from your EC2 instances as well as logs from the Amazon RDS DB instance, Amazon EBS volumes, and the ELB load balancer. You can even develop custom metrics to obtain the necessary log information from your own applications.
CloudWatch has other useful features:
- View graphs and statistics on the console
- Set up alarms to automatically notify you of abnormal system behavior
- Capture network traffic in a single repository through the integration of CloudWatch with VPC Flow Logs
With all these logging mechanisms, you want to be sure that no PHI is actually stored in the logs. This usually requires some special attention. For example, sometimes you might need to encrypt PHI in your custom metric before sending it to Amazon CloudWatch. You also should be aware of everything that is coming into the logs. For example, the combination of session user and IP address coming from the ELB logs is considered PHI in some situations, so you should catch these special circumstances to be sure PHI is fully scrubbed from the logs.
Finally, Amazon S3 is a fantastic repository for all these logs. However, take extra precautions to lock down the permissions for log access of these highly sensitive data sets. You might want to consider some more stringent access requirements, such as requiring multi-factor authentication to read the logs, turning on versioning to retain any logs that get deleted, or even setting up cross-region replication to keep a second copy of the logs in a different region, optionally in an entirely different AWS account.
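For application logs, one place to do this scrubbing is inside the logging pipeline itself, before a record ever leaves the process. The Python sketch below uses a standard library logging filter to redact values that match a couple of patterns. The patterns (US Social Security numbers and email addresses) are illustrative assumptions and nowhere near an exhaustive PHI detector.

```python
import logging
import re

# Illustrative patterns only; a real deployment needs a reviewed list.
REDACTION_PATTERNS = [
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[REDACTED-SSN]"),
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"), "[REDACTED-EMAIL]"),
]


def scrub(text):
    """Replace every match of the patterns above with a placeholder."""
    for pattern, replacement in REDACTION_PATTERNS:
        text = pattern.sub(replacement, text)
    return text


class PhiScrubbingFilter(logging.Filter):
    """Redact the fully formatted message before handlers emit it."""

    def filter(self, record):
        record.msg = scrub(record.getMessage())
        record.args = ()  # arguments are already merged into msg
        return True


logger = logging.getLogger("healthcare_app")  # hypothetical logger name
handler = logging.StreamHandler()
handler.addFilter(PhiScrubbingFilter())
logger.addHandler(handler)

logger.warning("lookup failed for %s (ssn %s)", "jane@example.com", "123-45-6789")
```

Attaching the filter to the handler rather than to individual loggers means every record that reaches that handler is scrubbed, including records from child loggers.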
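As a sketch of the first two precautions, the Python below builds a bucket policy statement that denies reads of log objects unless the request was authenticated with MFA, along with the versioning configuration for the same bucket. The bucket name is hypothetical, and the versioning dictionary follows the shape of the S3 PutBucketVersioning request.

```python
import json


def require_mfa_to_read(bucket_name):
    """Deny GetObject on the log bucket when the request carries no MFA context."""
    return {
        "Sid": "DenyReadWithoutMFA",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": f"arn:aws:s3:::{bucket_name}/*",
        # aws:MultiFactorAuthAge is absent when MFA was not used, so the
        # Null operator evaluates to true for those requests.
        "Condition": {"Null": {"aws:MultiFactorAuthAge": "true"}},
    }


# Hypothetical bucket name.
LOG_BUCKET = "example-phi-audit-logs"

mfa_statement = require_mfa_to_read(LOG_BUCKET)
versioning_request = {
    "Bucket": LOG_BUCKET,
    "VersioningConfiguration": {"Status": "Enabled"},
}

print(json.dumps(mfa_statement, indent=2))
print(json.dumps(versioning_request, indent=2))
```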
The Starting Line
Many blog posts end with a nice wrap-up of what you have achieved, but a HIPAA-eligible architecture is really just a starting point. Maintaining proper HIPAA compliance is an ongoing job that is never really finished. We hope that this post has shed more light on the subject and perhaps given you the inspiration and knowledge needed so you can use the tools provided by AWS to research and design your own HIPAA architecture. Major medical centers are currently undergoing the critical tasks of upgrading their legacy software and computer systems to cloud software, and we hope your application on AWS infrastructure can be among the new generation of healthcare IT.
This post represents knowledge compiled from many technically savvy architects as well as several security and compliance experts here at AWS. There are far too many to mention here, but we would in particular like to thank Dave Veith, Bill Shinn, and Brandon Yost for their direct contributions to the material in this post.
Christopher Crosbie MPH, MS
Healthcare and Life Science Solutions Architect
Partner Solutions Architect