FHIR Data Ingestion Using GCP’s Cloud Healthcare API (Part 1)

Sudharma Mokashi
Google Cloud - Community
Dec 26, 2022

In this article, we explore how FHIR data can be ingested and analysed on GCP using the Cloud Healthcare API and BigQuery.

Part 1 covers how FHIR data can be ingested into a FHIR store on GCP and analysed using BigQuery’s analytics capabilities.

Part 2 will cover how to de-identify FHIR data.

Part 3 will cover how DICOM datasets can be ingested into GCP and de-identified later.

For those who are new to FHIR (Fast Healthcare Interoperability Resources), it is a data standard that the healthcare industry follows to make healthcare data easier to exchange between systems. Learn more about FHIR.

This article covers the following topics:

  • Cloud Healthcare API
  • Ingestion of FHIR data into GCP’s FHIR store
  • Export FHIR data to BigQuery

What is the Cloud Healthcare API?

The Cloud Healthcare API is a mediator between healthcare systems and applications built on Google Cloud Platform. Using the Healthcare API, you can connect your data to advanced Google Cloud capabilities, including streaming data processing with Cloud Dataflow, scalable analytics with BigQuery, and machine learning with Cloud Machine Learning Engine.

FHIR Data ingestion into GCP using Healthcare API

Google Cloud provides detailed guidance regarding how it supports compliance with HIPAA in the US, the PIPEDA in Canada, and other global privacy standards at cloud.google.com/security/compliance.

The Cloud Healthcare API treats data location as a core component of the API. You have the option to select the storage location for each dataset from a list of currently available locations which correspond to distinct geographic areas aligned with Google Cloud’s regional structure.

Let’s dive into a quick demo of FHIR data ingestion into Google Cloud Platform using the Cloud Healthcare API. For this demo, we use FHIR data stored in a GCS bucket in .ndjson (newline-delimited JSON) format.
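
To make the .ndjson format concrete, here is a minimal sketch that writes two made-up Patient resources, one JSON object per line. The file name, the patient values, and the bucket name in the commented upload line are all assumptions for illustration, not the data used in this demo.

```shell
# Hypothetical file name for the sample data.
NDJSON_FILE="patients.ndjson"

# Newline-delimited JSON: every line is one complete FHIR resource.
# These two Patient resources are invented sample values.
cat > "$NDJSON_FILE" <<'EOF'
{"resourceType":"Patient","id":"example-1","gender":"female","birthDate":"1980-01-01"}
{"resourceType":"Patient","id":"example-2","gender":"male","birthDate":"1975-06-15"}
EOF

# The line count equals the number of resources in the file.
RESOURCE_COUNT=$(wc -l < "$NDJSON_FILE" | tr -d ' ')
echo "$RESOURCE_COUNT resources in $NDJSON_FILE"

# Upload to a bucket you own (hypothetical bucket name), for example:
# gsutil cp "$NDJSON_FILE" gs://my-fhir-bucket/fhir/
```

Because each line stands on its own, a file like this can be imported with the one-resource-per-line content structure later in the walkthrough.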

Steps to ingest FHIR data into GCP

1. Enable the Healthcare API. Follow the link and click Enable.
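
If you prefer Cloud Shell to the console, the same step can be sketched with gcloud. The function name is hypothetical; it simply wraps the command so you can review it before running it.

```shell
# Hypothetical helper: enables the Cloud Healthcare API for one project.
enable_healthcare_api() {
  # $1: project ID
  gcloud services enable healthcare.googleapis.com --project="$1"
}

# Example call (uncomment to run against your current project):
# enable_healthcare_api "$(gcloud config get-value project)"
```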

2. Create a BigQuery dataset to store the exported data for analysis. Go to the BigQuery Console and create the dataset.

BigQuery dataset creation
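
As an alternative to the console, a dataset can also be created with the bq command-line tool. This is a sketch; the function name, the dataset ID, and the US location in the example call are assumptions, so substitute your own values.

```shell
# Hypothetical helper: creates a BigQuery dataset to receive the exported FHIR data.
create_bq_dataset() {
  # $1: project ID, $2: BigQuery dataset ID, $3: dataset location (for example US)
  bq --location="$3" mk --dataset "$1:$2"
}

# Example call (uncomment to run; the dataset ID is made up):
# create_bq_dataset "$(gcloud config get-value project)" fhir_analytics US
```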

3. Run the following commands in Cloud Shell to grant the required IAM roles to the Healthcare API’s service account.

# Look up the current project ID and its project number.
export PROJECT_ID=$(gcloud config list --format 'value(core.project)')
export PROJECT_NUMBER=$(gcloud projects list --filter="projectId:$PROJECT_ID" \
--format="value(projectNumber)")
# Allow the Healthcare service account to write tables in BigQuery.
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
--member="serviceAccount:service-$PROJECT_NUMBER@gcp-sa-healthcare.iam.gserviceaccount.com" \
--role=roles/bigquery.dataEditor
# Allow the Healthcare service account to run BigQuery jobs.
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
--member="serviceAccount:service-$PROJECT_NUMBER@gcp-sa-healthcare.iam.gserviceaccount.com" \
--role=roles/bigquery.jobUser

4. Create a healthcare dataset: search for Healthcare in the navigation menu and click Create Dataset.

Navigate to Healthcare section

Set the following properties for the healthcare dataset:

Set the healthcare dataset properties
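
The console form above has a command-line counterpart. Here is a sketch, assuming a made-up dataset ID and the us-central1 location; the function name is hypothetical as well.

```shell
# Hypothetical helper: creates a Cloud Healthcare dataset in a given location.
create_healthcare_dataset() {
  # $1: location (for example us-central1), $2: healthcare dataset ID
  gcloud healthcare datasets create "$2" --location="$1"
}

# Example call (uncomment to run; both values are made up):
# create_healthcare_dataset us-central1 demo-dataset
```

The location chosen here is where the data is stored, so pick the one that matches your data residency needs.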

5. Dataset creation takes some time. Once the dataset is created, you will see it in the Healthcare browser.

Healthcare Browser

6. Click on the recently created dataset to create a FHIR store.

Create Data Store

Set the following properties for the data store:

Data Store Type and ID

Select the FHIR store version and click Next.

select FHIR store version

Make no changes in the BigQuery stream section. Move on to the Receive Cloud Pub/Sub notifications section and select the Create A Topic option.

Create Pub/Sub topic for receiving notifications

Provide a topic ID and an encryption mechanism, then click Create Topic.

Create the topic required

Click Create to create the FHIR data store.

Create FHIR data store
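
The topic and FHIR store creation can also be sketched with gcloud. The article does not say which FHIR version was selected, so R4 below is an assumption, as are the function name and all IDs in the example call.

```shell
# Hypothetical helper: creates a Pub/Sub topic, then a FHIR store that notifies it.
create_fhir_store() {
  # $1: project ID, $2: location, $3: healthcare dataset ID,
  # $4: FHIR store ID, $5: Pub/Sub topic ID
  gcloud pubsub topics create "$5" --project="$1" &&
  gcloud healthcare fhir-stores create "$4" \
    --project="$1" --dataset="$3" --location="$2" \
    --version=R4 \
    --pubsub-topic="projects/$1/topics/$5"
}

# Example call (uncomment to run; every ID here is made up):
# create_fhir_store "$(gcloud config get-value project)" us-central1 demo-dataset demo-fhir-store fhir-notifications
```

The store is only created if the topic creation succeeds, which mirrors the order of the console steps.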

7. Import FHIR data from the Google Cloud Storage bucket into the newly created FHIR data store.

Import data

8. Before starting the import, grant the Storage Object Viewer role to the Healthcare service account by running the following command in Cloud Shell. It reuses the PROJECT_ID and PROJECT_NUMBER variables set in step 3.

gcloud projects add-iam-policy-binding "$PROJECT_ID" \
--member="serviceAccount:service-$PROJECT_NUMBER@gcp-sa-healthcare.iam.gserviceaccount.com" \
--role=roles/storage.objectViewer

9. Select the appropriate project, GCS location, and content structure, then start the import.

Import FHIR data
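
The same import can be sketched from Cloud Shell. This assumes the .ndjson files hold one resource per line, which corresponds to the resource content structure; the function name, IDs, and bucket path are hypothetical.

```shell
# Hypothetical helper: imports newline-delimited FHIR resources from GCS into a FHIR store.
import_fhir_from_gcs() {
  # $1: location, $2: healthcare dataset ID, $3: FHIR store ID,
  # $4: GCS URI of the source files (wildcards are allowed)
  gcloud healthcare fhir-stores import gcs "$3" \
    --dataset="$2" --location="$1" \
    --gcs-uri="$4" \
    --content-structure=resource
}

# Example call (uncomment to run; IDs and bucket path are made up):
# import_fhir_from_gcs us-central1 demo-dataset demo-fhir-store "gs://my-fhir-bucket/fhir/*.ndjson"
```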

10. Go to Operations to check the logs and the progress of the import.

Check the logs in operations

Here you can see that all the resources were imported successfully.

Import Logs
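
Operations can also be inspected from the command line. Here is a sketch with two hypothetical helpers, one to list the operations of a dataset and one to show the details of a single operation; the IDs in the example calls are placeholders.

```shell
# Hypothetical helper: lists long-running operations (imports, exports) of a dataset.
list_operations() {
  # $1: location, $2: healthcare dataset ID
  gcloud healthcare operations list --dataset="$2" --location="$1"
}

# Hypothetical helper: shows the status and counters of one operation.
describe_operation() {
  # $1: location, $2: healthcare dataset ID, $3: operation ID
  gcloud healthcare operations describe "$3" --dataset="$2" --location="$1"
}

# Example calls (uncomment to run; OPERATION_ID comes from the list output):
# list_operations us-central1 demo-dataset
# describe_operation us-central1 demo-dataset OPERATION_ID
```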

11. Select Open in FHIR viewer from the data store window to check the newly imported data.

Open in FHIR Viewer

12. This demo imported sample Patient resource data, so search for the Patient resource.

Search for the Patient resource

13. The Patient FHIR data is then available in the FHIR store explorer.

FHIR store explorer
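
Outside the FHIR viewer, the same Patient search can be sketched as a REST call against the store's FHIR endpoint. The function names and the `_count=10` page size are assumptions for illustration; the call needs curl and an authenticated gcloud session.

```shell
# Hypothetical helper: builds the FHIR base URL of a store.
fhir_base_url() {
  # $1: project ID, $2: location, $3: healthcare dataset ID, $4: FHIR store ID
  echo "https://healthcare.googleapis.com/v1/projects/$1/locations/$2/datasets/$3/fhirStores/$4/fhir"
}

# Hypothetical helper: searches Patient resources, returning at most 10 per page.
search_patients() {
  curl -s -X GET \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "$(fhir_base_url "$1" "$2" "$3" "$4")/Patient?_count=10"
}

# Example call (uncomment to run; IDs are made up):
# search_patients "$(gcloud config get-value project)" us-central1 demo-dataset demo-fhir-store
```

The response is a FHIR Bundle whose entries are the matching Patient resources.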

14. You can also select an individual record and view its data as elements or as raw JSON on the right-hand side of the FHIR viewer.

  • Elements
Elements of resource
  • JSON
Raw JSON

15. Export the data to BigQuery for analysis using the Export option on the data store page.

Select Export Option

Provide the export options and click Export.

Export FHIR resource data to BigQuery
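
The export can also be started from Cloud Shell. This sketch assumes the analytics schema type and that the BigQuery dataset lives in the same project as the FHIR store; the function name and IDs are hypothetical.

```shell
# Hypothetical helper: exports all resources of a FHIR store to a BigQuery dataset.
export_fhir_to_bq() {
  # $1: project ID, $2: location, $3: healthcare dataset ID,
  # $4: FHIR store ID, $5: BigQuery dataset ID
  gcloud healthcare fhir-stores export bq "$4" \
    --project="$1" --dataset="$3" --location="$2" \
    --bq-dataset="bq://$1.$5" \
    --schema-type=analytics
}

# Example call (uncomment to run; IDs are made up):
# export_fhir_to_bq "$(gcloud config get-value project)" us-central1 demo-dataset demo-fhir-store fhir_analytics
```

The export relies on the BigQuery roles granted to the Healthcare service account in step 3.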

16. As with the import, you can check the logs of this operation in the Operations window by selecting the required operation (in this case, the export).

Dataset Export operation successfully completed

Because we exported Patient resource data, the export creates a table named Patient in the specified BigQuery dataset.

Data Exported to BigQuery
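
As a first analysis on the exported table, here is a sketch of a query that counts patients by gender. It assumes the Patient table exposes a `gender` column, as the analytics schema normally does for populated fields; the function names and dataset ID are hypothetical.

```shell
# Hypothetical helper: builds a standard SQL query over the exported Patient table.
patient_gender_query() {
  # $1: project ID, $2: BigQuery dataset ID
  printf 'SELECT gender, COUNT(*) AS patients FROM `%s.%s.Patient` GROUP BY gender ORDER BY patients DESC' "$1" "$2"
}

# Hypothetical helper: runs the query with the bq command-line tool.
run_patient_gender_query() {
  bq query --use_legacy_sql=false "$(patient_gender_query "$1" "$2")"
}

# Example call (uncomment to run; the dataset ID is made up):
# run_patient_gender_query "$(gcloud config get-value project)" fhir_analytics
```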

That’s it! You’re ready to perform the analysis in BigQuery. Thanks for reading, and see you next time!
