Contactless Health Insurance Cards: An Approach to Safeguard Healthcare Users and Frontline Admin

Alex Ianovski · Published in Analytics Vidhya · 6 min read · Apr 29, 2020

Contactless payment, a technology that facilitates financial transactions using radio-frequency identification (RFID) or near-field communication (NFC), was first introduced to Android devices in 2011 and iOS devices in 2014. As customers discovered the ease and efficiency of the payment method, adoption of the technology grew steadily. In March 2020, contactless payments reached an all-time utilization peak: shoppers responded to the COVID-19 outbreak by purchasing in bulk, while retailers implemented new measures, including limiting cash-accepting lanes and enabling additional contactless payment terminals, to safeguard customers and essential frontline staff from the spread of infectious disease.

Throughout Canadian healthcare institutions, the lack of NFC-enabled or digital health insurance identification means that many frontline administrative staff must currently record patient information manually. This increased interaction between frontline staff and patients poses additional health risks to both parties. In this article, I describe and demonstrate a tool that enables contactless scanning of Ontario Health Insurance Plan (OHIP) cards, introducing additional administrative efficiency and safety measures. Built on Amazon Textract, a service that automatically detects and extracts text and data from scanned documents, the tool can decrease interaction between frontline healthcare administrators and patients, thus reducing unnecessary exposure and health risks for both parties.

Uploading and Analyzing the Image

To start, a sample OHIP card was downloaded from an Ontario government website and uploaded to the Amazon S3 cloud storage service.
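
For completeness, the upload step can be scripted with boto3; this is a minimal sketch, assuming hypothetical bucket and file names:

import boto3

# Hypothetical names; substitute your own bucket and key
BUCKET = "ohip-card-scans"
DOCUMENT = "sample_ohip_card.png"

# Upload the sample card image so Textract can read it from S3 later
s3_client = boto3.client("s3")
s3_client.upload_file("sample_ohip_card.png", BUCKET, DOCUMENT)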

Figure 1 | Overview of an OHIP card in Amazon S3

Next, the image was uploaded from S3 to Amazon Textract. Textract simplifies optical character recognition (OCR) and enables developers to launch accurate data extraction applications with ease. Textract analyzed the image for text, forms, and tables, and defined each object as a Block. Users can familiarize themselves with the Textract service by manually uploading documents to the online console. As illustrated in Figure 2 below, the background of the image is greyed out and each Block is highlighted in the foreground.

Figure 2 | Demonstration of Amazon Textract’s optical character recognition (OCR) on an OHIP card

Textract’s API returned more detail than the demo tool shown above. Apart from identifying objects such as words and tables in an image, the API calls returned the following information, among other fields (see the sample Block after this list):

  • Type of block obtained (e.g. key value set, page, line, word, table, cell)
  • Confidence level (0–100)
  • Identified text
  • Table information (row index, column index, row span, column span)
  • Object coordinates in image
  • Block dimensions
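
For illustration, a single WORD Block returned by analyze_document has roughly the following shape; the field names are from the Textract response schema, but the values here are invented rather than taken from this card:

{
    "BlockType": "WORD",
    "Confidence": 99.2,
    "Text": "WALKER",
    "Geometry": {
        "BoundingBox": {"Width": 0.12, "Height": 0.05, "Left": 0.31, "Top": 0.22},
        "Polygon": [...]
    },
    "Id": "..."
}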

Once the image was retrieved from the S3 bucket, it was converted to binary and passed to the Textract client. Identified words with a confidence level greater than 70% were then appended to an array to form a string of text.

import io
from PIL import Image  # Pillow, used to confirm the downloaded bytes decode as an image

def get_text_analysis(self, bucket, document):
    # Get the document from S3
    print("Importing Image..........")
    s3_object = self.s3_resource.Object(bucket, document)
    s3_response = s3_object.get()
    stream = io.BytesIO(s3_response['Body'].read())
    image = Image.open(stream)
    image_binary = stream.getvalue()

    print('Extracting words..........')
    response = self.textract_client.analyze_document(
        Document={'Bytes': image_binary},
        FeatureTypes=["TABLES", "FORMS"])

    # Get the text blocks
    blocks = response['Blocks']
    print('Detected Document Text')

    # Filter words with confidence level > 70%
    top_words = self.filter_top_words(blocks)
    return top_words

# Filter word blocks with confidence level > 70%
def filter_top_words(self, blocks):
    print("Filtering top words...........")
    top_words = []
    for block in blocks:
        # Only WORD and LINE blocks carry a 'Text' key; PAGE, TABLE, and
        # KEY_VALUE_SET blocks do not, so guard for both keys
        if 'Confidence' in block and 'Text' in block:
            print("[DEBUG] confidence = {} \ntext = {}".format(block['Confidence'], block['Text']))
            if block['Confidence'] > 70 and block['Text'] not in top_words:  # Remove repeated words
                top_words.append(block['Text'])
    return top_words
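
The two methods above are written as members of a class that holds the AWS clients; a hypothetical sketch of that wiring, with the invented bucket and key names from earlier, might look like this:

import boto3

class OhipScanner:
    # Hypothetical wrapper class holding the clients the methods above expect
    def __init__(self):
        self.s3_resource = boto3.resource("s3")
        self.textract_client = boto3.client("textract")

    # get_text_analysis and filter_top_words from above would be defined here

scanner = OhipScanner()
top_words = scanner.get_text_analysis("ohip-card-scans", "sample_ohip_card.png")
text = " ".join(top_words)  # Combine the filtered words into one string for parsing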

After filtering out low-confidence objects, such as the signature field and very small text, the combined output text read:

"ntarto Healt Sante ANITA JEAN WALKER 5584 - 486 - 674 674-YM - YM BORN/NE(E 1981 12-15 YRIAN MOMM DAU ag wathee ISS/DEL EXP/EXP 2012-12 - 12 - 15 20170.12-.15 201.7 
- 12-.15 YRAN MOM DA MOM DAA Healt Sante ANITA JEAN WALKER 5584 - 486 674 674-YM YM 1981 12-15 MOMM DAU ag wathee 2012-12 12 15 20170.12-.15 201.7 12-.15 MOM DA"

The next step required parsing the text to extract the patient’s information. Regular expressions based on expected fields such as “BORN”, referring to the birth date, and “EXP”, referring to the expiry date, were used:

import re
from datetime import datetime

def get_ohip_dict(self, text):
    # Name follows the OCR'd bilingual "Health Sante" header and runs up to the first digit
    name = re.compile(r'(?<=[Health] Sante )(.*?)(?= \d)').findall(text)[0]
    # Ten-digit card number (4-3-3), possibly split by spaces and dashes
    ohip_num = re.compile(r"(\d{4}(( |-){1,3}\d{3})( |-){1,3}\d{3})").findall(text)[0][0]
    # Two-letter version code immediately preceding "BORN"
    ohip_chars = re.compile(r"[\D]{2}(?= BORN)").findall(text)[0]
    ohip = self.ohip_builder(ohip_num, ohip_chars)  # helper (not shown) joining digits and version code
    birth_year = int(re.compile(r"(?<=BORN\/NE\(E )\d{4}").findall(text)[0])
    birth_month = int(re.compile(r"(?<=BORN\/NE\(E \d{4}[ -])\d{2}").findall(text)[0])
    birth_day = int(re.compile(r"(?<=BORN\/NE\(E \d{4}[ -]\d{2}[ -])\d{2}").findall(text)[0])
    birthdate = datetime(birth_year, birth_month, birth_day)
    issue_year = int(re.compile(r"(?<=EXP\/EXP )\d{4}").findall(text)[0])
    issue_month = int(re.compile(r"(?<=EXP\/EXP \d{4}-)\d{2}").findall(text)[0])
    issue_day = int(re.compile(r"(?<=EXP\/EXP \d{4}-\d{2}[ -]{3})\d{2}").findall(text)[0])
    issuedate = datetime(issue_year, issue_month, issue_day)
    # The card expires on the holder's birthday five years after the issue year
    expdate = datetime(issue_year + 5, birth_month, birth_day)
    ohip_dict = {
        "name": name,
        "ohip": ohip,
        "birthdate": birthdate,
        "issuedate": issuedate,
        "expdate": expdate
    }
    return ohip_dict

Finally, the fields of interest were stored in a dictionary object, which could later be retrieved or uploaded to a database.
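
As a hypothetical sketch of that last step, the dictionary could be written to a DynamoDB table; the table name below is invented, and the datetime values are serialized to ISO 8601 strings first, since DynamoDB has no native datetime type:

import boto3

def save_ohip_record(ohip_dict, table_name="patient-intake"):  # hypothetical table name
    table = boto3.resource("dynamodb").Table(table_name)
    item = dict(ohip_dict)
    # Store the dates as ISO 8601 strings rather than datetime objects
    for key in ("birthdate", "issuedate", "expdate"):
        item[key] = item[key].isoformat()
    table.put_item(Item=item)  # 'ohip' would serve naturally as the partition key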

Table 1 | Resulting regions of interest captured from Health Insurance ID

Property  | Value
----------|---------------------
name      | 'ANITA JEAN WALKER'
ohip      | '5584486674YM'
birthdate | 15/12/1981
issuedate | 12/12/2012
expdate   | 15/12/2017

As shown in Table 1 above, 98% of the characters of interest were extracted accurately. Accuracy could be improved further by capturing multiple images of each ID card to increase the sample size, increasing the image resolution, and using the Block coordinates to search for expected fields in known spatial regions of the card, as sketched below.
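
As a rough sketch of that last suggestion, every Block carries a normalized bounding box, so candidate words can be restricted to the region of the card where a field is expected; the region coordinates below are invented for illustration:

def words_in_region(blocks, left, top, right, bottom):
    # Collect text from WORD blocks whose bounding-box centre falls
    # inside a normalized (0-1) region of the image
    words = []
    for block in blocks:
        if block.get("BlockType") != "WORD":
            continue
        box = block["Geometry"]["BoundingBox"]
        cx = box["Left"] + box["Width"] / 2
        cy = box["Top"] + box["Height"] / 2
        if left <= cx <= right and top <= cy <= bottom:
            words.append(block["Text"])
    return words

# Hypothetical region: the card number sits roughly in the middle band of the card
card_number_words = words_in_region(blocks, 0.05, 0.35, 0.95, 0.55)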

Mobile digital wallet applications could also be utilized to store health insurance identification cards and enable contactless identification. The province of Ontario has worked with Bluink to digitize and securely store Ontario driver’s licenses. Expanding this infrastructure to digitize OHIP could reduce the interaction between administrative staff and patients, reducing the risk of spreading infectious disease.

Conclusion

Limiting physical payment transactions between customers and essential frontline staff has been reported to reduce the risk of spreading infectious disease. Currently, health ministries do not deploy contactless health insurance cards, and patient information is recorded manually by frontline workers. This article demonstrates a tool for enabling contactless ID without modifying existing identification cards or implementing additional infrastructure such as chip readers. This solution can assist healthcare users and frontline administrative staff by reducing interaction time through automated digitization of health insurance information, thus mitigating risk of unnecessary exposure and contact.

As always, thank you for taking the time to read this material. Please share any comments and thoughts below; any and all feedback is appreciated. In addition, please feel free to reach out via LinkedIn.
