Ingesting All The Medications in America Every 7 Days
Apache NiFi to Read RSS REST Feeds the Smart Way!
“How to Access DailyMed Data via XML, JSON, RSS REST Feeds / HTTP InvokeHTTP GET over SSL”
DailyMed provides a lot of drug-related data, so let’s ingest some of the most interesting feeds.
First feed — RSS Daily
Input -> Get Set ID -> Get Lookup Details -> PublishKafkaRecord
Get Set ID -> Get Label Details -> Labels -> UpdateRecord -> PublishKafkaRecord -> RetryFlowFile
Get Label Details
UpdateRecord
Input -> Get Set ID -> Get Lookup Details -> UpdateRecord -> ExtractText -> Send Message Notification
Kafka Topics for DailyDrugNews
RSS Description
RSS Feed
https://dailymed.nlm.nih.gov/dailymed/rss.cfm
Supporting Data
Label RSS
https://dailymed.nlm.nih.gov/dailymed/labelrss.cfm?setid=${setID}
Example Output
[
{"version":2.0,
"channel":
{"title":"DailyMed Drug Label Updates for TAZAROTENE CREAM [MAYNE PHARMA]",
"link":[
"https://dailymed.nlm.nih.gov/dailymed/lookup.cfm?setid=0296f0e9-f940-45d9-987e-1ee26a7ca961&version=2",
{"rel":"self",
"href":"https://dailymed.nlm.nih.gov/dailymed/labelrss.cfm?setid=0296f0e9-f940-45d9-987e-1ee26a7ca961",
"type":"application/rss+xml"}],
"description":"\n\tDailyMed provides high quality information about marketed drugs.\n\tDrug labeling on this Web site is the most recent submitted to the Food and Drug Administration (FDA)\n\tand currently in use; it may include strengthened warnings undergoing FDA review and minor editorial changes.\n ","language":"en-us","pubDate":"Thu, 30 Nov 2023 00:00:00 EST","lastBuildDate":"Fri, 08 Dec 2023 14:20:40 EST",
"item":
{"title":"TAZAROTENE cream [Mayne Pharma]",
"description":"Updated Date: Thu, 30 Nov 2023 00:00:00 EST",
"link":"https://dailymed.nlm.nih.gov/dailymed/lookup.cfm?setid=0296f0e9-f940-45d9-987e-1ee26a7ca961&version=2",
"pubDate":"Thu, 30 Nov 2023 00:00:00 EST",
"guid":
{"isPermaLink":true,"value":null}}},
"uuid":"fd6b929f-3832-4710-af0a-9fef9581ee79"}
]
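The "Get Set ID" step can be sketched outside NiFi as follows. This is a minimal illustration, assuming the set ID is always carried in the `setid` query parameter of the item link, as it is in the example output above. The function names `extract_set_id` and `label_rss_url` are hypothetical helpers, not part of the flow.

```python
from urllib.parse import urlparse, parse_qs

LABEL_RSS = "https://dailymed.nlm.nih.gov/dailymed/labelrss.cfm"

def extract_set_id(link):
    """Return the value of the setid query parameter, or None if it is absent."""
    values = parse_qs(urlparse(link).query).get("setid")
    return values[0] if values else None

def label_rss_url(set_id):
    """Build the Label RSS URL for one set ID (same pattern as the URL above)."""
    return f"{LABEL_RSS}?setid={set_id}"

# A trimmed copy of the record shown in the example output.
record = {
    "channel": {
        "item": {
            "link": "https://dailymed.nlm.nih.gov/dailymed/lookup.cfm"
                    "?setid=0296f0e9-f940-45d9-987e-1ee26a7ca961&version=2"
        }
    }
}

set_id = extract_set_id(record["channel"]["item"]["link"])
print(set_id)
print(label_rss_url(set_id))
```

In the NiFi flow the same value ends up in the `setID` attribute that the Label RSS URL references.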
HTML Page
Second flow, SPL.
Kafka Topics for DailyMedSPL
REST Ingest Meds
Read all the drug names of the day from the REST API.
Source:
https://dailymed.nlm.nih.gov/dailymed/services/v2/drugnames.json?pagesize=100
We need to grab all the fields used for navigation, such as page, URL, next, total elements and total pages.
Batch
Download all labels
The monthly file is published once a month; its name ends in monYYYY (for example, nov2023).
https://dailymed-data.nlm.nih.gov/public-release-files/dm_spl_monthly_update_nov2023.zip
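As a rough sketch, the monthly file name can be derived from a date. This assumes every month follows the same `dm_spl_monthly_update_monYYYY.zip` pattern as the November 2023 file above; the function name `monthly_update_url` is hypothetical, and the pattern should be checked against the download page before relying on it.

```python
from datetime import date

# Lowercase English month abbreviations, kept explicit so the result
# does not depend on the machine's locale.
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]

BASE = "https://dailymed-data.nlm.nih.gov/public-release-files"

def monthly_update_url(day):
    """Build the monthly SPL update URL for the month containing `day`.

    ASSUMPTION: all months use the naming pattern seen for nov2023.
    """
    tag = f"{MONTHS[day.month - 1]}{day.year}"
    return f"{BASE}/dm_spl_monthly_update_{tag}.zip"

print(monthly_update_url(date(2023, 11, 15)))
```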
Data
Grab up to 100 records per request, then iterate through the pages.
pagesize=100&page=13
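The paging loop can be sketched in Python like this. It assumes each response is a JSON object with a `data` list and a `metadata` object containing `total_pages`; those field names are assumptions based on the navigation fields mentioned above, so verify them against a real response. `iter_records` and `fetch_json` are hypothetical helper names.

```python
import json
from urllib.request import urlopen

DRUGNAMES_URL = "https://dailymed.nlm.nih.gov/dailymed/services/v2/drugnames.json"

def fetch_json(url):
    """Download one page and parse the body as JSON."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)

def iter_records(base_url=DRUGNAMES_URL, pagesize=100, fetch=fetch_json):
    """Yield every record, requesting one page at a time.

    ASSUMPTION: the response looks like
    {"data": [...], "metadata": {"total_pages": N, ...}}.
    If total_pages is missing, the loop stops after the first page.
    """
    page = 1
    while True:
        body = fetch(f"{base_url}?pagesize={pagesize}&page={page}")
        yield from body.get("data", [])
        total_pages = int(body.get("metadata", {}).get("total_pages", page))
        if page >= total_pages:
            break
        page += 1

# Usage (performs real HTTP requests, so it is left commented out):
# for record in iter_records():
#     print(record)
```

Passing `fetch` as a parameter keeps the loop testable without network access; in NiFi the equivalent is InvokeHTTP driven by the page number or next-page URL from the previous response.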
All API Web Services
https://dailymed.nlm.nih.gov/dailymed/app-support-web-services.cfm#restfulapi
UNIIS API
https://dailymed.nlm.nih.gov/dailymed/webservices-help/v2/uniis_api.cfm
https://dailymed.nlm.nih.gov/dailymed/services/v2/uniis.json
RXCUIS API
https://dailymed.nlm.nih.gov/dailymed/webservices-help/v2/rxcuis_api.cfm
https://dailymed.nlm.nih.gov/dailymed/services/v2/rxcuis.json?pagesize=100&page=2
Drug Names API
https://dailymed.nlm.nih.gov/dailymed/webservices-help/v2/drugnames_api.cfm
https://dailymed.nlm.nih.gov/dailymed/services/v2/drugnames.json
App #s API
https://dailymed.nlm.nih.gov/dailymed/webservices-help/v2/applicationnumbers_api.cfm
https://dailymed.nlm.nih.gov/dailymed/services/v2/applicationnumbers.json
https://dailymed.nlm.nih.gov/dailymed/services/v2/applicationnumbers.json?pagesize=100&page=13
Drug Classes API
https://dailymed.nlm.nih.gov/dailymed/webservices-help/v2/drugclasses_api.cfm
https://dailymed.nlm.nih.gov/dailymed/services/v2/drugclasses.json
SPLS API
https://dailymed.nlm.nih.gov/dailymed/webservices-help/v2/spls_api.cfm
https://dailymed.nlm.nih.gov/dailymed/services/v2/spls.json
NDCS API
https://dailymed.nlm.nih.gov/dailymed/webservices-help/v2/ndcs_api.cfm
https://dailymed.nlm.nih.gov/dailymed/services/v2/ndcs.json
Example Use Case
Download daily extracts from FTP and unzip.
Grab daily news from RSS to get what’s changed.
Use setid to get more data.
Also grab the SPL: https://dailymed.nlm.nih.gov/dailymed/services/v2/spls/9256d3b2-50eb-4091-bbcd-1982865fb998.xml
Grab the SPL media: https://dailymed.nlm.nih.gov/dailymed/services/v2/spls/9256d3b2-50eb-4091-bbcd-1982865fb998/media.json This returns data with URLs to JPEGs or files of other MIME types; download these.
Get the NDCs for it: https://dailymed.nlm.nih.gov/dailymed/services/v2/spls/9256d3b2-50eb-4091-bbcd-1982865fb998/ndcs.json This one supports the next_page paradigm, which we can use to navigate through many pages.
Get packaging for it
Get all SPL version information.
This one supports the next_page paradigm that we can use to navigate through many pages.
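The per-setid lookups in this use case can be sketched as a small URL builder plus a media filter. The three URL patterns come from the examples above. The media entry keys `url` and `mime_type` are assumptions inferred from the description of the media response, and `spl_urls` and `media_links` are hypothetical helper names.

```python
SPLS_BASE = "https://dailymed.nlm.nih.gov/dailymed/services/v2/spls"

def spl_urls(set_id):
    """Return the SPL, media and NDC URLs for one set ID."""
    return {
        "spl": f"{SPLS_BASE}/{set_id}.xml",
        "media": f"{SPLS_BASE}/{set_id}/media.json",
        "ndcs": f"{SPLS_BASE}/{set_id}/ndcs.json",
    }

def media_links(entries, mime_prefix="image/"):
    """Pick the download URLs of media entries whose MIME type matches.

    ASSUMPTION: each entry is a dict with "url" and "mime_type" keys.
    """
    return [
        entry["url"]
        for entry in entries
        if entry.get("url")
        and str(entry.get("mime_type") or "").startswith(mime_prefix)
    ]

urls = spl_urls("9256d3b2-50eb-4091-bbcd-1982865fb998")
for name, url in urls.items():
    print(name, url)
```

The packaging and version-history lookups would follow the same approach once their endpoints are confirmed in the web services documentation.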