Snowflake
Published in

Snowflake

Tracking the vaccination progress with Snowflake and Starschema

Starschema shares COVID-19 tracking data in the Snowflake Data Marketplace, including vaccination numbers from JHU and Our World in Data. Find here the basic SQL queries to use this data in your projects.

World ranking

2021–03–11: Few countries have vaccinated more than 4% of their population. Gibraltar and Israel display outstanding numbers at almost 50%. They are followed by Seychelles (27%), Bermuda, Cayman Islands, Bahrain, USA, Serbia, Malta, Faeroe Islands, and Chile (6.64%).

% vaccinated per country, source OurWorldInData through Starschema in Snowflake

What’s interesting about this query:

  • Source is Our World in Data, and Starschema loads and shares this data in Snowflake.
  • The owid_vaccinations table reports interesting statistics like # of daily vaccinations, % of population vaccinated, and the specific vaccines each country is using.
  • To get only the latest day of data we use the qualify row_number()=1 instruction. This is especially important in a chart like this because each country reports their numbers in different schedules.
  • Some countries are missing in the chart: This because not all countries report people_fully_vaccinated_per_hundred. For example, the UAE is one of the leading countries in number of people vaccinated, but since we don’t know how many are fully vaccinated, they don’t show up on this particular chart.
select *
from public.owid_vaccinations
qualify row_number() over(partition by iso3166_1 order by date desc)=1
order by people_fully_vaccinated_per_hundred desc nulls last

USA states ranking

2021–03–12: Looking at the chart it looks like Alaska has given the second dose to 17% of its population, while DC less than 9%.

% vaccinated per state, source JHU through Starschema in Snowflake

What’s interesting about this query:

  • Source is JHU, and Starschema loads and shares this data in Snowflake.
  • The jhu_vaccine table doesn’t report percentage of people vaccinated, but we can get state demographics out of the demographics table.
  • The query uses data from the people_total_2nd_dose column, but there will be a vaccine that doesn’t require a 2nd dose. For now the people_total_2nd_dose is a good representation of fully vaccinated. You can also use data from the columns doses_admin_total and people_total, but there’s no column explicitly accounting for “fully vaccinated”.
  • To get only the latest day of data we use the qualify row_number()=1 instruction.
select people_total_2nd_dose/population ratio_vaccinated, *
from starschema_covid19.public.jhu_vaccines a
join (
select state, sum(total_population) population
from starschema_covid19.public.demographics
group by 1
) b
on a.stabbr=b.state
qualify row_number() over(partition by province_state order by date desc)=1
order by date desc, ratio_vaccinated desc

Learn more

Want more?

I’m Felipe Hoffa, Data Cloud Advocate for Snowflake. Thanks for joining me on this adventure. You can follow me on Twitter, and check reddit.com/r/snowflake for the most interesting Snowflake news.

--

--

--

Snowflake articles from engineers using Snowflake to power their data.

Recommended from Medium

What is Python Used for? It’s 7 Practical Applications

Recapping the Data Engineering Conference SF

“How Would You Like Your Data?”

Why I Chose Data Science and My Journey So Far

Intuition behind Log-loss Score

How to Compute Football Implied Probabilities From Bookmakers Odds

Comparison of Different Darknet Marketplaces: Listings

How to ask questions you can instantly action

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Felipe Hoffa

Felipe Hoffa

Data Cloud Advocate at Snowflake ❄️. Originally from Chile, now in San Francisco and around the world. Previously at Google. Let’s talk data.

More from Medium

SQL scripting: Live in Snowflake

Snowflake: how to repartition staged data

Monetise Your Data From Wind Farms By Leveraging Snowflake Data Marketplace

Data Exchange In The Marketplace

Snowflake Micro-partitions, Data Clustering & Zero-copy Cloning