Is Weather Actually Changing? NOAA 100 Years of Weather Data: Time Series Analysis in Python

25GB Data, 100,790 files, Time Series Analysis & Forecasting, Hadoop, Spark, Matplotlib, Pandas, Machine Learning

Mohit Singh
datascape


Visualizations showing seasonal trends in temperature and precipitation in the US since 1900

I used the climate data from the NOAA archive at ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/, which holds daily data files.

The data consists of monthly records collected at many weather stations. Each record is broken down into the days of the month, so global climate data is available for every day of every month since 1900. It includes high and low temperature, precipitation broken down by type and depth, humidity, cloudiness, wind, and more.
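To make the record layout concrete, here is a minimal sketch of expanding one monthly record into daily observations. It assumes the fixed-width .dly layout described in the archive's readme (11-character station ID, year, month, 4-character element code, then 31 day slots of a 5-character value plus 3 flag characters, with -9999 marking a missing value). The function name parse_dly_line is hypothetical, and this is not necessarily the code used in this project.

```python
from datetime import date

MISSING = -9999  # sentinel the archive uses for "no observation"


def parse_dly_line(line):
    """Expand one fixed-width monthly record into a list of daily tuples.

    Returns (station_id, date, element, value) for each day that has a value.
    Values stay in the file's raw units (for example tenths of a degree C
    for TMAX and TMIN), so any scaling is left to the caller.
    """
    station = line[0:11]
    year = int(line[11:15])
    month = int(line[15:17])
    element = line[17:21]

    rows = []
    for day in range(1, 32):
        # Each day occupies 8 characters: 5 for the value, 3 for flags.
        start = 21 + (day - 1) * 8
        raw = line[start:start + 5]
        if not raw.strip():
            continue  # line was shorter than 31 day slots
        value = int(raw)
        if value == MISSING:
            continue
        try:
            obs_date = date(year, month, day)
        except ValueError:
            continue  # slot for a day that does not exist, such as Feb 30
        rows.append((station, obs_date, element, value))
    return rows


if __name__ == "__main__":
    # Synthetic record for illustration only, not real station data.
    values = [100, MISSING] + [50] * 29
    sample = "USW00094728" + "1900" + "02" + "TMAX" + "".join(
        f"{v:5d}   " for v in values
    )
    for row in parse_dly_line(sample)[:3]:
        print(row)
```

Keeping the raw integer values and skipping missing slots early keeps the output small, which matters when the same step has to run over tens of thousands of station files.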

Main Python libraries used: Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn, SciPy, Plotly

Main Targets:

Combine data from the various weather stations by uploading it to a Hadoop cluster, processing it with MapReduce, and converting it to Parquet files for easier processing.

Identify temperature change around the world since 1900 and visualize it on a world map to find the countries affected the most.

Create a time series for temperature and a bubble plot for precipitation over 100 years, grouped by season…
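The seasonal grouping and the "change since 1900" targets can be sketched in plain Python as follows. This is only an illustration under stated assumptions: seasons are taken to be the Northern Hemisphere meteorological seasons (December to February as winter, and so on), December is counted in its own calendar year for simplicity, and the change is summarized as an ordinary least-squares slope per decade. The names SEASON_BY_MONTH, seasonal_means and trend_per_decade are hypothetical, and the article does not confirm that the analysis was done this way.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Assumption: Northern Hemisphere meteorological seasons.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def seasonal_means(observations):
    """Average temperature per (year, season).

    observations: iterable of (date, temperature) pairs.
    Simplification: December is grouped with its own calendar year.
    """
    buckets = defaultdict(list)
    for obs_date, temp in observations:
        key = (obs_date.year, SEASON_BY_MONTH[obs_date.month])
        buckets[key].append(temp)
    return {key: mean(temps) for key, temps in buckets.items()}


def trend_per_decade(yearly):
    """Least-squares slope of value against year, scaled to ten years.

    yearly: dict mapping year -> value, with at least two distinct years.
    """
    years = list(yearly)
    x_bar = mean(years)
    y_bar = mean(list(yearly.values()))
    num = sum((x - x_bar) * (yearly[x] - y_bar) for x in years)
    den = sum((x - x_bar) ** 2 for x in years)
    return 10 * num / den


if __name__ == "__main__":
    # Made-up numbers for illustration only, not results from the dataset.
    obs = [
        (date(1900, 7, 1), 20.0),
        (date(1900, 7, 2), 22.0),
        (date(1900, 1, 5), -2.0),
        (date(1901, 7, 1), 23.0),
    ]
    means = seasonal_means(obs)
    summers = {year: t for (year, season), t in means.items() if season == "summer"}
    print(means)
    print(trend_per_decade(summers))
```

One series per season, rather than a single annual average, keeps a warming winter from being masked by a stable summer, which is the point of splitting the plots by season.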
