TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.


Website Visitor Forecast with Facebook Prophet: A Complete Tutorial

7 min read · Mar 10, 2022


Website visitor forecast with Facebook Prophet: an all-in-one 2022 tutorial (installation instructions, data, parameter tuning, and both simple and sophisticated forecasting), Part 1
Forecasting may save you or your company. Photo by NOAA on Unsplash

1. Installation

Finish the Anaconda installation by following the installation guide. Image by Author
We will use the Anaconda Prompt (circled in the screenshot) in this Prophet tutorial. Image by Author
# Create a dedicated environment with Python 3.8 and switch to it.
conda create -n python_3_8 python=3.8
conda activate python_3_8

# Install the build toolchain and the libraries used in this tutorial.
conda install libpython m2w64-toolchain -c msys2
conda install numpy cython -c conda-forge
conda install matplotlib scipy pandas -c conda-forge
conda install pystan -c conda-forge
conda install -c anaconda ephem
conda install -c anaconda scikit-learn
conda install -c conda-forge seaborn
conda install -c plotly plotly
conda install -c conda-forge optuna
conda install -c conda-forge prophet

# Run this only when you have finished working and want to leave the environment.
conda deactivate
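Before moving on, it can help to confirm that the environment actually contains the libraries installed above. The following is a minimal sketch using only the standard library. The list of import names is my assumption (for example, scikit-learn is imported as `sklearn`, and recent Prophet releases as `prophet`), and `missing_packages` is a hypothetical helper, not part of any of these libraries.

```python
import importlib.util

# Import names (not conda package names) for the libraries installed above.
# Assumption: Prophet 1.0 or later is importable as "prophet".
REQUIRED = ["numpy", "pandas", "matplotlib", "scipy", "sklearn",
            "seaborn", "plotly", "optuna", "prophet"]


def missing_packages(names):
    """Return the names that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All packages are available.")
```

Run it inside the activated environment. Anything it lists as missing can be installed again with the matching `conda install` command.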

2. Simple ETL, and Data Visualisation

import pandas as pd

df = pd.read_csv('data.csv')
df2 = df.copy()  # work on a copy so the raw data stays untouched
df2.head()
The dataframe of a website's unique visitors. The Country column is the service region of the website. Image by Author
print(df2['Country'].value_counts(), "\n")
print(df2['Country'].nunique(), "unique values.")
There are 23563 data points from Germany. Image by Author
df2['date'] = pd.to_datetime(df2['datepart'], dayfirst=True).dt.date
# We are only interested in the visitors from Germany in this tutorial.
df2 = df2.loc[(df2['Country'] == 'Germany')]
df_de = df2.copy()
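To see what `dayfirst=True` and the country filter do, here is a small sketch on made-up rows. The real `data.csv` is not shown in this article, so the `visitors` column name and all the values below are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample; only "datepart" and "Country" are names from the tutorial.
sample = pd.DataFrame({
    "datepart": ["01/02/2021", "02/02/2021", "01/02/2021"],
    "Country": ["Germany", "Germany", "France"],
    "visitors": [120, 135, 80],
})

# dayfirst=True reads "01/02/2021" as 1 February 2021, not 2 January 2021.
sample["date"] = pd.to_datetime(sample["datepart"], dayfirst=True).dt.date

# Keep only the German rows; copy so later edits do not touch `sample`.
sample_de = sample.loc[sample["Country"] == "Germany"].copy()
print(sample_de)
```

Without `dayfirst=True`, an ambiguous date such as 01/02/2021 may be read month-first, which would silently shift the time series.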
# Share of non-null values in each column; 1.0 means the column has no missing values.
df_de.notna().mean()
A value of 1.0 means 100% of the values are present, so there are no missing data points. Image by Author
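For comparison, this is roughly what the same kind of check looks like when a value is missing. It is a sketch on a made-up frame; the `visitors` column and its values are assumptions, not the tutorial's data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing visitor count.
frame = pd.DataFrame({
    "Country": ["Germany", "Germany", "Germany", "Germany"],
    "visitors": [120.0, np.nan, 135.0, 150.0],
})

# Share of non-null values per column: 1.0 means the column is complete.
complete_share = frame.notna().mean()
print(complete_share)

# Number of missing values per column, often easier to read than a ratio.
missing_count = frame.isna().sum()
print(missing_count)
```

Here the `visitors` column is only 75% complete, so you would need to decide whether to drop or fill the gap before modelling.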
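# Sum the visitors of each day ("sum" is what np.sum resolves to in a pandas groupby).
df_de2 = df_de.groupby("date").agg("sum")
df_de2 = df_de2.reset_index()
df_de2.columns = ['ds', 'y']  # the column names Prophet expects
df_de2 = df_de2[['y', 'ds']]
df_de2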
'y' is the target value to forecast and 'ds' is the date. Prophet requires both columns. Image by Author
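The reshaping into 'ds' and 'y' can be sketched on a tiny example. The rows and the `visitors` column name are made up for illustration; the sketch only shows the general pattern of summing per day and renaming.

```python
import pandas as pd

# Hypothetical rows: several records per day.
raw = pd.DataFrame({
    "date": pd.to_datetime(["2021-02-01", "2021-02-01", "2021-02-02"]),
    "visitors": [70, 50, 135],
})

# Sum the records of each day, then rename to the column names Prophet expects.
daily = (
    raw.groupby("date", as_index=False)["visitors"].sum()
       .rename(columns={"date": "ds", "visitors": "y"})
)
print(daily)
```

Selecting the value column explicitly before summing avoids accidentally aggregating text columns such as the country name.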
import plotly.express as px
import plotly.io as pio

pio.renderers.default = "notebook"  # render the figure inline in Jupyter
fig_line = px.line(df_de2, x="ds", y="y", title='The number of unique visitors of www.lorentzyeung.com in the previous 3 years')
fig_line.show()
The number of unique visitors of www.lorentzyeung.com in the previous 3 years. The domain name is fabricated. Image by Author
df_stat = df_de2.describe()
mean = df_stat.loc["mean"]["y"]
std = df_stat.loc["std"]["y"]
upper = mean + std * 3
lower = mean - std * 3
print(' Mean: ', mean, '\n', 'Standard Deviation: ', std, '\n', 'Upper Limit: ', upper, '\n', 'Lower Limit:', lower)
Now we know the mean and where the outlier cutoffs lie. It is my habit to use 3 standard deviations above and below the mean as the outlier cutoff lines. Image by Author
We don't have outliers in our dataset. Image by Author
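The check behind that statement is not shown above, so here is one way it could look. This is a sketch on made-up counts with a deliberate spike, using the same 3 standard deviation cutoffs; it is not the exact code used for the screenshot.

```python
import pandas as pd

# Hypothetical daily counts with one obvious spike at the end.
counts = pd.DataFrame({"y": [100] * 10 + [102] * 10 + [98] * 9 + [400]})

mean = counts["y"].mean()
std = counts["y"].std()  # sample standard deviation, the same one describe() reports
upper = mean + 3 * std
lower = mean - 3 * std

# Rows outside the cutoffs are treated as outliers.
outliers = counts[(counts["y"] > upper) | (counts["y"] < lower)]
print(outliers)
```

An empty result means no value lies beyond the cutoffs, which is the situation in our dataset.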

3. Simple forecasting (forecast with default settings)

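from prophet import Prophet

# 95% uncertainty interval; weekly and daily seasonality are switched off.
m = Prophet(interval_width=0.95, weekly_seasonality=False, daily_seasonality=False)
m.add_country_holidays(country_name='DE')  # German public holidays
m.fit(df_de2)
# Extend the timeline by 52 weekly steps, then predict over history and future.
future_df = m.make_future_dataframe(periods=52, freq='W')
forecast_default = m.predict(future_df)
plot_forecast = m.plot(forecast_default)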
It is simple to visualise the original data points and the forecast line in Prophet. Our website had been improving in previous years, even during the pandemic. But Omicron seems to have taken away the momentum and reversed the trend. Image by Author
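The forecast horizon comes from `make_future_dataframe(periods=52, freq='W')`. As far as I understand it, the future rows are roughly what the pandas sketch below produces, and by default Prophet also keeps the historical dates in the same frame. The last observed date is a made-up value, and this is an approximation of the behaviour, not Prophet's own code.

```python
import pandas as pd

# Hypothetical last day of the training data (a Monday).
last_observed = pd.Timestamp("2022-02-28")

# freq="W" is pandas' weekly frequency anchored on Sundays, so the future
# rows fall on the 52 Sundays that follow the last observed date.
candidates = pd.date_range(start=last_observed, periods=53, freq="W")
future_dates = candidates[candidates > last_observed][:52]

print(future_dates[0], future_dates[-1], len(future_dates))
```

This means the history is daily while the forecast steps are weekly, which is worth keeping in mind when reading the forecast plot.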
plt_components = m.plot_components(forecast_default)
The component plot. Image by Author

Please continue reading in part 2 of the article.



Lorentz Yeung

Data Analyst in Microsoft, Founder of El Arte Design and Marketing, Certified Digital Marketer, MSc in Digital Marketing, London based.
