Time-Series Analysis and Forecasting of Foreign Exchange Rate with SARIMAX Model
Forecast the future of your foreign investments with factors that might affect the currency exchange rate using the SARIMAX model.
As promised, I am here with the second part of the Foreign Exchange Rate Time-Series Analysis and Forecasting. In this article, we will go through SARIMAX modeling, which accounts for factors such as inflation, interest rates, balance sheets, and other financial aspects that could contribute to fluctuations in the foreign exchange market.
SARIMAX
SARIMAX modeling requires exogenous variables that contribute to changes in the endogenous variable. Choosing them requires extensive subject-matter knowledge, so we restricted ourselves to forecasting USD/INR only.
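For reference, a SARIMAX model without a seasonal part (the form fitted later in this article) can be written as a regression on the exogenous variables with ARMA errors. The symbols below are notation introduced here for illustration: y_t is the (differenced) exchange rate, x_t the vector of exogenous variables, β their coefficients, and ε_t white noise.

```latex
\begin{aligned}
y_t &= \beta^{\top} x_t + u_t, \\
u_t &= \sum_{i=1}^{p} \phi_i\, u_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j\, \varepsilon_{t-j}.
\end{aligned}
```

The first line is why future values of x_t are needed before y_t can be forecast.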
Exogenous Variables
We forecast foreign exchange rates using SARIMAX with inflation, interest rates, and trade transactions as exogenous variables. These are broad categories, so we used a specific indicator for each.
- Inflation: For inflation, we used the Consumer Price Index (CPI) as the indicator. For each country, we picked the CPI series with the greatest weightage: the index for All Urban Consumers in the US and the All Items category in India.
- Interest Rates: For interest rates, we used the Federal Funds Rate for the US and 10-year Long-Term Government Bond Yields for India as indicators.
- Current Account Balance: The third indicator we used was the Balance of Payments in the current account as a percent of GDP for both countries.
We gathered all the datasets from the FRED using their Python API. Here are the specific FRED series codes for each category:
Inflation
- Consumer Price Index for All Urban Consumers (US): CPIAUCSL
- Consumer Price Index for All Items (IND): INDCPIALLMINMEI
Interest Rates
- Effective Federal Funds Rate (US): FEDFUNDS
- 10-Year Long-Term Government Bond Yields (IND): INDIRLTLT01STM
Trade Transactions
- Balance of Payments, Current Account (US) as a % of GDP: USAB6BLTT02STSAQ
- Balance of Payments, Current Account (IND) as a % of GDP: INDB6BLTT02STSAQ
Now that we have the exogenous variables and their FRED series codes, we can extract the data as we did in the previous article. I am including the code again for easy access. We extract each dataset, convert it into a data frame, and then slice it from 2014-01-01 to 2023-11-01.
import pandas as pd
# `fred` is the FRED API client created in the previous article
# Extracting the Consumer Price Index for All Urban Consumers (US)
cpi_us_monthly = fred.get_series('CPIAUCSL')
cpi_us_monthly.to_csv('data/cpi_us.csv')
# Converting into a data frame
cpi_us_df = pd.DataFrame({'CPI_US':cpi_us_monthly})
# Slicing the dataset
cpi_us_df = cpi_us_df.loc['2014-01-01':'2023-11-01']
The same steps are followed for all the exogenous variables listed above. For the target variable, the INR/USD exchange rate, we extract the monthly data as follows:
# Extracting monthly exchange rate
us_ind_monthly = fred.get_series("EXINUS")
us_ind_monthly.to_csv('data/us_ind_monthly.csv')
# Converting into a data frame
us_ind_df = pd.DataFrame({'exchange_rate':us_ind_monthly})
# Slicing the dataset
us_ind_df = us_ind_df.loc['2014-01-01':'2023-11-01']
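Since the same extract, convert, and slice steps repeat for every series, they could be wrapped in one small helper. The sketch below is only an illustration: `load_series` and `fake_fetch` are hypothetical names, and `fake_fetch` returns synthetic values so the example runs without an API key.

```python
import pandas as pd

def load_series(fetch, code, column, start, end):
    """Fetch one series, wrap it in a one-column DataFrame, and slice by date."""
    series = fetch(code)
    df = pd.DataFrame({column: series})
    return df.loc[start:end]

# Stand-in for fred.get_series so the sketch runs offline (synthetic values).
def fake_fetch(code):
    index = pd.date_range("2013-01-01", "2024-06-01", freq="MS")
    return pd.Series(range(len(index)), index=index, dtype=float)

cpi_us_df = load_series(fake_fetch, "CPIAUCSL", "CPI_US", "2014-01-01", "2023-11-01")
print(cpi_us_df.head())
```

With the real client, the call would pass `fred.get_series` in place of `fake_fetch`.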
We now have all the variables required for SARIMAX modeling. However, this is where SARIMAX prediction and forecasting get complicated: to forecast the exchange rate, the model needs future values of the exogenous variables, so these must be predicted in advance! We have data up to 2023-11-01 for all the variables and plan to forecast the exchange rate for the coming 10 months, which means we need a way to obtain forecasts of all the exogenous variables!
In this work, we used the ARIMA model to forecast the exogenous variables, which are then used to forecast the exchange rate with the SARIMAX model. However, feel free to gather data from other trusted sources that publish near-term projections for these variables to reduce labor and maybe even improve accuracy!
As we did in the first part of the project, we will use the ARIMA model to predict and forecast the exogenous variables. I wrapped this part of the code in user-defined functions to keep it efficient and organized.
We begin by taking the first difference of each variable.
# Function that calculates the first differences
def first_difference(data):
    data_diff = data.diff().dropna()
    return data_diff
# The function is called for each variable
cpi_us_diff = first_difference(cpi_us_df)
cpi_ind_diff = first_difference(cpi_ind_df)
fund_rate_diff = first_difference(fund_rate_df)
int_rate_ind_diff = first_difference(int_rate_ind_df)
bop_us_diff = first_difference(bop_us_mon_df)
bop_ind_diff = first_difference(bop_ind_mon_df)
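The balance-of-payments series appear to be quarterly (their FRED codes end in Q), while the other series are monthly, and the `_mon_` in `bop_us_mon_df` suggests they were converted to monthly frequency beforehand. That conversion is not shown here, so the following is only one possible way to do it, assuming each quarterly value is carried forward over the months of its quarter. `quarterly_to_monthly` is a hypothetical helper and the values are made up.

```python
import pandas as pd

def quarterly_to_monthly(df):
    """Upsample a quarterly frame to month-start frequency by forward fill."""
    return df.resample("MS").ffill()

# Illustrative quarterly values, not real balance-of-payments data.
quarters = pd.date_range("2014-01-01", periods=3, freq="QS")
bop_q = pd.DataFrame({"BOP_US": [-2.0, -2.3, -2.1]}, index=quarters)

bop_mon = quarterly_to_monthly(bop_q)
print(bop_mon)
```

Note that with forward fill the first difference is zero within each quarter; linear interpolation (`df.resample("MS").interpolate()`) is an alternative if a smoother monthly series is preferred.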
The next crucial step in ARIMA modeling is determining the p and q lags associated with the autoregressive and moving-average parts of the model. We use the autocorrelation function (ACF) and partial autocorrelation function (PACF) to determine q and p, respectively.
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf
from statsmodels.graphics.tsaplots import plot_pacf
# Plot the ACF and PACF of a series on the same axes
def acf_pacf_plots(data, title):
    fig, ax = plt.subplots()
    plot_acf(data, ax=ax, lags=20, label="Auto Correlation")
    plot_pacf(data, ax=ax, lags=20, label="Partial Auto Correlation")
    plt.title(title)
    plt.legend()
    plt.show()
acf_pacf_plots(cpi_us_diff, "Consumer Price Index (US)")
acf_pacf_plots(cpi_ind_diff, "Consumer Price Index (IND)")
acf_pacf_plots(fund_rate_diff, "US Federal Fund Rate")
acf_pacf_plots(int_rate_ind_diff, "Long Term Interest Rate (IND)")
acf_pacf_plots(bop_us_diff, "Balance of Payments (US)")
acf_pacf_plots(bop_ind_diff, "Balance of Payments (IND)")
The figures below give the ACF and PACF graphs for all the exogenous variables and the best p and q parameters chosen from the graphs.
Now, we move to the training of the ARIMA model for each of the variables to predict the test dataset, and then forecast the variables for the next 10 months.
from statsmodels.tsa.arima.model import ARIMA
# Splitting the train-test data
def train_test(data):
    split_index = int(0.8*len(data))
    data_train = data.iloc[:split_index]
    data_test = data.iloc[split_index:]
    return data_train, data_test
# Fit the model for the variable and predict
def arima(data_train, data_test, params):
    start = len(data_train)
    end = len(data_train) + len(data_test) - 1
    arima_model = ARIMA(data_train, order = params)
    arima_fit = arima_model.fit()
    arima_predict = arima_fit.predict(start, end)
    arima_predict.index = data_test.index
    arima_pred_df = pd.DataFrame(arima_predict)
    return arima_pred_df
# Plotting the graphs comparing the predicted data and test data
def pred_graphs(data_train, data_test, data_predict, title):
    plt.title(title)
    plt.plot(data_train, label='Train')
    plt.plot(data_test, label='Test')
    plt.plot(data_predict, label='Predictions')
    plt.xlabel('Date')
    plt.legend()
    plt.show()
Once we train and test the model, we move to the forecasting phase.
# Function to forecast the variables
def forecast(data, title, params):
    # Month-start dates for the 10 months after the last observation (2023-11-01)
    forecast_index = pd.date_range(start = '2023-12-01', periods = 10, freq = 'MS')
    arima_model = ARIMA(data, order = params)
    model_fit = arima_model.fit()
    model_forecast = model_fit.forecast(steps=10)
    model_forecast.index = forecast_index
    # Plotting the forecasted variables
    plt.plot(data, label='Actual')
    plt.plot(model_forecast, label='ARIMA Forecast')
    plt.title(title)
    plt.xlabel('Date')
    plt.legend()
    plt.tight_layout()
    plt.show()
    return model_forecast
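The forecast dates above are hardcoded to start at 2023-12-01. If the data window changes, the start date could instead be derived from the last observation. A minimal sketch, assuming month-start data with a `DatetimeIndex`; `future_month_index` is a hypothetical helper.

```python
import pandas as pd

def future_month_index(data, periods=10):
    """Month-start dates for the `periods` months after the last observation."""
    start = data.index[-1] + pd.offsets.MonthBegin(1)
    return pd.date_range(start=start, periods=periods, freq="MS")

# Tiny stand-in series ending on 2023-11-01.
history = pd.Series(range(3), index=pd.date_range("2023-09-01", periods=3, freq="MS"))
forecast_index = future_month_index(history)
print(forecast_index)
```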
We are all set for the SARIMAX modeling and forecasting of the exchange rate! To start, we first merge all the variables into a single data frame and find the first difference in the exchange rate. Then, we plot the Auto-Correlation and Partial Auto-Correlation functions to determine the lags.
X = pd.concat([cpi_us_diff,cpi_ind_diff,fund_rate_diff,int_rate_ind_diff,bop_us_diff,bop_ind_diff], axis = 1)
X.index = cpi_us_diff.index  # set_index() returns a new frame, so assign the index directly
y = us_ind_df.diff().dropna()
acf_pacf_plots(y,"US Dollar to Indian Rupees Monthly")
split_index = int(0.9*len(X))
X_train = X[:split_index]
y_train = y[:split_index]
X_test = X[split_index:]
y_test = y[split_index:]
Once we have the lag parameters, we can use them to train and test the SARIMAX model and compare the predictions with the test data.
from statsmodels.tsa.statespace.sarimax import SARIMAX
start = len(X_train)
end = len(X_train) + len(X_test) -1
sarimax = SARIMAX(y_train, exog = X_train, order=(9,0,9))
sarimax_fit = sarimax.fit(disp=0)
sarimax_predict = sarimax_fit.predict(start, end, exog = X_test)
sarimax_predict.index = X_test.index
plt.plot(y_train, label='Train')
plt.plot(y_test, label='Test')
plt.plot(sarimax_predict, label='SARIMAX Predictions')
plt.title('USD to INR Diff SARIMAX Prediction')
plt.xlabel('Date')
plt.legend()
plt.show()
perf_metrics(y_test,sarimax_predict,"USD to INR")
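The `perf_metrics` helper called above is not defined in this article. As a rough idea of what such a helper might compute, the sketch below returns the mean absolute error and root mean squared error. `forecast_errors` is a hypothetical stand-in, not the project's function, and the numbers are made up.

```python
import numpy as np

def forecast_errors(actual, predicted):
    """Return MAE and RMSE between two equally long sequences."""
    actual = np.asarray(actual, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    errors = actual - predicted
    return {
        "MAE": float(np.mean(np.abs(errors))),
        "RMSE": float(np.sqrt(np.mean(errors ** 2))),
    }

metrics = forecast_errors([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(metrics)
```

Percentage errors such as MAPE are left out on purpose: the target here is a differenced series with values near zero, where percentage errors become unstable.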
And here we are, the final step of the project, forecasting the INR/USD exchange rate for 10 months using the SARIMAX model.
X_forecast = pd.concat([cpi_us_forecast,cpi_ind_forecast,fund_rate_forecast,int_rate_forecast,bop_us_forecast,bop_ind_forecast], axis =1)
X_forecast.index = cpi_us_forecast.index
sarimax_model = SARIMAX(y, exog = X, order = (9,0,9))
sarimax_model_fit = sarimax_model.fit(disp = 0)
sarimax_forecast = sarimax_model_fit.forecast(steps =10, exog = X_forecast)
sarimax_forecast.index = X_forecast.index
plt.plot(y, label='Historical')
plt.plot(sarimax_forecast, label='SARIMAX Forecast')
plt.title('USD to INR Exchange Rate Diff SARIMAX Forecast')
plt.xlabel('Date')
plt.legend()
plt.show()
The figure above shows the forecasts as differenced values. We therefore convert them back to actual exchange rates by cumulative addition, the inverse of first differencing, starting from the last observed rate.
xchange_rate = us_ind_df['exchange_rate'].iloc[-1] + sarimax_forecast.cumsum()
sarimax_ind_forecasts = pd.concat([us_ind_df['exchange_rate'], xchange_rate], axis = 0)
sarimax_ind_forecasts.to_csv('data/sarimax_ind_forecasts.csv')
plt.plot(us_ind_df, label='Historical')
plt.plot(xchange_rate, label='SARIMAX Forecast')
plt.title('USD to INR Exchange Rate SARIMAX Forecast')
plt.xlabel('Date')
plt.legend()
plt.show()
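The inversion works because each level equals the previous level plus its difference, so a running sum of differences added to a known starting level recovers the series. A small round-trip check on made-up values:

```python
import pandas as pd

# Illustrative exchange-rate levels, not real data.
levels = pd.Series(
    [83.0, 83.2, 83.1, 83.5],
    index=pd.date_range("2023-08-01", periods=4, freq="MS"),
)

diffs = levels.diff().dropna()               # first differences
rebuilt = levels.iloc[0] + diffs.cumsum()    # invert: start level + running sum

print(rebuilt)
```

`rebuilt` matches `levels` from the second observation onward, which is the same operation applied to the forecasted differences above.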
We can see that the INR/USD exchange rate rises steadily throughout the forecast period except for a slight dip in July 2024. Comparing the performance metrics of the three models (ARIMA, SARIMA, and SARIMAX), the SARIMAX model outperforms the others.
Across the three models, the ARIMA and SARIMAX forecasts follow a similar trend, whereas the SARIMA forecast departs from it.
The last part of this project was to develop an interactive Power BI dashboard for each of the currencies. Below is the dashboard for USD to INR conversion rates as forecasted by the three models. We can use different filters to enhance the visual analysis. In the figure, we show the data from January 2023 to October 2023 and the forecasts through March 2024.
The SARIMAX forecast depends on ARIMA forecasts of the exogenous variables, so any error in those forecasts carries over into the exchange rate forecast. This is a severe drawback of this approach. Obtaining forecasted data from other financial sources can help limit this accumulation of errors.
Summary & Conclusion
I would like to summarize the whole project, including the findings from Part I.
- Among all the currencies, the Chinese Yuan appears to be in a strengthening phase against the US Dollar.
- The strengthening phase of the Chinese Yuan is a good opportunity to invest in US exporting companies that sell to China, as they can sell their goods at a lower price there.
- Chinese buyers can get more goods from US exporters for the same amount of Chinese Yuan during the weak dollar phase.
- The US can attract more tourists from China during the weak dollar phase, which can be a boon for the US tourism industry.
- For the other countries and the EU, it is a strong dollar period, and they would gain from exporting goods to the US.
- A US investor investing in a Chinese company will gain; a US investor investing in India, the UK, or the EU will suffer a loss!
The foreign exchange rate between two countries determines the trade relations and investment strategies between the traders and investors in these countries.
While a weak dollar lifts investment, exports, and tourism in the US, it also makes imports costlier and exposes the domestic population to inflation.
References
https://medium.com/@dagorhan20/usd-try-next-30-days-with-sarimax-a11bbb4a7a00
Thank you! Feel free to visit my GitHub page to learn more about this project and connect with me on LinkedIn!