Section 4: Exogenous Variables and ARIMAX

Learning Objectives

By the end of this section, students will be able to:

  • Identify relevant exogenous variables for real estate forecasting
  • Build ARIMAX models with external predictors
  • Handle multiple exogenous variables effectively
  • Validate ARIMAX model performance
  • Generate forecasts incorporating external factors

Introduction

ARIMAX models extend ARIMA by incorporating external variables that influence real estate markets. This section teaches how to identify, include, and model exogenous variables for enhanced forecasting accuracy.

Main Content

Exogenous Variables in Real Estate

Economic Indicators:

  • Interest rates and mortgage rates
  • Employment and unemployment rates
  • GDP growth and economic indicators
  • Inflation and consumer price index

Demographic Factors:

  • Population growth and migration
  • Age distribution and household formation
  • Income levels and distribution
  • Education and skill levels

Market-Specific Variables:

  • Construction costs and permits
  • Inventory levels and absorption rates
  • Foreclosure rates and distressed sales
  • Zoning changes and regulations

External Shocks:

  • Natural disasters and climate events
  • Policy changes and tax reforms
  • Technology disruptions
  • Global economic events

ARIMAX Model Development

Model Specification:

  • ARIMA(p,d,q) + exogenous variables
  • Lag selection for exogenous variables
  • Interaction terms and transformations
  • Seasonal adjustments
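Lag selection can be made concrete with a small helper that builds lagged copies of each exogenous variable. This is a minimal sketch on synthetic data: the function name add_lags and the chosen lags (3 and 6 months for interest rates, 0 for unemployment) are illustrative assumptions, not values from this section.

```python
import numpy as np
import pandas as pd


def add_lags(exog: pd.DataFrame, lags: dict) -> pd.DataFrame:
    """Return one column per (variable, lag) pair.

    A lag of k means the value observed k periods earlier;
    lag 0 is the contemporaneous value.
    """
    columns = {}
    for name, ks in lags.items():
        for k in ks:
            columns[f"{name}_lag{k}"] = exog[name].shift(k)
    # The first rows have no history for the longest lag, so drop them.
    return pd.DataFrame(columns, index=exog.index).dropna()


# Synthetic monthly data, for illustration only.
idx = pd.date_range("2020-01-01", periods=24, freq="MS")
rng = np.random.default_rng(0)
exog = pd.DataFrame(
    {
        "interest_rate": 5 + rng.normal(0, 0.1, 24).cumsum(),
        "unemployment_rate": 4 + rng.normal(0, 0.1, 24).cumsum(),
    },
    index=idx,
)

lagged = add_lags(exog, {"interest_rate": [3, 6], "unemployment_rate": [0]})
print(lagged.head())
```

Because the longest lag removes the earliest rows, the target series has to be aligned to the same dates (for example prices.loc[lagged.index]) before fitting.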

Variable Selection:

  • Correlation analysis with target variable
  • Granger causality tests
  • Stepwise selection procedures
  • Economic theory guidance
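A first-pass correlation screen might look like the sketch below. It correlates changes in the target with changes in each candidate at several lags, since correlations between trending levels are often spurious. The helper name screen_by_correlation, the synthetic data, and the built-in 2-month lag are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def screen_by_correlation(target, candidates, max_lag=6):
    """For each candidate, find the lag (0..max_lag) whose differenced
    values correlate most strongly with the differenced target."""
    d_target = target.diff()
    rows = []
    for name in candidates.columns:
        d_x = candidates[name].diff()
        # Series.corr ignores rows where either side is missing.
        corrs = {k: d_target.corr(d_x.shift(k)) for k in range(max_lag + 1)}
        best = max(corrs, key=lambda k: abs(corrs[k]))
        rows.append({"variable": name, "best_lag": best, "corr": corrs[best]})
    return pd.DataFrame(rows).set_index("variable")


# Synthetic data: price changes follow rate changes with a 2-month delay.
rng = np.random.default_rng(1)
n = 200
idx = pd.date_range("2008-01-01", periods=n, freq="MS")
d_rate = rng.normal(0, 0.25, n)
d_price = -0.8 * np.roll(d_rate, 2) + rng.normal(0, 0.02, n)
d_price[:2] = 0.0  # discard the values that np.roll wrapped around

prices = pd.Series(100 + d_price.cumsum(), index=idx, name="price_index")
candidates = pd.DataFrame(
    {
        "interest_rate": 5 + d_rate.cumsum(),
        "unrelated": 50 + rng.normal(0, 1.0, n).cumsum(),
    },
    index=idx,
)

result = screen_by_correlation(prices, candidates)
print(result)
```

A screen like this only ranks candidates. Economic theory and out-of-sample checks still decide what enters the model.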

Model Validation:

  • Out-of-sample testing
  • Cross-validation techniques
  • Residual analysis
  • Forecast accuracy metrics
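The accuracy metrics and the time-ordered cross-validation mentioned above can be sketched in a few lines. The function names and the toy numbers are illustrative assumptions. The split generator uses an expanding window, one common choice for time series because the training data always precedes the test data.

```python
import numpy as np


def rmse(actual, predicted):
    """Root mean squared error, in the units of the target."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)


def expanding_window_splits(n_obs, initial_train, horizon, step=1):
    """Yield (train_idx, test_idx) pairs where training precedes testing."""
    start = initial_train
    while start + horizon <= n_obs:
        yield np.arange(0, start), np.arange(start, start + horizon)
        start += step


# Toy values, for illustration only.
actual = [200.0, 210.0, 220.0]
predicted = [198.0, 213.0, 220.0]
print(f"RMSE: {rmse(actual, predicted):.3f}")
print(f"MAPE: {mape(actual, predicted):.3f}%")

splits = list(expanding_window_splits(n_obs=10, initial_train=6, horizon=2, step=2))
for train_idx, test_idx in splits:
    print(f"train 0..{train_idx[-1]}, test {test_idx[0]}..{test_idx[-1]}")
```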

Example: Residential Price Forecasting with ARIMAX

This example builds an ARIMAX model that forecasts a residential price index using four economic variables as external predictors.

Data Preparation:

import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.stattools import grangercausalitytests

# Load data
df = pd.read_csv('real_estate_data.csv', parse_dates=['date'], index_col='date')

# Target variable
prices = df['price_index']

# Exogenous variables
exog_vars = df[['interest_rate', 'unemployment_rate', 'gdp_growth', 'population_growth']]

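# Test whether each variable Granger-causes prices.
# grangercausalitytests checks whether the SECOND column helps predict the FIRST.
# The test assumes stationary inputs, so both series are first-differenced.
for var in exog_vars.columns:
    test_data = pd.concat([prices, exog_vars[var]], axis=1).diff().dropna()
    # verbose=False suppresses the printed tables; recent statsmodels
    # releases may emit a deprecation warning for this argument.
    gc_result = grangercausalitytests(test_data, maxlag=4, verbose=False)
    # Report the F-test p-value at every tested lag, not only lag 1
    for lag, (tests, _) in gc_result.items():
        print(f"{var} (lag {lag}): p-value = {tests['ssr_ftest'][1]:.4f}")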

ARIMAX Model Fitting:

# Fit ARIMAX model
arimax_model = ARIMA(prices, exog=exog_vars, order=(2, 1, 2))
fitted_model = arimax_model.fit()

# Model summary
print(fitted_model.summary())

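# Coefficient interpretation
# statsmodels names each exogenous coefficient after its DataFrame column.
# Each coefficient is the estimated change in the price index for a one-unit
# change in that variable, holding the other variables fixed.
print("\nExogenous Variable Coefficients:")
for var in exog_vars.columns:
    coef = fitted_model.params[var]
    print(f"{var}: {coef:.4f}")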

Forecasting with Exogenous Variables:

import matplotlib.pyplot as plt

# Prepare future exogenous variables.
# Placeholder only: the last 12 observed months stand in for future values.
# In practice, supply forecasts or scenario values for each variable.
future_exog = exog_vars.iloc[-12:].to_numpy()

# Generate point forecasts and 95% confidence intervals from a single call
forecast_result = fitted_model.get_forecast(steps=12, exog=future_exog)
forecast = forecast_result.predicted_mean
conf_int = forecast_result.conf_int()

# Plot results
plt.figure(figsize=(12, 6))
plt.plot(prices.index[-24:], prices.values[-24:], label='Historical')
plt.plot(forecast.index, forecast.values, label='ARIMAX Forecast')
plt.fill_between(conf_int.index, conf_int.iloc[:, 0], conf_int.iloc[:, 1],
                 alpha=0.3, label='95% Confidence Interval')
plt.title('Residential Price Forecast with Economic Variables')
plt.legend()
plt.show()

Model Performance:

  • RMSE: $2,500 (vs $3,200 for ARIMA)
  • MAPE: 2.1% (vs 2.8% for ARIMA)
  • R-squared: 0.89 (vs 0.82 for ARIMA)

Practice Exercise

Build an ARIMAX model for commercial property rents:

  1. Identify relevant exogenous variables
  2. Test for Granger causality
  3. Fit ARIMAX model with selected variables
  4. Compare performance to ARIMA baseline

Assets

  • Exogenous variable selection guides
  • ARIMAX implementation templates
  • Granger causality testing tools

Summary

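ARIMAX models can improve real estate forecasts by incorporating external factors, provided those factors carry predictive information and their future values are known or can be forecast. Careful variable selection and out-of-sample validation keep the resulting predictions reliable and grounded in economic context.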

Next Steps

The next section covers forecast evaluation and backtesting for real estate models.


© 2025 Prof. Tim Frenzel. All rights reserved. | Version 1.0.5