Section 1: Time Series Fundamentals, Decomposition & Evaluation
Learning Objectives
By the end of this section, students will be able to:
- Identify non-stationarity, autocorrelation, and heteroscedasticity in real estate time series
- Decompose property data into trend, seasonal, cyclical, and irregular components using Python
- Build baseline forecasts by projecting trend and seasonal patterns forward
- Evaluate forecast accuracy using MAE, RMSE, and MAPE metrics
- Implement rolling-origin backtesting to validate model reliability across time
- Apply a complete evaluation framework to any forecasting model
Why Time Series Analysis Matters for Real Estate
Real estate markets move through time with recognizable patterns. Property values follow long-term trends shaped by economic growth, population shifts, and urban development. Seasonal patterns emerge as buyers prefer spring and summer transactions. Market cycles span 4-7 years, driven by interest rates, credit availability, and investor sentiment.
Understanding these patterns allows investors to time acquisitions, developers to plan project schedules, and analysts to generate reliable forecasts. The key question is: How can you separate signal from noise in real estate data?
This section teaches a complete forecasting workflow: understand the data, build a model, and evaluate its performance. You will learn to decompose time series into components, generate forecasts, and measure accuracy using industry-standard metrics.
Time Series Components in Real Estate Markets
Every real estate time series contains four distinct components. Separating these components reveals the underlying structure of market behavior.
Trend Component
The trend represents the long-term direction of property values. In most markets, real estate prices increase over time due to inflation, economic growth, and land scarcity. However, trends can be negative during prolonged recessions or population decline.
Key characteristics:
- Moves slowly compared to seasonal patterns
- Reflects fundamental economic drivers (GDP, employment, population)
- Can change direction at major turning points
- May be linear or nonlinear
Example: A metropolitan area with steady job growth shows a consistent upward trend in commercial rents over a decade, interrupted only by the 2008-2009 recession.
Seasonal Component
Seasonality creates regular patterns within each year. Residential real estate shows strong seasonal effects, with transaction volumes and prices peaking in late spring and summer. Commercial real estate exhibits weaker but still visible seasonal patterns.
Key characteristics:
- Repeats annually with similar magnitude
- Driven by school schedules, weather, and holidays
- Stronger in residential than commercial markets
- Can be modeled and removed for clearer trend analysis
Example: Single-family home prices in suburban markets consistently rise 3-5% from January to June, then flatten or decline slightly through winter months.
Cyclical Component
Cycles span multiple years and reflect business cycles, credit cycles, and supply-demand imbalances. Real estate cycles typically last 4-7 years from trough to trough, longer than seasonal patterns but shorter than multi-decade trends.
Key characteristics:
- Irregular duration (unlike fixed annual seasonality)
- Driven by interest rates, credit availability, and investor sentiment
- Synchronized with broader economic cycles
- Difficult to predict timing but recognizable in retrospect
Example: The 2003-2007 housing boom followed by the 2008-2012 bust represents a complete real estate cycle, driven by loose credit standards and subsequent tightening.
Irregular Component
The irregular component (also called residual or noise) captures random fluctuations and one-time events. This includes measurement errors, natural disasters, policy shocks, and other unpredictable factors.
Key characteristics:
- No recognizable pattern
- Cannot be forecasted
- Should be small relative to other components
- Large residuals indicate missing explanatory variables
Example: A sudden spike in apartment vacancies due to a major employer relocating appears as an irregular shock in the rental price series.
Real Estate Time Series Characteristics
Real estate data exhibits three statistical properties that complicate forecasting. Understanding these characteristics helps you choose appropriate models and transformations.
Non-Stationarity
What it is: Statistical properties (mean, variance, autocorrelation) change over time rather than remaining constant.
Real estate pattern: Property values trend upward persistently. Variance increases during boom periods and contracts during downturns.
Example: Home prices rise from $200,000 to $400,000 over 20 years (non-stationary). Year-over-year changes fluctuate around 3-5% with stable variance (stationary after differencing).
Why it matters: Non-stationary series produce spurious correlations and unreliable forecasts. Most models assume stationarity. Transform data through differencing or detrending before modeling.
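The effect of differencing can be checked directly. Below is a minimal sketch using synthetic monthly prices (the drift and noise parameters are illustrative assumptions, not real market data): the price level has a drifting mean, while log-differenced returns do not.

```python
import numpy as np

# Synthetic non-stationary price level: exponential trend plus accumulated noise
rng = np.random.default_rng(42)
months = 240
prices = 200_000 * np.exp(0.003 * np.arange(months) + rng.normal(0, 0.01, months).cumsum())

# Log-differencing converts the trending level into (approximately) stationary returns
log_returns = np.diff(np.log(prices))

# The level's mean drifts between halves; the differenced series' mean does not
print(f"Level mean, first half:  {prices[:120].mean():,.0f}")
print(f"Level mean, second half: {prices[120:].mean():,.0f}")
print(f"Return mean, first half:  {log_returns[:119].mean():.4f}")
print(f"Return mean, second half: {log_returns[119:].mean():.4f}")
```

The same stabilization appears with simple differencing of year-over-year percentage changes, as in the home-price example above.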
Autocorrelation
What it is: Current values depend on past values. Today’s price is correlated with yesterday’s price.
Real estate pattern: Strong positive autocorrelation at short lags (1-3 months) due to market momentum, sticky prices, and slow information diffusion. Properties that appreciate one month tend to appreciate the next month.
Example: If office rents increased 2% last quarter, they are more likely to increase again this quarter than to decline, even controlling for economic fundamentals.
Why it matters: Autocorrelation provides forecasting power but violates standard regression assumptions. Time series models like ARIMA explicitly capture and model this autocorrelation structure.
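This momentum can be quantified with the sample autocorrelation at lag 1. The sketch below simulates monthly price changes where each month partly carries over the previous month's change (the 0.7 carry-over coefficient is an illustrative assumption, not an empirical estimate):

```python
import numpy as np

def lag_autocorr(x, lag=1):
    """Sample autocorrelation of x at the given lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

# Simulate momentum: this month's change partly carries over last month's change
rng = np.random.default_rng(0)
changes = np.zeros(300)
for t in range(1, 300):
    changes[t] = 0.7 * changes[t - 1] + rng.normal(0, 1)

print(f"Lag-1 autocorrelation: {lag_autocorr(changes, 1):.2f}")
```

A value near 0.7 confirms the persistence; on real data, statsmodels' `plot_acf` shows the same quantity across many lags at once.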
Heteroscedasticity
What it is: Variance changes over time. Markets alternate between calm periods (low volatility) and turbulent periods (high volatility).
Real estate pattern: Volatility clusters together. High-volatility periods follow high-volatility periods. Calm markets stay calm, then suddenly shift to turbulent periods.
Example: During 2000-2006, monthly home price changes had standard deviation of 0.5%. During 2008-2010, standard deviation increased to 2.5%, indicating heightened uncertainty.
Why it matters: Standard models assume constant variance. Heteroscedasticity leads to inefficient estimates and incorrect confidence intervals. Advanced models like GARCH explicitly model changing variance.
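Volatility clustering can be made visible with a rolling standard deviation. This sketch simulates a calm regime followed by a turbulent one, with volatilities loosely modeled on the 2000-2006 and 2008-2010 figures above (the series itself is synthetic, not real data):

```python
import numpy as np
import pandas as pd

# Two regimes: calm market (sd ~0.5%) then turbulent market (sd ~2.5%)
rng = np.random.default_rng(1)
calm = rng.normal(0.003, 0.005, 72)        # illustrative "2000-2006" returns
turbulent = rng.normal(-0.002, 0.025, 36)  # illustrative "2008-2010" returns
returns = pd.Series(np.concatenate([calm, turbulent]))

# A 12-month rolling standard deviation makes the regime shift visible
rolling_vol = returns.rolling(12).std()
print(f"Rolling vol at end of calm regime:      {rolling_vol.iloc[71]:.4f}")
print(f"Rolling vol at end of turbulent regime: {rolling_vol.iloc[-1]:.4f}")
```

Plotting `rolling_vol` over time shows the variance jump that a constant-variance model would miss.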
Time Series Decomposition: Separating Components
Decomposition breaks a time series into trend, seasonal, and residual components. This reveals underlying patterns and enables simple forecasting by projecting each component forward.
Additive vs Multiplicative Decomposition
Two decomposition models exist:
Additive model: Assumes components add together.
\[Y_t = T_t + S_t + R_t\]
where: - \(Y_t\) = observed value at time \(t\) - \(T_t\) = trend-cycle component at time \(t\) - \(S_t\) = seasonal component at time \(t\) - \(R_t\) = residual (irregular) component at time \(t\)
Multiplicative model: Assumes components multiply together.
\[Y_t = T_t \times S_t \times R_t\]
When to use each model:
- Use additive when seasonal fluctuations remain constant regardless of trend level
- Use multiplicative when seasonal fluctuations grow proportionally with trend level
- Real estate typically uses multiplicative because higher price levels produce larger absolute seasonal swings
Example: A $200,000 home with 5% seasonal variation swings ±$10,000. A $400,000 home with 5% seasonal variation swings ±$20,000. Seasonal magnitude grows with price level. This growth pattern indicates multiplicative decomposition is appropriate.
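The same numbers can be worked through in code. This sketch contrasts how a hypothetical 5% multiplicative June factor and a fixed $10,000 additive June effect behave at two trend levels:

```python
# Hypothetical June seasonal effect at two trend levels (illustrative numbers)
trend_low, trend_high = 200_000, 400_000

# Multiplicative: a 1.05 June factor scales with the trend level
mult_low = trend_low * 1.05    # +10,000 swing at the low level
mult_high = trend_high * 1.05  # +20,000 swing at the high level

# Additive: a fixed +10,000 June effect ignores the trend level
add_low = trend_low + 10_000
add_high = trend_high + 10_000  # swing stays +10,000 -- it did not grow

print(f"Multiplicative swing at high level: {mult_high - trend_high:,.0f}")
print(f"Additive swing at high level:       {add_high - trend_high:,.0f}")
```

Because the multiplicative swing doubles when the trend doubles, it matches the growing seasonal magnitude seen in rising real estate markets.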
Decomposition in Python
Python automates decomposition with the seasonal_decompose() function. Load your data, apply the decomposition, and extract the components:
from statsmodels.tsa.seasonal import seasonal_decompose
import pandas as pd
# Load and decompose
df = pd.read_csv('case_shiller_index.csv', parse_dates=['date'], index_col='date')
decomposition = seasonal_decompose(df['price_index'], model='multiplicative', period=12)
# Extract components
trend = decomposition.trend
seasonal = decomposition.seasonal
residual = decomposition.resid
The decomposition separates your series into three components: trend (long-term direction), seasonal (repeating patterns), and residual (random variation). Use decomposition.plot() to visualize all components at once.
Building Forecasts from Decomposition
Decomposition provides a simple forecasting method: project each component forward and recombine them. This establishes the forecasting workflow you will use throughout the module.
Three-step process:
- Project the trend: Use linear extrapolation to extend the trend component forward
- Repeat seasonal pattern: Apply the 12-month seasonal indices to future periods
- Combine components: Multiply trend and seasonal forecasts (multiplicative) or add them (additive)
from scipy import stats
import numpy as np
# 1. Project trend forward using linear regression
recent_trend = trend.dropna().iloc[-24:]
slope, intercept = stats.linregress(np.arange(len(recent_trend)), recent_trend)[:2]
trend_forecast = slope * np.arange(24, 36) + intercept
# 2. Repeat the 12-month seasonal pattern (increase the reps argument of np.tile for longer horizons)
seasonal_forecast = np.tile(seasonal.iloc[:12].values, 1)
# 3. Combine (multiplicative model)
forecast = trend_forecast * seasonal_forecast
This baseline forecast sets the performance benchmark. More sophisticated models like ARIMA and Prophet should outperform this simple approach.
Forecast Evaluation Framework
Building a forecast is only half the job. Evaluating forecast accuracy determines whether your model is useful for decision-making. Without rigorous evaluation, you cannot know if your forecasts are better than simple guesses.
This evaluation framework applies to all forecasting models: decomposition, ARIMA, Prophet, machine learning, and judgmental forecasts. Establishing it now means you assess every model consistently throughout the module.
Why Forecast Evaluation Matters
Real estate decisions depend on forecast quality. An acquisition underwriting model that uses optimistic rent forecasts may indicate a profitable investment when reality yields losses. A development pro forma with inaccurate absorption rate forecasts can lead to overbuilding and financial distress.
Key questions evaluation answers:
- How accurate are my forecasts on average?
- Do forecasts systematically over-predict or under-predict?
- Are forecasts more accurate for near-term or long-term horizons?
- Does this model outperform simpler alternatives?
Train-Test Split Methodology
Evaluation requires holdout data: data not used to build the model. You compare forecasts against these known-but-unused values to simulate real-world forecasting performance.
Standard approach:
- Split data into training set (typically 80%) and test set (typically 20%)
- Build model using only training data
- Generate forecasts for the test period
- Calculate accuracy metrics by comparing forecasts to actual test values
Important: Never train a model on the full dataset then evaluate on the same data. This produces artificially optimistic metrics that do not reflect real-world performance.
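A chronological split can be sketched as follows, using a hypothetical 10-year monthly index (the key point is slicing by position, never shuffling):

```python
import numpy as np
import pandas as pd

# Hypothetical monthly index: 10 years of observations (illustrative values)
dates = pd.date_range('2015-01-01', periods=120, freq='MS')
df = pd.DataFrame({'price_index': np.linspace(100, 180, 120)}, index=dates)

# Chronological split -- never shuffle time series before splitting
split = int(len(df) * 0.8)
train_data = df.iloc[:split]  # first 80%: 2015-01 through 2022-12
test_data = df.iloc[split:]   # last 20%: 2023-01 through 2024-12

print(len(train_data), len(test_data))                 # 96 24
print(train_data.index.max() < test_data.index.min())  # True
```

Every training date precedes every test date, so the evaluation simulates forecasting into a genuinely unseen future.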
Forecast Accuracy Metrics
Three metrics form the industry standard for forecast evaluation. Each measures accuracy differently and provides complementary information.
Mean Absolute Error (MAE)
Measures average absolute forecast error in original units. Treats all errors equally.
Interpretation: MAE = 5.0 means average forecast error is 5.0 index points.
Use when: You want intuitive, easy-to-explain accuracy in original units. Less sensitive to outliers.
Root Mean Squared Error (RMSE)
Measures average forecast error with more weight on large errors due to squaring.
Interpretation: RMSE = 7.0 means root-mean-squared error is 7.0 index points. Always ≥ MAE.
Use when: Large errors are costly. Common in academic research. RMSE/MAE ratio reveals outlier presence.
Mean Absolute Percentage Error (MAPE)
Measures average forecast error as percentage of actual values. Scale-independent.
Interpretation: MAPE = 2.5% means forecasts miss by 2.5% on average.
Use when: Comparing accuracy across different series. Industry standard. Fails when actual values equal zero.
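In symbols, with actual value \(y_t\), forecast \(\hat{y}_t\), and \(n\) test periods, the three metrics are:

\[\text{MAE} = \frac{1}{n}\sum_{t=1}^{n} \left|y_t - \hat{y}_t\right|\]

\[\text{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n} \left(y_t - \hat{y}_t\right)^2}\]

\[\text{MAPE} = \frac{100}{n}\sum_{t=1}^{n} \left|\frac{y_t - \hat{y}_t}{y_t}\right|\]

The squaring inside RMSE is what weights large errors more heavily and guarantees RMSE ≥ MAE; the division by \(y_t\) inside MAPE is what makes it scale-independent and what breaks down when actual values approach zero.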
Forecast Horizon Considerations
Forecast accuracy typically declines as the forecast horizon (how far ahead you forecast) increases. Near-term forecasts (1-3 months) tend to be more accurate than long-term forecasts (12-24 months).
Always report metrics separately for different horizons:
- 1-month ahead: Most accurate, useful for short-term tactical decisions
- 3-month ahead: Moderate accuracy, useful for quarterly planning
- 12-month ahead: Lower accuracy, useful for annual budgeting
- 24-month ahead: Lowest accuracy, useful for strategic planning
Evaluating Forecasts in Python
Implement the evaluation framework in Python with a reusable function:
import numpy as np
def calculate_metrics(actual, forecast):
    """Calculate MAE, RMSE, and MAPE."""
    errors = actual - forecast
    mae = np.mean(np.abs(errors))
    rmse = np.sqrt(np.mean(errors ** 2))
    mape = np.mean(np.abs(errors / actual) * 100)
    return {'MAE': mae, 'RMSE': rmse, 'MAPE': mape}
Apply to Test Data
# Build forecast on training data, evaluate on test data
train_decomp = seasonal_decompose(train_data['price_index'], model='multiplicative', period=12)
# Project components forward
trend_forecast = extrapolate_trend(train_decomp.trend) # Your linear regression code
seasonal_forecast = np.tile(train_decomp.seasonal.iloc[:12], len(test_data) // 12 + 1)[:len(test_data)]
forecast = trend_forecast * seasonal_forecast
# Evaluate
metrics = calculate_metrics(test_data['price_index'].values, forecast)
print(f"MAE: {metrics['MAE']:.2f}, RMSE: {metrics['RMSE']:.2f}, MAPE: {metrics['MAPE']:.1f}%")
Visualize Results
import matplotlib.pyplot as plt
plt.figure(figsize=(12, 5))
plt.plot(test_data.index, test_data['price_index'], label='Actual', linewidth=2)
plt.plot(test_data.index, forecast, label='Forecast', linestyle='--', linewidth=2)
plt.axvline(x=test_data.index[0], color='gray', linestyle=':', alpha=0.5)
plt.legend()
plt.title('Forecast vs Actual')
plt.show()
Check that residuals (actual - forecast) fluctuate randomly around zero with no patterns. Trends indicate model bias. Plot residuals over time and create a histogram to assess distribution symmetry.
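A minimal numeric version of these residual checks, using hypothetical residuals in place of actual - forecast from your own model:

```python
import numpy as np

# Hypothetical residuals -- substitute (actual - forecast) from your own model
rng = np.random.default_rng(7)
residuals = rng.normal(0, 2.0, 24)

# Bias check: the mean residual should be close to zero
mean_resid = residuals.mean()

# Pattern check: lag-1 autocorrelation near zero suggests no leftover structure
centered = residuals - mean_resid
lag1 = np.dot(centered[:-1], centered[1:]) / np.dot(centered, centered)

print(f"Mean residual: {mean_resid:.2f}")
print(f"Lag-1 autocorrelation: {lag1:.2f}")
```

A strongly positive lag-1 autocorrelation would indicate the model missed persistent structure, such as an underfit trend.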
Basic Backtesting: Simulating Real-World Forecasting
The train-test split evaluates forecast accuracy at one point in time. Backtesting extends evaluation across multiple time periods to assess consistency and reliability.
Rolling-Origin Evaluation
Rolling-origin evaluation (also called rolling window backtesting) repeatedly simulates the forecasting process:
1. Train the model on data up to time \(t\)
2. Forecast period \(t+1\)
3. Advance time by one period
4. Repeat steps 1-3 for multiple time points
This produces multiple forecast accuracy measurements. These measurements reveal whether performance is stable or varies across market conditions.
Example: Train on 2010-2019 data, forecast 2020 Q1. Then train on 2010-2020 Q1, forecast 2020 Q2. Continue through all quarters.
Python Implementation
def rolling_backtest(data, train_window, forecast_horizon, model_func):
    """Perform rolling-origin forecast evaluation."""
    results = []
    for i in range(train_window, len(data) - forecast_horizon):
        train = data.iloc[:i]
        actual = data.iloc[i:i + forecast_horizon]
        forecast = model_func(train, forecast_horizon)
        metrics = calculate_metrics(actual.values, forecast)
        results.append(metrics)
    return pd.DataFrame(results)
# Run backtesting
results = rolling_backtest(df['price_index'], train_window=60, forecast_horizon=1,
model_func=your_forecast_function)
print(f"Average MAPE: {results['MAPE'].mean():.1f}% ± {results['MAPE'].std():.1f}%")
Plot MAE and MAPE over time to identify periods where forecasting was difficult. Stable metrics indicate consistent performance. Spikes reveal challenging periods like recessions or policy shocks.
When Backtesting Matters
Backtesting is critical for:
- Production forecasting systems: Assess reliability before deployment
- Model comparison: Determine which model performs most consistently
- Market regime analysis: Identify if models fail during specific market conditions
- Confidence intervals: Estimate forecast uncertainty based on historical performance
For academic exercises, a single train-test split suffices. For real-world deployment, backtesting is mandatory.
Complete Forecasting Cycle: Integration and Workflow
You now possess a complete forecasting workflow applicable to any model:
- Understand the data: Identify trend, seasonal, and cyclical components through decomposition
- Build a model: Project components forward to generate forecasts
- Evaluate performance: Calculate MAE, RMSE, and MAPE on holdout data
- Validate reliability: Use backtesting to assess consistency across time
This cycle repeats throughout the module. When you learn ARIMA models, you will decompose data to understand it, build ARIMA forecasts, evaluate with the same metrics, and backtest for reliability. When you learn Prophet, you will follow the same workflow.
Common Pitfalls in Time Series Forecasting
These pitfalls destroy forecast reliability. Master these to build production-quality models.
1. Training on Full Dataset
Never evaluate on training data. This produces artificially optimistic metrics. Always hold out test data.
2. Ignoring Non-Stationarity
Forecasting non-stationary series without transformation produces unreliable results. Check stationarity first, then difference or detrend.
3. Overfitting to Noise
Complex models fit training data perfectly but fail on new data. Simpler models often forecast better.
4. Using MAPE with Near-Zero Values
MAPE becomes unstable when actual values approach zero. Use MAE or RMSE for series with small values.
5. Extrapolating Trends Too Far
Linear extrapolation works for short horizons (1-3 months) but fails beyond 12 months. Real estate trends reverse at cycle turning points.
6. Ignoring External Factors
Decomposition assumes the past predicts the future. Major shocks (policy changes, pandemics, crises) break historical patterns. Use ARIMAX to incorporate external variables.
7. Focusing Only on Point Forecasts
Decision-makers need uncertainty, not just point estimates. Always provide confidence intervals or scenario ranges.
Applying the Framework to Real Estate Decisions
How does this forecasting cycle improve real estate decision-making?
Acquisition analysis: An investor who evaluates a residential portfolio generates 12-month rent forecasts with decomposition methods. Evaluation shows MAPE of 3.2%, which indicates reliable forecasts. Backtesting confirms stable performance across different market conditions. The investor uses these forecasts in a discounted cash flow model to determine fair acquisition price.
Development timing: A developer planning a multifamily project needs 18-month construction cost forecasts. Decomposition reveals strong upward trend in materials costs. Evaluation shows declining accuracy beyond 12 months (MAPE increases from 2.5% to 4.8%). The developer decides to lock in material contracts early rather than risking higher future costs.
Portfolio strategy: A REIT compares decomposition forecasts against analyst consensus forecasts for net operating income. Evaluation shows decomposition achieves significantly lower MAPE (2.1% vs 3.5%) and more stable performance across market conditions. The REIT adjusts internal forecasts to incorporate decomposition outputs. This adjustment improves capital allocation decisions.
Market entry: An investor considers entering a new metropolitan market. Backtesting reveals decomposition forecasts had MAPE over 8% during the 2008-2010 period but under 3% in stable periods. The investor recognizes forecast uncertainty increases during market stress and adjusts risk premiums accordingly.
These applications demonstrate why evaluation matters. Forecasts without evaluation metrics are guesses. Forecasts with rigorous evaluation become decision tools.
Practice with Real Market Data
Apply the techniques from this section to real Bloomberg time series data. The following datasets contain approximately 20 years (2005-2025) of monthly real estate market data, including price indices, market activity indicators, and financial market variables. Each dataset provides 240+ observations with complete time series and no missing values. Your objective is to build a forecasting model that achieves the lowest possible RMSE on the test set.
Select one dataset to start your forecasting practice:
Price Indices Dataset
Contains Case-Shiller Home Price Index and CPI-Shelter data for trend and cycle analysis.
Download Price Indices Data
Market Activity Dataset
Contains New Home Sales, Construction Spending, and Construction Employment data for seasonal pattern analysis.
Download Market Activity Data
Financial Markets Dataset
Contains 30-Year Mortgage Rates and REIT Index data for exogenous variable modeling.
Download Financial Markets Data
© 2025 Prof. Tim Frenzel. All rights reserved. | Version 1.0.5