Time Series Chapter 4 : Resampling, Frequency Conversion & Rolling Statistics

In the last lecture, we learned how to clean and prepare our time series data: fixing missing dates, removing duplicates, handling outliers, and making sure everything was consistent.

Now that our dataset is reliable, it's time to explore it more deeply. Real insights often come not from the raw data itself, but from how we view it across different time scales — daily, weekly, or monthly.

In this lecture, we'll learn how to resample and aggregate our time series to different frequencies, and how to use rolling statistics like moving averages to smooth noise and reveal clearer patterns.

These techniques will help us uncover long-term trends, measure stability, and prepare features that will be essential later for forecasting and project work.

Resampling & Frequency Conversion

Now that our data is clean and complete, one of the most powerful things we can do is change its time frequency. This process is called resampling.

Resampling lets us look at the same data from different time perspectives. For example:

You can downsample daily data to weekly or monthly averages to see long-term trends.
Or upsample daily data to hourly to align with other datasets.
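As a rough illustration of the two directions, the sketch below uses a hypothetical 14-day toy series rather than real data. The names s, weekly, and half_daily are placeholders, and pd.Timedelta(hours=12) is passed as the resampling rule only so the example does not depend on how a particular pandas version spells the hour alias.

```python
import numpy as np
import pandas as pd

# Hypothetical toy series: 14 daily values covering two Monday-to-Sunday weeks
idx = pd.date_range("2023-01-02", periods=14, freq="D")  # 2023-01-02 is a Monday
s = pd.Series(np.arange(14, dtype=float), index=idx)

# Downsampling: 14 daily points collapse into 2 weekly means
# ('W' weeks end on Sunday by default)
weekly = s.resample("W").mean()

# Upsampling: 14 daily points spread over 27 twelve-hour slots;
# the newly created slots hold NaN until we fill them
half_daily = s.resample(pd.Timedelta(hours=12)).asfreq()

print(len(s), len(weekly), len(half_daily))
print("Empty slots after upsampling:", int(half_daily.isna().sum()))
```

Downsampling always needs an aggregation (here the mean), while upsampling always needs a filling strategy, which is why the two cases are treated separately in this lecture.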

Creating a Simple Daily Temperature Dataset

We'll simulate one year of daily temperature readings with a smooth seasonal pattern and a bit of random noise.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Create a full year of daily dates
dates = pd.date_range(start='2023-01-01', end='2023-12-31', freq='D')

# Fix the random seed so the noise (and the plots) are reproducible
np.random.seed(42)

# Simulate temperature with one full yearly sine cycle (0 to 2π) + random noise
temps = 15 + 10 * np.sin(np.linspace(0, 2 * np.pi, len(dates))) + np.random.normal(0, 1.5, len(dates))

# Create the DataFrame
data = pd.DataFrame({'Date': dates, 'Temp': temps})
data = data.set_index('Date')

# Quick preview
print(data.head())

This produces realistic-looking daily temperatures between roughly 5 °C and 25 °C (slightly beyond at the extremes because of the noise), perfect for testing resampling methods.

Downsampling : Daily → Weekly → Monthly

Downsampling means reducing the number of data points by grouping values into larger time buckets (weeks, months, quarters, etc.) and applying an aggregation like mean or sum.

# Weekly and monthly averages
# Note: pandas 2.2+ deprecates the 'M' alias in favor of 'ME' (month end)
weekly_data = data['Temp'].resample('W').mean()
monthly_data = data['Temp'].resample('M').mean()

# Plot the difference
plt.figure(figsize=(12,5))
plt.plot(data.index, data['Temp'], label='Daily Data', alpha=0.4)
plt.plot(weekly_data.index, weekly_data, label='Weekly Average', color='orange')
plt.plot(monthly_data.index, monthly_data, label='Monthly Average', color='red')
plt.title('Downsampling: Daily → Weekly → Monthly')
plt.xlabel('Date'); plt.ylabel('Temperature (°C)')
plt.legend()
plt.show()

Explanation:

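resample('W') groups the data by calendar week (ending on Sunday by default), and resample('M') groups it by calendar month, labeling each group with the month's last day.
.mean() calculates the average within each group.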
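The mean is only one possible aggregation. As a minimal sketch, assuming a simulated series similar to the one above (rebuilt here so the block runs on its own), several statistics can be computed per month in one call with .agg(). The 'MS' (month start) alias is used instead of 'M' purely as an assumption for portability across pandas versions; it labels each group with the first day of the month rather than the last.

```python
import numpy as np
import pandas as pd

# Rebuild a simulated daily temperature series (illustrative values only)
rng = np.random.default_rng(0)
dates = pd.date_range("2023-01-01", "2023-12-31", freq="D")
seasonal = 15 + 10 * np.sin(np.linspace(0, 2 * np.pi, len(dates)))
temp = pd.Series(seasonal + rng.normal(0, 1.5, len(dates)), index=dates, name="Temp")

# One row per month, one column per aggregation
monthly_stats = temp.resample("MS").agg(["mean", "min", "max"])

# Spread between the warmest and coldest day of each month
monthly_stats["range"] = monthly_stats["max"] - monthly_stats["min"]

print(monthly_stats.round(2))
```

Which aggregation fits depends on the quantity: averages suit temperatures, while totals (.sum()) are usually the natural choice for sales or traffic counts.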

Where it's used:

To smooth noisy daily data (like sales, traffic, or temperature).
To visualize long-term patterns and seasonality.

Plot interpretation: The daily line is wavy and detailed, while the weekly and monthly lines get progressively smoother, revealing the main seasonal curve.

Upsampling : Daily → 12-Hour Intervals

Upsampling increases the frequency by inserting new timestamps between existing ones. Since those new points don't have real values, we usually fill them using forward fill (ffill) or interpolation.

# Upsample to 12-hour frequency and forward-fill the new, empty slots
# Note: pandas 2.2+ deprecates the 'H' alias in favor of lowercase 'h'
upsampled = data['Temp'].resample('12H').ffill()

# Plot comparison
plt.figure(figsize=(12,5))
plt.plot(data.index, data['Temp'], 'o-', label='Daily Data', alpha=0.6)
plt.plot(upsampled.index, upsampled, label='12-Hour Upsampled (Forward Fill)', color='green')
plt.title('Upsampling: Daily → 12-Hour Intervals')
plt.xlabel('Date'); plt.ylabel('Temperature (°C)')
plt.legend()
plt.show()

Explanation:

resample('12H') creates new timestamps every 12 hours.
.ffill() fills them with the most recent known temperature.
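Interpolation, the other filling option mentioned above, can be compared with forward fill on a tiny example. This is a sketch on three hypothetical daily readings; the variable names are illustrative, and pd.Timedelta(hours=12) stands in for the 12-hour rule so the block does not rely on a version-specific alias.

```python
import pandas as pd

# Three hypothetical daily readings
daily = pd.Series(
    [10.0, 12.0, 16.0],
    index=pd.date_range("2023-01-01", periods=3, freq="D"),
)

rule = pd.Timedelta(hours=12)

# Forward fill: each new noon slot repeats the previous midnight value
stepped = daily.resample(rule).ffill()

# Linear interpolation: each new noon slot is the midpoint of its neighbors
linear = daily.resample(rule).interpolate(method="linear")

comparison = pd.DataFrame({"ffill": stepped, "linear": linear})
print(comparison)
```

Forward fill never uses future values, which matters when the result feeds a forecasting model. Linear interpolation looks smoother but peeks at the next observation, so it is better suited to visualization or retrospective analysis.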

Where it's used:

When merging datasets with different frequencies (e.g., daily weather + hourly energy usage).
When aligning time axes for analysis, or (with interpolation instead of forward fill) when plotting smoother curves.
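The merging use case can be sketched as follows. Everything here is hypothetical: the energy_kwh and temp_c series are made-up stand-ins for hourly energy usage and daily weather, and reindex(..., method="ffill") is one possible way to bring the daily series onto the hourly grid, not the only one.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly energy usage for two days (48 readings)
hours = pd.date_range("2023-01-01", periods=48, freq=pd.Timedelta(hours=1))
energy = pd.Series(np.arange(48, dtype=float), index=hours, name="energy_kwh")

# Hypothetical daily weather for the same two days
days = pd.date_range("2023-01-01", periods=2, freq="D")
weather = pd.Series([4.0, 6.0], index=days, name="temp_c")

# Option 1: upsample the weather onto the hourly grid.
# Reindexing against the energy index (with forward fill) also covers
# the hours after the last daily timestamp.
weather_hourly = weather.reindex(energy.index, method="ffill")
hourly = pd.concat([energy, weather_hourly], axis=1)

# Option 2: downsample the energy usage to daily totals instead
daily = pd.concat([energy.resample("D").sum(), weather], axis=1)

print(hourly.head())
print(daily)
```

Option 1 keeps the hourly detail but repeats the same temperature 24 times per day. Option 2 loses the intraday pattern but avoids inventing values, so the right choice depends on which frequency the analysis actually needs.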

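Plot interpretation: The upsampled line (green) follows the same overall path as the daily data but is denser and step-like, because forward fill simply repeats each daily value at the next 12-hour mark. It gives the illusion of higher-resolution readings without adding any new information.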

Rolling Windows & Moving Averages

Resampling helped us look at the data from different time perspectives (weekly, monthly, or yearly). Now, let's learn another powerful technique to smooth short-term fluctuations and better visualize trends: rolling statistics.

Rolling statistics compute metrics like the mean or standard deviation over a sliding window of time (the rolling mean is also known as a moving average). This helps highlight the underlying signal by reducing noise.

1. Rolling Mean : Smoothing Fluctuations

Let's calculate the 7-day and 30-day moving averages of our temperature data to compare short-term and long-term smoothing.

# 7-day and 30-day rolling averages
data['Rolling_7'] = data['Temp'].rolling(window=7).mean()
data['Rolling_30'] = data['Temp'].rolling(window=30).mean()

# Plot
plt.figure(figsize=(12,5))
plt.plot(data.index, data['Temp'], label='Daily Data', alpha=0.4)
plt.plot(data.index, data['Rolling_7'], label='7-Day Moving Avg', color='orange')
plt.plot(data.index, data['Rolling_30'], label='30-Day Moving Avg', color='red')
plt.title('Rolling Mean: 7-Day vs 30-Day Window')
plt.xlabel('Date'); plt.ylabel('Temperature (°C)')
plt.legend()
plt.show()

Explanation:

.rolling(window=7).mean() slides a 7-day window through the data and averages the values inside it. The first 6 results are NaN because the window is not yet full.
The 7-day average reacts quickly to changes while averaging out any day-of-week effects.
The 30-day average moves more slowly but shows a smoother, clearer trend.
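How the window is positioned and how incomplete windows are handled can be adjusted with the min_periods and center parameters of .rolling(). The sketch below uses a hypothetical 10-value series so the effect is easy to verify by hand; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical series: the values 1..10 on consecutive days
s = pd.Series(
    np.arange(1.0, 11.0),
    index=pd.date_range("2023-01-01", periods=10, freq="D"),
)

# Default: trailing window, NaN until 7 observations are available
trailing = s.rolling(window=7).mean()

# min_periods=1: average whatever is available, so no leading NaN
partial = s.rolling(window=7, min_periods=1).mean()

# center=True: window spans 3 days before and 3 days after each point
centered = s.rolling(window=7, center=True).mean()

print(pd.DataFrame({"trailing": trailing, "partial": partial, "centered": centered}))
```

A trailing window only uses past values, which makes it safe for forecasting features. A centered window aligns the smoothed curve with the raw data (no visual lag) but uses future values, so it is mainly a plotting and exploration tool.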

Plot interpretation: The orange line (7-day) still follows the ups and downs of daily variation, while the red line (30-day) filters out noise, showing the broader seasonal movement.

2. Rolling Standard Deviation : Measuring Volatility

We can also track the rolling standard deviation to measure how much the data fluctuates over time, which is useful for identifying unstable periods.

# Rolling standard deviation (30-day window)
data['Rolling_STD'] = data['Temp'].rolling(window=30).std()

# Plot
plt.figure(figsize=(12,4))
plt.plot(data.index, data['Rolling_STD'], color='purple')
plt.title('Rolling 30-Day Standard Deviation (Volatility)')
plt.xlabel('Date'); plt.ylabel('Std. Deviation')
plt.show()

Explanation:

.rolling(window=30).std() calculates how much temperatures vary within each 30-day span.
High peaks = more volatile periods (temperature changing rapidly). A steep trend inside the window raises the value too, not only random noise.
Low values = stable, predictable behavior.
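One way to turn this reading into something actionable is to flag the days where the rolling standard deviation exceeds a threshold. The sketch below is only an illustration: the series is synthetic (a calm first half followed by a noisy second half), and the threshold of 1.5 is an arbitrary assumption that would need tuning for real data.

```python
import numpy as np
import pandas as pd

# Synthetic series: 60 calm days (std 0.5) followed by 60 noisy days (std 3.0)
rng = np.random.default_rng(1)
idx = pd.date_range("2023-01-01", periods=120, freq="D")
noise = np.concatenate([rng.normal(0, 0.5, 60), rng.normal(0, 3.0, 60)])
s = pd.Series(15 + noise, index=idx)

# 30-day rolling standard deviation (NaN for the first 29 days)
roll_std = s.rolling(window=30).std()

# Hypothetical threshold separating "stable" from "volatile"
threshold = 1.5
volatile = roll_std > threshold  # NaN compares as False

print("First flagged day:", volatile.idxmax().date())
print("Number of flagged days:", int(volatile.sum()))
```

Because the window trails, the flag switches on a few days after the regime actually changes. A shorter window reacts faster but produces a noisier volatility estimate.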

3. Expanding Windows : Cumulative Averages

An expanding window starts from the first data point and keeps growing, which is useful for observing long-term stabilization.

# Expanding mean: cumulative average from start
data['Expanding_Mean'] = data['Temp'].expanding().mean()

plt.figure(figsize=(12,5))
plt.plot(data.index, data['Temp'], label='Daily Data', alpha=0.3)
plt.plot(data.index, data['Expanding_Mean'], color='green', label='Expanding Mean')
plt.title('Expanding Mean — Long-Term Average Trend')
plt.xlabel('Date'); plt.ylabel('Temperature (°C)')
plt.legend()
plt.show()

Interpretation: The expanding mean gradually stabilizes as more data is added, showing how the long-term average becomes more reliable over time.
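To make the idea concrete, the expanding mean at each step is simply the running total divided by the number of observations seen so far. The small check below uses four hypothetical values; the manual column is there only to confirm the equivalence.

```python
import numpy as np
import pandas as pd

# Four hypothetical readings
s = pd.Series(
    [10.0, 20.0, 30.0, 40.0],
    index=pd.date_range("2023-01-01", periods=4, freq="D"),
)

# Built-in expanding mean
expanding_mean = s.expanding().mean()

# Same quantity by hand: cumulative sum / number of points so far
manual = s.cumsum() / np.arange(1, len(s) + 1)

print(pd.DataFrame({"value": s, "expanding": expanding_mean, "manual": manual}))
```

Each new point carries a weight of 1/n, so late observations barely move the expanding mean. That is why the curve flattens out, and also why it reacts slowly if the underlying level genuinely shifts.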

Summary

In this lecture, we learned how to view time series data through multiple time lenses, from raw daily values to smooth long-term trends. We explored how resampling and rolling statistics help uncover hidden structure in the data and prepare it for advanced modeling.

Key Takeaways

Resampling lets you change time frequency:

Downsampling (e.g., daily → weekly/monthly) smooths short-term noise.
Upsampling (e.g., daily → hourly) adds granularity for alignment or visualization.

Rolling statistics provide a moving view of your data:

Rolling mean reveals trends by smoothing fluctuations.
Rolling standard deviation highlights volatility or instability.
Expanding metrics track long-term averages over time.

Together, these concepts help you:

Understand data behavior across different time scales.
Extract cleaner trends for analysis or feature engineering.
Build intuition for the next steps: time series decomposition and forecasting.
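For the feature engineering point, a minimal sketch of how rolling and expanding statistics might become model inputs is shown below. The column names and the 3-day window are hypothetical, and shifting by one day before computing the statistics is an assumption about the forecasting setup: it ensures each row only sees values from earlier days.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series: the values 0..9
idx = pd.date_range("2023-01-01", periods=10, freq="D")
df = pd.DataFrame({"Temp": np.arange(10, dtype=float)}, index=idx)

# Shift by one day so today's features never include today's value
past = df["Temp"].shift(1)

df["roll_mean_3"] = past.rolling(window=3).mean()   # recent level
df["roll_std_3"] = past.rolling(window=3).std()     # recent volatility
df["expanding_mean"] = past.expanding().mean()      # long-run level

# Drop the warm-up rows where the windows are not yet full
features = df.dropna()
print(features)
```

Without the shift, the rolling mean for a given day would contain that day's own temperature, which leaks the target into the features when the goal is to predict it.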
