Gaussian Smoothing
Few mathematical objects carry a single person’s name as firmly as the Gaussian function carries that of Carl Friedrich Gauss (1777–1855), the German mathematician often called the “Prince of Mathematicians.” Yet the bell-shaped curve was not his invention. In 1733, the French mathematician Abraham de Moivre derived it as an approximation to the binomial distribution, publishing the result in a supplement to his Doctrine of Chances. What Gauss did, seven decades later, was give the curve its deepest physical meaning. In 1801, when the asteroid Ceres was discovered by Giuseppe Piazzi and then lost in the glare of the Sun after only 41 days of observations, Gauss used least-squares fitting — assuming that measurement errors follow a bell-shaped distribution — to predict where Ceres would reappear. When astronomers pointed their telescopes to his predicted position in December 1801, there it was. He published the full mathematical framework in Theoria Motus Corporum Coelestium (1809), cementing the normal distribution at the heart of error analysis and earning it the name “Gaussian.”
The function itself, g(x) = exp(−x² / (2σ²)), turned out to have extraordinary mathematical properties. It is infinitely differentiable, it is its own Fourier transform (a Gaussian in the time domain is also a Gaussian in the frequency domain), and it is separable (a 2D Gaussian kernel factors into two independent 1D operations). These properties would matter enormously once digital signal processing took shape in the mid-20th century. Engineers needed smoothing kernels — weight functions to average out noise — and the Gaussian became the natural choice. Unlike a flat boxcar window, a Gaussian kernel gives the most weight to the center point and tapers off smoothly, producing results free of the abrupt artifacts that equal-weight averaging can introduce.
The idea of using weighted kernels for smoothing was formalized in the statistics literature by Emanuel Parzen (1962) and Murray Rosenblatt (1956) in the context of density estimation, where the Gaussian kernel became the default choice. From there it spread to image processing — Gaussian blur is the standard pre-processing step in computer vision — and to spectroscopy, where it offers a simple, single-parameter alternative to the moving average while producing smoother, more natural curves.
The weighted average problem
You’ve learned about the moving average, which weights all neighboring points equally. But this has an obvious flaw:
Why should a point 5 steps away matter as much as your immediate neighbor?
Intuitively, closer points should have more influence than distant points. A measurement taken right next to you is more relevant than one taken far away. This is the motivation for weighted averaging.
Gaussian smoothing solves this problem elegantly using the famous bell curve (Gaussian function) to weight the neighbors. Points closer to the center get higher weight, and the weight falls off smoothly as you move away.
The result? Smoother, more natural-looking curves without the “blocky” artifacts that moving average can sometimes produce.
The big idea: weight by distance
Instead of equally weighting all points in a window, Gaussian smoothing assigns weights based on a Gaussian (bell curve) function:

w(x) = exp(−x² / (2σ²))

Where:
- x is the distance from the center point
- σ (sigma) controls the width of the bell curve

The smoothed value at point i becomes:

y_smooth[i] = Σ_j w(j) · y[i + j] / Σ_j w(j)

where j runs over the neighboring offsets. In plain English: each neighbor gets multiplied by its weight, you sum them up, then divide by the sum of weights (to normalize).
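As a deliberately minimal sketch of this formula, the weighted average at a single point can be computed directly (the function name here is illustrative, not a library call):

```python
import numpy as np

def gaussian_weighted_value(y, i, sigma):
    """Gaussian-weighted average of y, centered at index i."""
    x = np.arange(len(y)) - i            # distance of each point from the center
    w = np.exp(-x**2 / (2 * sigma**2))   # Gaussian weights
    return np.sum(w * y) / np.sum(w)     # normalize by the sum of weights

y = np.array([1.0, 2.0, 10.0, 2.0, 1.0])       # a noisy spike
print(gaussian_weighted_value(y, 2, sigma=1.0))  # spike is pulled toward its neighbors
```

Smoothing a whole signal just applies this at every index i, which is exactly a convolution with a normalized Gaussian kernel.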
Visualizing the Gaussian kernel
The Gaussian function creates a bell-shaped curve:
- At the center (x=0): Maximum weight
- Moving away: Weight decreases smoothly
- Far from center: Weight approaches zero
The σ parameter controls how quickly the weight decreases:
- Small σ: Narrow bell, only very close neighbors matter
- Large σ: Wide bell, even distant neighbors get significant weight
How σ controls smoothing
The σ (sigma) parameter is the key to Gaussian smoothing. It’s called the “standard deviation” of the Gaussian, and it controls the width of the bell curve.
Small σ (e.g., σ = 0.5 to 1.0):
- Narrow Gaussian bell
- Only immediate neighbors get significant weight
- Light smoothing
- Fine features preserved
Medium σ (e.g., σ = 1.5 to 3.0):
- Moderate Gaussian bell
- Neighbors within 2-3 points get good weight
- Moderate smoothing
- Good balance for typical data
Large σ (e.g., σ = 4.0 to 6.0):
- Wide Gaussian bell
- Many neighbors contribute
- Heavy smoothing
- Features may be blurred
The 3-sigma rule
A helpful fact from statistics: 99.7% of a Gaussian’s weight falls within ±3σ of the center. This means:
- Points beyond 3σ contribute almost nothing
- The effective window size is approximately 6σ to 7σ
Example: If σ = 2, the effective window is about 12-14 points.
This helps you choose σ:
- Want a window of ~10 points? Use σ ≈ 1.5 to 2.0
- Want a window of ~20 points? Use σ ≈ 3.0 to 3.5
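The 3-sigma rule is easy to check numerically; this short sketch sums Gaussian weights on a fine grid and measures the fraction lying within ±3σ:

```python
import numpy as np

sigma = 2.0
# Fine grid extending far beyond the kernel's effective support
x = np.linspace(-10 * sigma, 10 * sigma, 20001)
w = np.exp(-x**2 / (2 * sigma**2))

inside = np.abs(x) <= 3 * sigma
fraction = w[inside].sum() / w.sum()
print(round(fraction, 4))  # ≈ 0.9973 — the familiar 99.7% rule
```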
Comparison with moving average
Let’s compare Gaussian smoothing with moving average on the same window size:
Moving Average (window = 11):

```
Weights: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1] / 11   ← all equal!
```

Gaussian (σ = 2, same 11-point window):

```
Weights: [0.01, 0.03, 0.07, 0.12, 0.18, 0.20, 0.18, 0.12, 0.07, 0.03, 0.01]
            ↑                          ↑                          ↑
          edge                       center                     edge
```

Notice how the Gaussian naturally emphasizes the center and smoothly tapers toward the edges. This produces more natural-looking results!
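If you want to reproduce weight vectors like these yourself, here is a small sketch for an 11-point window with σ = 2:

```python
import numpy as np

window = 11
half = window // 2

# Moving-average weights: all equal
ma_weights = np.full(window, 1 / window)

# Gaussian weights: evaluate the bell curve at integer offsets, then normalize
offsets = np.arange(-half, half + 1)
g = np.exp(-offsets**2 / (2 * 2.0**2))  # sigma = 2
g_weights = g / g.sum()

print(np.round(ma_weights, 2))
print(np.round(g_weights, 2))
```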
Interactive visualization
Coming soon! An interactive tool where you can:
- Adjust σ and see the Gaussian kernel change shape
- Compare Gaussian vs moving average side-by-side
- See how σ affects smoothing strength
- Visualize the weight distribution
Code examples
```python
import numpy as np
from scipy.ndimage import gaussian_filter1d
import matplotlib.pyplot as plt

# Generate noisy spectrum
np.random.seed(42)
x = np.linspace(0, 100, 500)
true_signal = (0.8 * np.exp(-((x - 25)**2) / 20)
               + 0.6 * np.exp(-((x - 50)**2) / 15)
               + 0.9 * np.exp(-((x - 75)**2) / 25))
noise = np.random.normal(0, 0.08, len(x))
noisy_signal = true_signal + noise

# Apply Gaussian smoothing with different sigma values
# scipy.ndimage.gaussian_filter1d(data, sigma)
gauss_sigma1 = gaussian_filter1d(noisy_signal, sigma=1.0)
gauss_sigma2 = gaussian_filter1d(noisy_signal, sigma=2.0)
gauss_sigma3 = gaussian_filter1d(noisy_signal, sigma=3.0)
gauss_sigma5 = gaussian_filter1d(noisy_signal, sigma=5.0)

# Plot results
plt.figure(figsize=(12, 8))

plt.subplot(2, 2, 1)
plt.plot(x, noisy_signal, alpha=0.5, label='Noisy', color='gray')
plt.plot(x, gauss_sigma1, label='σ=1.0', linewidth=2)
plt.plot(x, true_signal, '--', label='True', color='black', alpha=0.7)
plt.title('Light smoothing (σ=1.0)')
plt.legend()
plt.grid(True, alpha=0.3)

plt.subplot(2, 2, 2)
plt.plot(x, noisy_signal, alpha=0.5, label='Noisy', color='gray')
plt.plot(x, gauss_sigma2, label='σ=2.0', linewidth=2)
plt.plot(x, true_signal, '--', label='True', color='black', alpha=0.7)
plt.title('Moderate smoothing (σ=2.0)')
plt.legend()
plt.grid(True, alpha=0.3)

plt.subplot(2, 2, 3)
plt.plot(x, noisy_signal, alpha=0.5, label='Noisy', color='gray')
plt.plot(x, gauss_sigma3, label='σ=3.0', linewidth=2)
plt.plot(x, true_signal, '--', label='True', color='black', alpha=0.7)
plt.title('Heavy smoothing (σ=3.0)')
plt.legend()
plt.grid(True, alpha=0.3)

plt.subplot(2, 2, 4)
plt.plot(x, noisy_signal, alpha=0.5, label='Noisy', color='gray')
plt.plot(x, gauss_sigma5, label='σ=5.0', linewidth=2)
plt.plot(x, true_signal, '--', label='True', color='black', alpha=0.7)
plt.title('Very heavy smoothing (σ=5.0)')
plt.legend()
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Visualize the Gaussian kernel
def gaussian_kernel(sigma, size):
    """Create a normalized Gaussian kernel of odd length `size`."""
    x = np.arange(-(size // 2), size // 2 + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    return kernel / kernel.sum()  # Normalize so the weights sum to 1

# Show kernels for different sigmas
plt.figure(figsize=(10, 6))
offsets = np.arange(-10, 11)  # positions for a 21-point kernel
for sigma in [0.5, 1.0, 2.0, 3.0]:
    kernel = gaussian_kernel(sigma, size=21)
    plt.plot(offsets, kernel, marker='o', label=f'σ={sigma}', linewidth=2)

plt.xlabel('Position relative to center')
plt.ylabel('Weight')
plt.title('Gaussian Kernels for Different σ Values')
plt.legend()
plt.grid(True, alpha=0.3)
plt.axhline(y=0, color='black', linewidth=0.5)
plt.show()

# Compare with moving average
window_size = 11
moving_avg = np.convolve(noisy_signal, np.ones(window_size) / window_size, mode='same')
gauss_equiv = gaussian_filter1d(noisy_signal, sigma=2.0)

plt.figure(figsize=(10, 6))
plt.plot(x, noisy_signal, alpha=0.4, label='Noisy', color='gray')
plt.plot(x, moving_avg, label=f'Moving Average (w={window_size})', linewidth=2)
plt.plot(x, gauss_equiv, label='Gaussian (σ=2.0)', linewidth=2, linestyle='--')
plt.plot(x, true_signal, 'k--', label='True Signal', alpha=0.7, linewidth=1.5)
plt.xlabel('Data Point')
plt.ylabel('Signal Value')
plt.title('Gaussian vs Moving Average')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()
```

```matlab
% Gaussian smoothing in MATLAB
% Using imgaussfilt or a manual implementation

% Generate noisy spectrum
rng(42);
x = linspace(0, 100, 500);
true_signal = 0.8 * exp(-((x - 25).^2) / 20) + ...
              0.6 * exp(-((x - 50).^2) / 15) + ...
              0.9 * exp(-((x - 75).^2) / 25);
noise = 0.08 * randn(size(x));
noisy_signal = true_signal + noise;

% Apply Gaussian smoothing with different sigma values
% imgaussfilt(data, sigma) - from the Image Processing Toolbox
% OR use smoothdata(data, 'gaussian', window) - from base MATLAB

% Method 1: using imgaussfilt (if available)
try
    gauss_sigma1 = imgaussfilt(noisy_signal, 1.0);
    gauss_sigma2 = imgaussfilt(noisy_signal, 2.0);
    gauss_sigma3 = imgaussfilt(noisy_signal, 3.0);
    gauss_sigma5 = imgaussfilt(noisy_signal, 5.0);
catch
    % Method 2: manual implementation with an explicit normalized kernel
    % (gausswin's width parameter is not sigma, so build the kernel directly)
    gauss_kernel = @(sigma) exp(-(-ceil(3*sigma):ceil(3*sigma)).^2 / (2*sigma^2));
    gauss_smooth = @(data, sigma) conv(data, ...
        gauss_kernel(sigma) / sum(gauss_kernel(sigma)), 'same');
    gauss_sigma1 = gauss_smooth(noisy_signal, 1.0);
    gauss_sigma2 = gauss_smooth(noisy_signal, 2.0);
    gauss_sigma3 = gauss_smooth(noisy_signal, 3.0);
    gauss_sigma5 = gauss_smooth(noisy_signal, 5.0);
end

% Plot results
figure('Position', [100 100 1200 800]);

subplot(2, 2, 1);
plot(x, noisy_signal, 'Color', [0.5 0.5 0.5 0.5], 'DisplayName', 'Noisy');
hold on;
plot(x, gauss_sigma1, 'LineWidth', 2, 'DisplayName', '\sigma=1.0');
plot(x, true_signal, '--k', 'LineWidth', 1.5, 'DisplayName', 'True');
title('Light smoothing (\sigma=1.0)');
legend('Location', 'best');
grid on;

subplot(2, 2, 2);
plot(x, noisy_signal, 'Color', [0.5 0.5 0.5 0.5], 'DisplayName', 'Noisy');
hold on;
plot(x, gauss_sigma2, 'LineWidth', 2, 'DisplayName', '\sigma=2.0');
plot(x, true_signal, '--k', 'LineWidth', 1.5, 'DisplayName', 'True');
title('Moderate smoothing (\sigma=2.0)');
legend('Location', 'best');
grid on;

subplot(2, 2, 3);
plot(x, noisy_signal, 'Color', [0.5 0.5 0.5 0.5], 'DisplayName', 'Noisy');
hold on;
plot(x, gauss_sigma3, 'LineWidth', 2, 'DisplayName', '\sigma=3.0');
plot(x, true_signal, '--k', 'LineWidth', 1.5, 'DisplayName', 'True');
title('Heavy smoothing (\sigma=3.0)');
legend('Location', 'best');
grid on;

subplot(2, 2, 4);
plot(x, noisy_signal, 'Color', [0.5 0.5 0.5 0.5], 'DisplayName', 'Noisy');
hold on;
plot(x, gauss_sigma5, 'LineWidth', 2, 'DisplayName', '\sigma=5.0');
plot(x, true_signal, '--k', 'LineWidth', 1.5, 'DisplayName', 'True');
title('Very heavy smoothing (\sigma=5.0)');
legend('Location', 'best');
grid on;

% Visualize the Gaussian kernel
figure;
hold on;
for sigma = [0.5, 1.0, 2.0, 3.0]
    offsets = -10:10;                         % 21-point kernel
    kernel = exp(-offsets.^2 / (2*sigma^2));
    kernel = kernel / sum(kernel);            % Normalize
    plot(offsets, kernel, '-o', 'LineWidth', 2, ...
         'DisplayName', ['\sigma=' num2str(sigma)]);
end
xlabel('Position relative to center');
ylabel('Weight');
title('Gaussian Kernels for Different \sigma Values');
legend('Location', 'best');
grid on;

% Compare with moving average
window_size = 11;
moving_avg = conv(noisy_signal, ones(1, window_size)/window_size, 'same');
try
    gauss_equiv = imgaussfilt(noisy_signal, 2.0);
catch
    k = exp(-(-6:6).^2 / (2*2^2));            % sigma = 2, 13-point kernel
    gauss_equiv = conv(noisy_signal, k / sum(k), 'same');
end

figure;
plot(x, noisy_signal, 'Color', [0.5 0.5 0.5 0.4], 'DisplayName', 'Noisy');
hold on;
plot(x, moving_avg, 'LineWidth', 2, ...
     'DisplayName', ['Moving Average (w=' num2str(window_size) ')']);
plot(x, gauss_equiv, '--', 'LineWidth', 2, 'DisplayName', 'Gaussian (\sigma=2.0)');
plot(x, true_signal, '--k', 'LineWidth', 1.5, 'DisplayName', 'True Signal');
xlabel('Data Point');
ylabel('Signal Value');
title('Gaussian vs Moving Average');
legend('Location', 'best');
grid on;
```

```r
# Gaussian smoothing in R
# Using a custom function or the signal package

# Function to create a Gaussian kernel
gaussian_kernel <- function(sigma, size = NULL) {
  if (is.null(size)) {
    size <- ceiling(6 * sigma)
    if (size %% 2 == 0) size <- size + 1  # Make it odd
  }
  x <- seq(-(size - 1)/2, (size - 1)/2, by = 1)
  kernel <- exp(-x^2 / (2 * sigma^2))
  kernel / sum(kernel)  # Normalize
}

# Gaussian smoothing function
gaussian_smooth <- function(data, sigma) {
  kernel <- gaussian_kernel(sigma)
  smoothed <- stats::filter(data, kernel, sides = 2)
  # Handle NAs at the edges
  smoothed[is.na(smoothed)] <- data[is.na(smoothed)]
  as.vector(smoothed)
}

# Generate noisy spectrum
set.seed(42)
x <- seq(0, 100, length.out = 500)
true_signal <- 0.8 * exp(-((x - 25)^2) / 20) +
               0.6 * exp(-((x - 50)^2) / 15) +
               0.9 * exp(-((x - 75)^2) / 25)
noise <- rnorm(length(x), mean = 0, sd = 0.08)
noisy_signal <- true_signal + noise

# Apply Gaussian smoothing with different sigma values
gauss_sigma1 <- gaussian_smooth(noisy_signal, sigma = 1.0)
gauss_sigma2 <- gaussian_smooth(noisy_signal, sigma = 2.0)
gauss_sigma3 <- gaussian_smooth(noisy_signal, sigma = 3.0)
gauss_sigma5 <- gaussian_smooth(noisy_signal, sigma = 5.0)

# Plot results
par(mfrow = c(2, 2))

# Plot 1
plot(x, noisy_signal, type = 'l', col = 'gray',
     main = 'Light smoothing (σ=1.0)', xlab = 'Data Point', ylab = 'Signal')
lines(x, gauss_sigma1, col = 'blue', lwd = 2)
lines(x, true_signal, col = 'black', lty = 2, lwd = 1.5)
legend('topright', legend = c('Noisy', 'σ=1.0', 'True'),
       col = c('gray', 'blue', 'black'), lwd = c(1, 2, 1.5), lty = c(1, 1, 2))
grid()

# Plot 2
plot(x, noisy_signal, type = 'l', col = 'gray',
     main = 'Moderate smoothing (σ=2.0)', xlab = 'Data Point', ylab = 'Signal')
lines(x, gauss_sigma2, col = 'red', lwd = 2)
lines(x, true_signal, col = 'black', lty = 2, lwd = 1.5)
legend('topright', legend = c('Noisy', 'σ=2.0', 'True'),
       col = c('gray', 'red', 'black'), lwd = c(1, 2, 1.5), lty = c(1, 1, 2))
grid()

# Plot 3
plot(x, noisy_signal, type = 'l', col = 'gray',
     main = 'Heavy smoothing (σ=3.0)', xlab = 'Data Point', ylab = 'Signal')
lines(x, gauss_sigma3, col = 'green', lwd = 2)
lines(x, true_signal, col = 'black', lty = 2, lwd = 1.5)
legend('topright', legend = c('Noisy', 'σ=3.0', 'True'),
       col = c('gray', 'green', 'black'), lwd = c(1, 2, 1.5), lty = c(1, 1, 2))
grid()

# Plot 4
plot(x, noisy_signal, type = 'l', col = 'gray',
     main = 'Very heavy smoothing (σ=5.0)', xlab = 'Data Point', ylab = 'Signal')
lines(x, gauss_sigma5, col = 'purple', lwd = 2)
lines(x, true_signal, col = 'black', lty = 2, lwd = 1.5)
legend('topright', legend = c('Noisy', 'σ=5.0', 'True'),
       col = c('gray', 'purple', 'black'), lwd = c(1, 2, 1.5), lty = c(1, 1, 2))
grid()

# Visualize the Gaussian kernel
dev.new()
plot(NULL, xlim = c(-10, 10), ylim = c(0, 0.5),
     xlab = 'Position relative to center', ylab = 'Weight',
     main = 'Gaussian Kernels for Different σ Values')
colors <- c('blue', 'red', 'green', 'purple')
sigmas <- c(0.5, 1.0, 2.0, 3.0)
for (i in 1:4) {
  kernel <- gaussian_kernel(sigmas[i], size = 21)
  x_kernel <- seq(-10, 10, by = 1)
  lines(x_kernel, kernel, col = colors[i], lwd = 2, type = 'b', pch = 19)
}
legend('topright', legend = paste0('σ=', sigmas), col = colors, lwd = 2, pch = 19)
grid()
abline(h = 0, lty = 1, lwd = 0.5)

# Compare with moving average
dev.new()
window_size <- 11
moving_avg_kernel <- rep(1/window_size, window_size)
moving_avg <- stats::filter(noisy_signal, moving_avg_kernel, sides = 2)
moving_avg[is.na(moving_avg)] <- noisy_signal[is.na(moving_avg)]

gauss_equiv <- gaussian_smooth(noisy_signal, sigma = 2.0)

plot(x, noisy_signal, type = 'l', col = rgb(0.5, 0.5, 0.5, 0.4),
     xlab = 'Data Point', ylab = 'Signal Value',
     main = 'Gaussian vs Moving Average')
lines(x, moving_avg, col = 'blue', lwd = 2)
lines(x, gauss_equiv, col = 'red', lwd = 2, lty = 2)
lines(x, true_signal, col = 'black', lwd = 1.5, lty = 2)
legend('topright',
       legend = c('Noisy', 'Moving Average (w=11)', 'Gaussian (σ=2.0)', 'True Signal'),
       col = c(rgb(0.5, 0.5, 0.5), 'blue', 'red', 'black'),
       lwd = c(1, 2, 2, 1.5), lty = c(1, 1, 2, 2))
grid()
```

Advantages of Gaussian smoothing
✅ Natural weighting — Closer points matter more, which makes intuitive sense
✅ Smooth results — No blocky artifacts like moving average
✅ Mathematically elegant — Based on the well-understood Gaussian function
✅ One parameter — Only σ to tune (simpler than Savitzky-Golay)
✅ No negative weights — Unlike some SG configurations
✅ Isotropic — Treats all directions equally (important for 2D/3D data)
✅ Fast implementation — Can be computed efficiently (separable convolution or FFT)
Disadvantages of Gaussian smoothing
❌ Still broadens peaks — Better than moving average, but not as good as Savitzky-Golay
❌ No derivative capability — Can’t compute derivatives while smoothing
❌ Parameter interpretation — σ is less intuitive than “window size”
❌ Infinite support — The Gaussian never truly reaches zero (but practically does beyond 3σ)
When to use Gaussian smoothing
Gaussian smoothing is a good choice when:
✅ You want weighted averaging with smooth falloff
✅ You need natural-looking smooth curves
✅ You’re working with images or 2D data (extends naturally)
✅ You want to avoid blocky artifacts from moving average
✅ You don’t need derivative calculations
✅ Simplicity is valued (one parameter)
Avoid Gaussian when:

❌ You need to preserve very sharp peaks (use Savitzky-Golay)
❌ You need derivatives (use Savitzky-Golay)
❌ You need maximum control (use Whittaker)
Choosing sigma: practical guidelines
Quick start values
| Noise Level | Recommended σ |
|---|---|
| Low noise | σ = 0.5 to 1.0 |
| Moderate noise | σ = 1.5 to 2.5 |
| High noise | σ = 3.0 to 5.0 |
The effective window rule
Remember: effective window ≈ 6σ to 7σ
So if you know your desired window size:
- Want window ≈ 10? Use σ ≈ 1.5
- Want window ≈ 15? Use σ ≈ 2.5
- Want window ≈ 20? Use σ ≈ 3.0
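The rule of thumb above is easy to wrap in a tiny helper; `sigma_for_window` and the span factor of 6 are illustrative choices, not a standard API:

```python
def sigma_for_window(window_points, span=6.0):
    """Rough sigma for a desired effective window, using window ≈ span · sigma."""
    return window_points / span

print(round(sigma_for_window(10), 2))  # ≈ 1.67
print(round(sigma_for_window(20), 2))  # ≈ 3.33
```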
Validation
- Visual check: Smooth but not over-smoothed?
- Residuals: (original - smoothed) should look like noise
- Feature preservation: Are peaks still where they should be?
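The residual check can be automated; here is a minimal sketch on synthetic data (the signal shape and noise level are arbitrary choices for illustration):

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

rng = np.random.default_rng(0)
x = np.linspace(0, 100, 500)
true_signal = np.exp(-((x - 50)**2) / 15)          # one Gaussian peak
noisy = true_signal + rng.normal(0, 0.05, x.size)  # add noise with std 0.05

smoothed = gaussian_filter1d(noisy, sigma=2.0)
residuals = noisy - smoothed

# For a well-chosen sigma the residuals should look like zero-mean noise,
# with no leftover structure from the peak
print(round(residuals.mean(), 3), round(residuals.std(), 3))
```

If the residuals show a systematic dip or bump where the peak sits, σ is too large and the smoother is eating real signal.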
Comparison with other methods
| Method | Weighting | Parameters | Peak Preservation | Best For |
|---|---|---|---|---|
| Moving Average | Equal | 1 (window) | Poor | Quick exploration |
| Gaussian | Distance-based | 1 (σ) | Moderate | Natural smoothing |
| Savitzky-Golay | Polynomial fit | 2 (window, order) | Excellent | Spectroscopy |
| Whittaker | Optimization | 1-2 (λ, order) | Excellent | General use |
Gaussian’s niche: When you want better than moving average but simpler than Savitzky-Golay!
Extensions: multidimensional Gaussian smoothing
One beautiful feature of Gaussian smoothing: it extends naturally to 2D and 3D data!
For 2D data (like images or spectral-spatial data), the kernel is simply a product of two 1D Gaussians:

G(x, y) = exp(−(x² + y²) / (2σ²)) = g(x) · g(y)
This makes Gaussian smoothing particularly popular in:
- Image processing
- Hyperspectral imaging
- Spatial-spectral data
- Computer vision
The same σ parameter controls smoothing in all directions simultaneously!
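Separability is easy to verify with scipy: a 2D Gaussian filter gives the same result as two 1D passes, one per axis. A small sketch on random data:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, gaussian_filter1d

rng = np.random.default_rng(1)
img = rng.normal(size=(64, 64))  # arbitrary 2D test data

# Full 2D Gaussian smoothing...
blur_2d = gaussian_filter(img, sigma=1.5)

# ...equals two 1D passes, one along each axis (separability)
blur_sep = gaussian_filter1d(gaussian_filter1d(img, 1.5, axis=0), 1.5, axis=1)

print(np.allclose(blur_2d, blur_sep))  # True
```

This is why Gaussian blur scales well: an n×n kernel costs O(n²) per pixel, but two 1D passes cost only O(2n).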
Common mistakes to avoid
❌ Confusing σ with window size → Remember: window ≈ 6σ
❌ Using σ too large → Over-smoothing destroys features
❌ Not normalizing the kernel → Results will have wrong amplitude
❌ Ignoring edge effects → Edges need special treatment
❌ Expecting SG-level peak preservation → Gaussian is good but not optimal for peaks
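The normalization mistake in particular is easy to demonstrate: on a flat signal of ones, an unnormalized kernel inflates the amplitude by the kernel’s sum, while a normalized kernel leaves it untouched.

```python
import numpy as np

signal = np.ones(50)  # flat signal with value 1.0
k = np.exp(-np.arange(-4, 5)**2 / (2 * 1.5**2))  # 9-point kernel, sigma = 1.5

bad = np.convolve(signal, k, mode='same')            # unnormalized kernel
good = np.convolve(signal, k / k.sum(), mode='same') # normalized kernel

# Away from the edges: wrong amplitude vs the preserved value 1.0
print(round(bad[25], 2), round(good[25], 2))
```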
The math behind the Gaussian
For those interested in why the Gaussian function is special:
The Gaussian is the unique function that:
- Minimizes uncertainty (it achieves equality in the Heisenberg uncertainty principle: the time–bandwidth product is as small as possible)
- Is its own Fourier transform (same shape in frequency domain)
- Is separable (2D Gaussian = 1D Gaussian × 1D Gaussian)
- Maximizes entropy (for a given variance)
These properties make it a natural choice for smoothing. The Gaussian is, in many senses, the “most uniform” way to spread out information!
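The self-Fourier-transform property can be checked numerically: the FFT magnitude of a sampled Gaussian is again a Gaussian, here compared against the analytic transform exp(−2π²σ²f²) (the grid and σ below are arbitrary choices for the sketch):

```python
import numpy as np

n, dx, sigma = 1024, 0.05, 1.0
x = (np.arange(n) - n // 2) * dx          # symmetric grid around zero
g = np.exp(-x**2 / (2 * sigma**2))        # sampled Gaussian

# Shift so x = 0 sits at index 0, take the FFT, and shift back
G = np.abs(np.fft.fftshift(np.fft.fft(np.fft.ifftshift(g))))
f = np.fft.fftshift(np.fft.fftfreq(n, d=dx))

# Analytic transform, up to the same overall scale: a Gaussian in frequency
expected = G.max() * np.exp(-2 * (np.pi * sigma * f)**2)
print(np.allclose(G, expected, atol=1e-6 * G.max()))
```

Note the widths trade off: a narrow Gaussian in time is a wide Gaussian in frequency, which is exactly why a larger σ suppresses more high-frequency noise.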
Further reading
On Gaussian smoothing:
- Gonzalez, R. C., & Woods, R. E. (2018). Digital Image Processing (4th ed.). Chapter on smoothing spatial filters.
- Smith, S. W. (1997). The Scientist and Engineer’s Guide to Digital Signal Processing. Chapter 24: Linear Image Processing.
On the Gaussian function:
- Papoulis, A., & Pillai, S. U. (2002). Probability, Random Variables, and Stochastic Processes (4th ed.). Discussion of the Gaussian distribution.
Next steps:
- Compare all smoothing methods: Smoothing comparison guide
- Learn about derivatives for peak resolution enhancement
- Explore baseline correction for removing drift