Skip to content

Savitzky-Golay Smoothing

In 1964, two scientists at the Perkin-Elmer Corporation — one of the leading analytical instrument manufacturers of the era — published a paper that would become one of the most cited in the history of analytical chemistry. Abraham Savitzky, a physicist and computer scientist, and Marcel J.E. Golay, a Swiss-born physicist and information theorist, authored “Smoothing and Differentiation of Data by Simplified Least Squares Procedures” in Analytical Chemistry (volume 36, pages 1627—1639). Their timing was not accidental. By the mid-1960s, analytical instruments were producing digital output that could be fed directly into the early minicomputers appearing in laboratories, and practitioners urgently needed practical algorithms for cleaning up noisy spectra without distorting the peaks that carried chemical information.

What made the Savitzky-Golay method revolutionary was its core insight: instead of crudely averaging neighboring points (as a moving average does), you could fit a local polynomial through a sliding window of data and read off the smoothed value from the polynomial at the center point. This preserved peak shapes, heights, and widths far better than simple averaging. Better still, Savitzky and Golay showed that the entire operation could be reduced to a set of precomputed convolution coefficients — integer weights that could be looked up in a table and applied with minimal computation, a critical advantage on the limited hardware of the 1960s.

The original paper did contain errors in some of the published coefficient tables, a fact corrected by Steinier, Termonia, and Deltour in 1972 and later by others. But neither the errata nor the passage of six decades have diminished the method’s dominance: Savitzky-Golay smoothing remains the standard preprocessing tool in spectroscopy, chromatography, and signal processing whenever peak fidelity matters.

The peak preservation problem

You’ve learned about moving average smoothing and Whittaker smoothing. Both methods reduce noise effectively, but they face a common challenge: how do you smooth data without destroying the peaks you care about?

Moving average has a particularly bad habit: it broadens and flattens peaks. If you have two nearby peaks that are barely resolved, moving average can blur them together into one blob. This is a disaster for spectroscopy, where peak positions, heights, and widths carry crucial chemical information!

Whittaker smoothing does better, but requires matrix operations and parameter tuning. Wouldn’t it be nice to have something that:

  • Preserves peak shapes like Whittaker
  • Is as simple to understand as moving average
  • Has a solid mathematical foundation
  • Can even compute derivatives while smoothing

Enter Savitzky-Golay smoothing — the method that does all of this and more!

The big idea: local polynomial fitting

The Savitzky-Golay (SG) method takes a completely different approach from simple averaging. Instead of just averaging neighbors, it:

Fits a polynomial to each local window, then uses that polynomial to compute the smoothed value.

Let’s break this down with an example. Suppose you’re at point i and want to smooth it using a window of 5 points (points i-2, i-1, i, i+1, i+2):

Moving average says:

  • “Just average these 5 values: (y_{i-2} + y_{i-1} + y_i + y_{i+1} + y_{i+2}) / 5
  • Simple, but treats all points equally and ignores the local trend

Savitzky-Golay says:

  • “Fit a polynomial (like y = a + bx + cx²) through these 5 points”
  • “Evaluate that polynomial at the center point”
  • “Use that as your smoothed value”

The polynomial captures the local trend, so if you’re on a rising slope, the smooth value reflects that slope. If you’re at a peak, the polynomial captures the curvature. This is why SG preserves features so well!

A visual analogy

Imagine you’re hiking in fog and trying to figure out the terrain:

  • Moving average: You ask 5 people around you their altitude and take the average. If you’re on a hill, this average might be lower than your actual position!

  • Savitzky-Golay: You ask those 5 people their altitudes, fit a smooth curve through all the points, and read off the curve at your position. This captures whether you’re going uphill, downhill, or at a peak.

The mathematics (simplified)

Don’t worry — you don’t need to fit polynomials manually! The beauty of SG is that it can be computed as a weighted average, just like moving average, but with special weights.

For a window of size w and polynomial order p, there exists a set of coefficients c₋ₘ, …, c₀, …, cₘ such that:

where m = (w-1)/2.

The key difference from moving average: These coefficients aren’t all equal (like 1/w). Instead, they’re computed to give the best polynomial fit. The math behind computing these coefficients involves least squares (sound familiar?), but the coefficients are precomputed and tabulated.

You don’t need to compute them yourself — just look them up or use a library!

Example coefficients

For a window size of 5 and polynomial order 2 (quadratic), the SG coefficients are:

So the smoothed value is:

Notice:

  • The center point gets the highest weight (17/35 ≈ 0.49)
  • Immediate neighbors get good weight (12/35 ≈ 0.34 each)
  • Edge points actually get negative weight (-3/35)!

This pattern captures the local curvature and is why SG preserves peaks so well.

Parameters: window size and polynomial order

SG smoothing has two parameters to tune:

1. Window size (w)

Just like moving average, the window size controls how many neighbors you use:

  • Small window (5-7 points): Light smoothing, preserves fine details
  • Medium window (9-15 points): Moderate smoothing, good for typical spectra
  • Large window (17-25 points): Heavy smoothing, for very noisy data

Rule of thumb: The window should be larger than the noise fluctuations but smaller than your narrowest feature.

2. Polynomial order (p)

This is new! It controls the flexibility of the fitted polynomial:

  • Order 2 (quadratic): Most common, captures curvature around peaks
  • Order 3 (cubic): More flexible, can capture asymmetric peaks
  • Order 4 (quartic): Very flexible, but can overfit to noise
  • Order 5+: Rarely needed, high risk of overfitting

Rule of thumb:

  • Start with order 2 or 3
  • Higher orders need larger windows (otherwise you’re fitting 5 parameters with 5 points = overfitting!)
  • Typical constraint: w ≥ p + 2 (preferably w ≥ 2p + 1)

Interactive visualization

Coming soon! An interactive tool where you can:

  • Adjust window size and polynomial order
  • Compare SG with moving average side-by-side
  • See how SG preserves peak heights and widths
  • Visualize the fitted polynomial in each window

Code examples

import numpy as np
from scipy.signal import savgol_filter
import matplotlib.pyplot as plt
# Generate noisy spectrum with sharp peaks
np.random.seed(42)
x = np.linspace(0, 100, 500)
true_signal = (0.8 * np.exp(-((x - 25)**2) / 20) +
0.6 * np.exp(-((x - 50)**2) / 15) +
0.9 * np.exp(-((x - 75)**2) / 25))
noise = np.random.normal(0, 0.08, len(x))
noisy_signal = true_signal + noise
# Apply Savitzky-Golay smoothing
# scipy.signal.savgol_filter(data, window_length, polyorder)
sg_smooth = savgol_filter(noisy_signal, window_length=11, polyorder=2)
# Compare different parameters
sg_w5_p2 = savgol_filter(noisy_signal, window_length=5, polyorder=2)
sg_w11_p2 = savgol_filter(noisy_signal, window_length=11, polyorder=2)
sg_w11_p3 = savgol_filter(noisy_signal, window_length=11, polyorder=3)
sg_w21_p2 = savgol_filter(noisy_signal, window_length=21, polyorder=2)
# Plot results
plt.figure(figsize=(12, 8))
plt.subplot(2, 2, 1)
plt.plot(x, noisy_signal, alpha=0.5, label='Noisy', color='gray')
plt.plot(x, sg_w5_p2, label='w=5, p=2', linewidth=2)
plt.plot(x, true_signal, '--', label='True', color='black', alpha=0.7)
plt.title('Light smoothing (w=5, p=2)')
plt.legend()
plt.grid(True, alpha=0.3)
plt.subplot(2, 2, 2)
plt.plot(x, noisy_signal, alpha=0.5, label='Noisy', color='gray')
plt.plot(x, sg_w11_p2, label='w=11, p=2', linewidth=2)
plt.plot(x, true_signal, '--', label='True', color='black', alpha=0.7)
plt.title('Moderate smoothing (w=11, p=2)')
plt.legend()
plt.grid(True, alpha=0.3)
plt.subplot(2, 2, 3)
plt.plot(x, noisy_signal, alpha=0.5, label='Noisy', color='gray')
plt.plot(x, sg_w11_p3, label='w=11, p=3', linewidth=2)
plt.plot(x, true_signal, '--', label='True', color='black', alpha=0.7)
plt.title('Higher polynomial (w=11, p=3)')
plt.legend()
plt.grid(True, alpha=0.3)
plt.subplot(2, 2, 4)
plt.plot(x, noisy_signal, alpha=0.5, label='Noisy', color='gray')
plt.plot(x, sg_w21_p2, label='w=21, p=2', linewidth=2)
plt.plot(x, true_signal, '--', label='True', color='black', alpha=0.7)
plt.title('Heavy smoothing (w=21, p=2)')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
# Example: Computing derivatives while smoothing
# The deriv parameter computes the derivative of the polynomial
first_derivative = savgol_filter(noisy_signal, window_length=11, polyorder=2, deriv=1)
second_derivative = savgol_filter(noisy_signal, window_length=11, polyorder=2, deriv=2)
plt.figure(figsize=(10, 6))
plt.subplot(3, 1, 1)
plt.plot(x, noisy_signal, alpha=0.5, color='gray')
plt.plot(x, sg_smooth, linewidth=2)
plt.ylabel('Signal')
plt.title('Savitzky-Golay: Smoothing and Derivatives')
plt.grid(True, alpha=0.3)
plt.subplot(3, 1, 2)
plt.plot(x, first_derivative, linewidth=2, color='red')
plt.ylabel('1st Derivative')
plt.grid(True, alpha=0.3)
plt.subplot(3, 1, 3)
plt.plot(x, second_derivative, linewidth=2, color='green')
plt.ylabel('2nd Derivative')
plt.xlabel('Data Point')
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

The derivative bonus: smooth and differentiate simultaneously

Here’s where SG becomes truly powerful: it can compute derivatives while smoothing!

Remember from calculus that differentiating amplifies noise. If you differentiate raw noisy data, you get garbage. The traditional approach is:

  1. Smooth the data first
  2. Then compute numerical derivatives

But SG can do both in one step! Since it fits a polynomial to each window, it can:

  • Evaluate the polynomial → gives smoothed value
  • Evaluate the polynomial’s derivative → gives smoothed derivative
  • Evaluate the second derivative → gives smoothed second derivative

This is huge for spectroscopy where derivatives enhance peak resolution. We’ll cover derivatives in detail in a later article, but know that SG is the standard tool for this.

Advantages of Savitzky-Golay

Excellent peak preservation — Minimal broadening compared to moving average ✅ Mathematically rigorous — Based on local polynomial least squares ✅ Flexible — Two parameters (window, polynomial order) allow fine-tuning ✅ Derivative capability — Can smooth and differentiate simultaneously ✅ Well-established — Widely used and trusted in spectroscopy ✅ Precomputed coefficients — Fast implementation (just a convolution) ✅ Predictable behavior — Clear relationship between parameters and smoothing

Disadvantages of Savitzky-Golay

Two parameters to tune — More complex than single-parameter methods ❌ Can overfit — High polynomial orders with small windows capture noise ❌ Edge effects — Needs special handling at data boundaries ❌ Not optimal for all signals — Assumes locally polynomial behavior ❌ Less intuitive — Harder to explain than “averaging neighbors”

When to use Savitzky-Golay

Savitzky-Golay is the go-to method for spectroscopy when:

✅ You have sharp peaks that must be preserved ✅ You need to compute derivatives (it’s the best choice!) ✅ You want a balance between simplicity and quality ✅ You’re working with typical spectroscopic data (IR, Raman, UV-Vis, etc.) ✅ You want a method that’s widely recognized and accepted

Avoid SG when: ❌ Your signal isn’t smooth/continuous (use median filter instead) ❌ You have extreme noise (try Whittaker with high λ) ❌ You need the absolute simplest method (use moving average)

Comparison with other methods

FeatureMoving AverageSavitzky-GolayWhittaker
Peak preservationPoorExcellentExcellent
Number of parameters1 (window)2 (window, order)1-2 (λ, order)
ComputationVery fastFastFast
DerivativesNoYes!Not directly
Ease of useEasiestModerateModerate
Best forQuick explorationSpectroscopyGeneral smoothing

Choosing parameters: practical guidelines

Step 1: Choose polynomial order

Start with p = 2 (quadratic). This works for 90% of cases.

Use p = 3 if:

  • Your peaks are asymmetric
  • You need maximum flexibility
  • You’re using a large window (≥15)

Avoid p ≥ 4 unless you really know what you’re doing!

Step 2: Choose window size

Start with w = 11 for typical spectroscopic data.

Adjust based on:

  • More noise → increase window (15, 21, 25)
  • Sharper peaks → decrease window (7, 9)
  • Check the constraint: w ≥ 2p + 1 for safety

Step 3: Validate

  1. Visual check: Does it look smooth without destroying peaks?
  2. Residuals: Plot (original - smoothed). Should look like pure noise.
  3. Peak metrics: Compare peak heights and widths before/after.
  4. Cross-validation: If building a model, does smoothing improve prediction?

Quick reference table

Noise LevelRecommended Settings
Low noisew = 7-9, p = 2
Moderate noisew = 11-15, p = 2
High noisew = 17-25, p = 2-3
Very sharp peaksw = 5-7, p = 2
Broad peaksw = 15-21, p = 2-3

Common mistakes to avoid

Using polynomial order too high → Overfitting, captures noise as signal ❌ Window too small for the polynomial → Violates w ≥ 2p + 1 rule ❌ Window too large for features → Broadens peaks unnecessarily ❌ Not checking edge effects → Edges can behave strangely ❌ Smoothing already-smoothed data → Diminishing returns, feature loss ❌ Different parameters for training/test → Inconsistent preprocessing

Further reading

Original paper:

  • Savitzky, A., & Golay, M. J. E. (1964). “Smoothing and Differentiation of Data by Simplified Least Squares Procedures”. Analytical Chemistry, 36(8), 1627-1639.

Corrections to original tables:

  • Steinier, J., Termonia, Y., & Deltour, J. (1972). “Smoothing and differentiation of data by simplified least square procedure”. Analytical Chemistry, 44(11), 1906-1909.

Modern treatments:

  • Press, W. H., et al. (2007). Numerical Recipes: The Art of Scientific Computing (3rd ed.). Chapter on “Savitzky-Golay Smoothing Filters”.
  • Schafer, R. W. (2011). “What Is a Savitzky-Golay Filter?”. IEEE Signal Processing Magazine, 28(4), 111-117.

Next steps: