The concept of signal-to-noise ratio (SNR) migrated into analytical chemistry from electrical engineering, where it was formalized during the 1940s and 1950s as radar, telecommunications, and early computing demanded rigorous frameworks for separating meaningful signals from background interference. Claude Shannon’s landmark 1948 paper “A Mathematical Theory of Communication” established the theoretical limits of information transmission through noisy channels, and Norbert Wiener’s wartime work on optimal filtering (published as Extrapolation, Interpolation, and Smoothing of Stationary Time Series in 1949) provided the mathematical tools for extracting signals buried in noise. These ideas did not stay confined to engineering for long. By the 1950s and 1960s, spectroscopists were adopting the same language and mathematics to quantify the performance of their instruments and the quality of their measurements.
The digital revolution of the 1960s and 1970s transformed noise reduction from an analog art into a computational science. Before digitization, smoothing meant physical tricks: slowing the scan speed of a spectrometer, increasing the slit width, or using analog RC filters on the detector output. Each of these improved the SNR at the cost of spectral resolution or measurement time. When laboratory instruments began producing digital output — discrete numerical values at evenly spaced wavelength or frequency intervals — the entire toolkit of digital signal processing (DSP) became available. Suddenly, smoothing could be applied after acquisition, without any compromise to how the data was collected. Savitzky and Golay’s 1964 paper on polynomial smoothing filters, published in Analytical Chemistry, was one of the first to exploit this new paradigm directly for spectroscopic data, and it remains one of the most cited papers in the history of the field.
In modern spectroscopy, noise is not a single phenomenon but a family of effects with distinct physical origins. Shot noise arises from the quantum nature of photon detection — photons arrive at the detector as discrete events following Poisson statistics, creating statistical fluctuations that scale with the square root of the signal. Thermal noise (Johnson-Nyquist noise) originates from random electron motion in the detector and electronics, present even in the absence of any optical signal. Dark current generates a signal in photodetectors even when no light is incident, due to thermally generated charge carriers. Digitization noise appears when the analog detector signal is converted to digital numbers, introducing rounding errors proportional to the resolution of the analog-to-digital converter. Understanding which noise source dominates your measurement is the first step toward choosing an effective reduction strategy.
What is noise?
In the context of spectroscopy and chemometrics, noise is any unwanted random fluctuation superimposed on the true signal. If you measure the same sample ten times under identical conditions, you will get ten slightly different spectra. The variation between those repeated measurements is noise.
The quality of a measurement is quantified by the signal-to-noise ratio (SNR):
\mathrm{SNR} = \frac{\bar{x}}{s}
where xˉ is the mean signal intensity and s is the standard deviation of repeated measurements at that point. A higher SNR means a cleaner measurement.
In practice, SNR is often expressed in decibels:
\mathrm{SNR}_{\mathrm{dB}} = 20 \log_{10}\left(\frac{\bar{x}}{s}\right)
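These two formulas can be computed directly from a stack of repeated measurements. The sketch below assumes numpy and a synthetic example (a constant signal of 100 with Gaussian noise of standard deviation 2, so the expected SNR is about 50); the function names are illustrative, not from any particular library.

```python
import numpy as np

def snr(repeats):
    """Signal-to-noise ratio at each spectral point.

    repeats: array of shape (n_scans, n_points) -- the same sample
    measured n_scans times under identical conditions.
    """
    mean = repeats.mean(axis=0)         # x-bar: mean signal at each point
    std = repeats.std(axis=0, ddof=1)   # s: scatter between the repeats
    return mean / std

def snr_db(repeats):
    """The same ratio expressed in decibels: 20 * log10(x-bar / s)."""
    return 20 * np.log10(snr(repeats))

# Example: constant signal of 100 with Gaussian noise of sigma = 2
rng = np.random.default_rng(0)
repeats = 100 + rng.normal(0, 2, size=(50, 200))
# median SNR across the 200 points should be near 100 / 2 = 50
```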
Why noise matters for chemometrics
Noise is not just an aesthetic problem. It has concrete consequences for multivariate analysis:
Calibration models fit noise as if it were signal (overfitting), reducing prediction accuracy on new samples
Classification models may find spurious differences between groups that are actually just noise fluctuations
Peak detection algorithms produce false positives (noise spikes mistaken for peaks) and false negatives (real peaks buried in noise)
Derivative spectra amplify noise dramatically — each stage of differentiation boosts high-frequency noise relative to the signal, so second derivatives are substantially noisier than first derivatives
Variable selection methods may select noisy, uninformative variables
The goal of noise reduction is to suppress these random fluctuations while preserving the chemical information encoded in the spectral features (peak positions, heights, widths, and shapes).
Types of noise in spectroscopy
Not all noise is the same. Different physical mechanisms produce different noise characteristics, and the distinction matters for choosing the right reduction strategy.
Random noise (additive, homoscedastic)
The simplest and most common noise model: a random value, drawn from a Gaussian distribution with zero mean, is added independently to each data point.
y_i = x_i + \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2)
Here x_i is the true signal at point i, ε_i is the noise, and σ is constant across the spectrum. This is called homoscedastic noise because its magnitude does not depend on the signal level.
Thermal noise in electronic circuits is the prototypical example. It is present at all wavelengths with roughly equal intensity, independent of the signal. Most smoothing methods are designed and analyzed assuming this noise model, and they work well when it holds.
Shot noise (signal-dependent)
Shot noise arises from the discrete nature of photon counting. Because photons arrive randomly, the number detected in a given time interval follows a Poisson distribution, whose variance equals the mean count. For a signal of intensity I:
\sigma_{\mathrm{shot}} = \sqrt{I}
This means stronger signals have more absolute noise, but their relative noise (as a fraction of the signal) is lower. In UV-Vis or fluorescence spectroscopy where photon counts are moderate, shot noise is often the dominant noise source. Its signal-dependent character means that smoothing methods, which assume constant noise, may over-smooth low-intensity regions and under-smooth high-intensity regions.
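The square-root scaling is easy to verify numerically. This sketch (assuming numpy; the variable names are illustrative) draws Poisson counts at a low and a high mean intensity and compares the absolute and relative noise:

```python
import numpy as np

# Photon counting follows Poisson statistics: noise sigma grows as sqrt(I)
rng = np.random.default_rng(1)
weak = rng.poisson(100, size=100_000)       # low-intensity region
strong = rng.poisson(10_000, size=100_000)  # high-intensity region

# Absolute noise: about 10 for the weak signal, about 100 for the strong one
print(weak.std(), strong.std())
# Relative noise falls with intensity: roughly 10% versus 1%
print(weak.std() / weak.mean(), strong.std() / strong.mean())
```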
Systematic noise
Not all unwanted variation is random. Systematic noise includes:
Baseline drift: A slow, smooth variation of the entire spectrum due to temperature changes, lamp aging, or detector drift. This is not random noise — it is a deterministic trend that changes slowly over time. Baseline drift is better addressed by baseline correction than by smoothing.
Interference fringes: Periodic oscillations caused by thin-film interference in sample cells or optical elements. These appear as a sinusoidal modulation of the spectrum and cannot be removed by simple smoothing. Fourier filtering or dedicated fringe-removal algorithms are needed.
Spikes and cosmic rays: In Raman spectroscopy and CCD-based detectors, cosmic rays occasionally strike the detector, producing extremely sharp, intense spikes that span one or two pixels. These are not Gaussian noise and should be removed by spike detection algorithms (e.g., median filtering or derivative-based detection) before smoothing.
Multiplicative noise
In diffuse reflectance spectroscopy (NIR, DRIFTS), scattering effects cause the entire spectrum to be multiplied by a random factor that varies from sample to sample. This is not additive noise at all — it is a multiplicative distortion:
y_i = a \, x_i
where the factor a changes from one sample to the next. Smoothing cannot correct it; scatter-correction preprocessing such as multiplicative scatter correction (MSC) or standard normal variate (SNV) is the appropriate remedy.
Heteroscedastic noise
When the noise level changes across the spectrum — for example, higher noise in regions of low detector sensitivity or near the edges of the spectral range — the noise is called heteroscedastic. This is common in real instruments. Standard smoothing methods apply the same degree of smoothing everywhere, which may be too much in low-noise regions and too little in high-noise regions.
The smoothing approach
Smoothing methods exploit a simple asymmetry between signal and noise: spectroscopic signals tend to change smoothly from one wavelength to the next (because they arise from continuous physical phenomena like molecular vibrations or electronic transitions), while random noise fluctuates independently at each point. By averaging or fitting over a local window of data points, the random fluctuations tend to cancel out while the smooth signal survives.
The four main smoothing methods used in chemometrics, from simplest to most sophisticated, are summarized below. Each has a dedicated article with full mathematical derivations, interactive visualizations, and code examples.
Moving average
The simplest smoother: replace each point with the unweighted average of itself and its neighbors.
\hat{y}_i = \frac{1}{w} \sum_{j=i-m}^{i+m} y_j, \qquad m = \frac{w-1}{2}
Strengths: Extremely simple, fast, easy to understand. One parameter (window size w).
Weaknesses: All neighbors weighted equally (even distant ones). Broadens and flattens peaks. Not ideal for quantitative work.
Best for: Quick exploratory analysis, very broad features, teaching the concept of smoothing.
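A minimal moving-average implementation, assuming numpy; the reflect-padding edge handling is one common choice among several, and the function name is illustrative:

```python
import numpy as np

def moving_average(y, w):
    """Unweighted moving average with an odd window w. Edges are
    handled by reflecting the signal so the output length matches."""
    if w % 2 == 0:
        raise ValueError("window size must be odd")
    m = (w - 1) // 2
    padded = np.pad(y, m, mode="reflect")
    return np.convolve(padded, np.ones(w) / w, mode="valid")

# Noisy sine as a stand-in spectrum
rng = np.random.default_rng(2)
x = np.sin(np.linspace(0, 4 * np.pi, 500))   # true signal
y = x + rng.normal(0, 0.3, x.size)           # noisy measurement
smoothed = moving_average(y, 9)
```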
Gaussian smoothing
Weighted averaging where closer neighbors get more weight, following the Gaussian bell curve. One parameter: σ controls the width of the weighting function.
\hat{y}_i = \frac{\sum_j w(j-i)\, y_j}{\sum_j w(j-i)}, \qquad w(x) = e^{-x^2 / (2\sigma^2)}
Strengths: More natural weighting than moving average. Smooth results without blocky artifacts. Extends naturally to 2D data (images).
Weaknesses: Still broadens peaks (though less than moving average). Cannot compute derivatives.
Best for: Natural-looking smooth curves, 2D spectral imaging data, when simplicity is valued.
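In practice Gaussian smoothing is rarely hand-rolled; scipy provides it directly via `scipy.ndimage.gaussian_filter1d`. A short sketch on a synthetic peak (the data are illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

rng = np.random.default_rng(3)
grid = np.linspace(-5, 5, 400)
x = np.exp(-grid ** 2)                 # a single broad peak
y = x + rng.normal(0, 0.05, x.size)    # noisy measurement

# sigma is the kernel width in data points; larger sigma = smoother
smoothed = gaussian_filter1d(y, sigma=3)
```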
Savitzky-Golay
Fits a low-order polynomial to the points in each local window by least squares and takes the value of the fitted polynomial at the center point. The fit reduces to a convolution with precomputed coefficients:
\hat{y}_i = \sum_{j=-m}^{m} c_j \, y_{i+j}
Two parameters: window size and polynomial order. Because the fitted polynomial can be differentiated analytically, the same filter produces smoothed derivatives directly.
Strengths: Excellent peak preservation. Computes smoothed first and second derivatives in one pass.
Weaknesses: Two interacting parameters to tune. Edge points need special handling.
Best for: Spectroscopy in general, and derivative preprocessing in particular.
Whittaker smoother
Frames smoothing as a penalized least squares optimization: find the smoothest curve that still fits the data. One parameter: λ balances fidelity to data against smoothness.
\hat{y} = \arg\min_{\hat{y}} \left\{ \sum_i (y_i - \hat{y}_i)^2 + \lambda \sum_i \left(\Delta^d \hat{y}_i\right)^2 \right\}
Strengths: Global optimization (considers the entire spectrum at once). Single intuitive parameter. Excellent peak preservation. Natural extension to weighted smoothing and baseline correction.
Weaknesses: Requires matrix operations (though fast with sparse matrices). Parameter λ spans orders of magnitude.
Best for: General-purpose smoothing. Baseline correction (via asymmetric variants). When a single-parameter method with excellent quality is desired.
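The penalized least squares problem above has a closed-form solution: solve (I + λDᵀD)ẑ = y, where D is the d-th order difference matrix. A sketch in the style of Eilers' "perfect smoother", assuming numpy and scipy's sparse solvers (the function name is illustrative):

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def whittaker_smooth(y, lam=1e4, d=2):
    """Whittaker smoother: solve (I + lam * D'D) z = y, where D is
    the d-th order difference matrix (penalized least squares)."""
    n = len(y)
    D = sparse.eye(n, format="csr")
    for _ in range(d):
        D = D[1:] - D[:-1]             # build the d-th difference matrix
    A = sparse.eye(n, format="csr") + lam * (D.T @ D)
    return spsolve(A.tocsc(), y)

rng = np.random.default_rng(4)
x = np.sin(np.linspace(0, 4 * np.pi, 500))   # true signal
y = x + rng.normal(0, 0.3, x.size)           # noisy measurement
smoothed = whittaker_smooth(y, lam=1e4)
```

Note how λ is the only tuning parameter once d is fixed; doubling the data length does not require retuning it as drastically as a window size would.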
The choice of smoothing method depends on your data, your analytical goal, and the nature of your spectral features.
Comparison table
| Feature | Moving Average | Gaussian | Savitzky-Golay | Whittaker |
|---|---|---|---|---|
| Peak preservation | Poor | Moderate | Excellent | Excellent |
| Parameters | 1 (window) | 1 (sigma) | 2 (window, order) | 1-2 (lambda, d) |
| Computation | Very fast | Fast | Fast | Fast (sparse) |
| Derivatives | No | No | Yes | Not directly |
| Ease of use | Easiest | Easy | Moderate | Moderate |
| Best for | Quick exploration | Natural smoothing | Spectroscopy | General purpose |
Beyond smoothing
Smoothing is the most common approach to noise reduction, but it is not the only one. Several other strategies are used in spectroscopy, either as alternatives to smoothing or in combination with it.
Ensemble averaging (signal averaging)
The most fundamental noise reduction technique: measure the same sample multiple times and average the spectra. If the noise is random and independent between measurements, averaging n spectra reduces the noise standard deviation by a factor of \sqrt{n}:
\sigma_{\mathrm{averaged}} = \frac{\sigma_{\mathrm{single}}}{\sqrt{n}}
This means 4 scans halve the noise, 16 scans reduce it by a factor of 4, and 100 scans reduce it by a factor of 10. The improvement follows the law of diminishing returns — each additional factor-of-two improvement requires four times as many scans.
Ensemble averaging is often the best first line of defense against noise, because it introduces no distortion at all (unlike smoothing, which always involves some tradeoff with resolution). Most FTIR instruments default to 16 or 32 co-added scans for exactly this reason. The limitation is measurement time: in process monitoring or kinetic studies, you may not have time for multiple scans.
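The square-root law is simple to confirm in simulation. This sketch (numpy assumed; the scan count of 16 mirrors a typical FTIR default) averages 16 synthetic noise-only scans:

```python
import numpy as np

rng = np.random.default_rng(5)
sigma_single = 1.0
scans = rng.normal(0, sigma_single, size=(16, 100_000))  # 16 repeated scans

averaged = scans.mean(axis=0)
# noise std drops from 1.0 to 1.0 / sqrt(16) = 0.25
print(averaged.std())
```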
Fourier filtering
The signal and noise often occupy different regions of the frequency domain. Spectroscopic features (peaks, baselines) correspond to low-frequency components, while random noise is spread across all frequencies. By transforming the spectrum to the frequency domain (via the Fast Fourier Transform, FFT), suppressing high-frequency components, and transforming back, you can reduce noise while preserving spectral features.
The practical challenge is choosing the frequency cutoff. Set it too low and you remove real spectral features along with the noise. Set it too high and too much noise remains. Interference fringes, which appear as sharp peaks in the frequency domain, can be selectively removed by this approach — a task that smoothing methods cannot accomplish.
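A hard-cutoff low-pass filter is a crude but instructive instance of this idea. The sketch below (numpy assumed; `fft_lowpass` and `keep_frac` are illustrative names, and real workflows would usually taper the cutoff rather than zero it abruptly) removes everything above a fraction of the Nyquist frequency:

```python
import numpy as np

def fft_lowpass(y, keep_frac=0.05):
    """Crude low-pass Fourier filter: zero every component above
    keep_frac of the Nyquist frequency, then invert the transform."""
    Y = np.fft.rfft(y)
    cutoff = max(1, int(len(Y) * keep_frac))
    Y[cutoff:] = 0.0
    return np.fft.irfft(Y, n=len(y))

rng = np.random.default_rng(6)
grid = np.linspace(-1, 1, 512)
x = np.exp(-(grid / 0.3) ** 2)         # broad, low-frequency peak
y = x + rng.normal(0, 0.05, x.size)
filtered = fft_lowpass(y, keep_frac=0.05)
```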
Wavelet denoising
Wavelets decompose a signal into components at different scales (both frequency and position), offering a more flexible decomposition than Fourier analysis. Noise, which tends to produce small coefficients at fine scales, can be suppressed by thresholding: set small wavelet coefficients to zero and reconstruct the signal from the remaining coefficients.
Wavelet denoising can achieve excellent results, particularly for signals with sharp localized features (like Raman peaks) superimposed on smooth backgrounds. However, it requires choosing a wavelet family, a decomposition level, and a thresholding strategy, making it more complex to tune than standard smoothing methods. It is less commonly used in routine chemometric workflows but sees application in specialized contexts.
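To make the thresholding idea concrete without the full machinery, here is a minimal one-level Haar transform with soft thresholding, written in plain numpy (a sketch only; real applications would use a multi-level transform, e.g. via the PyWavelets package). The threshold follows Donoho's "universal" rule σ√(2 ln n) [9]:

```python
import numpy as np

def haar_denoise(y, threshold):
    """One-level Haar wavelet denoising with soft thresholding.
    Assumes len(y) is even."""
    s2 = np.sqrt(2.0)
    approx = (y[0::2] + y[1::2]) / s2      # coarse-scale coefficients
    detail = (y[0::2] - y[1::2]) / s2      # fine-scale coefficients
    # Soft threshold: small (noise-dominated) coefficients go to zero
    detail = np.sign(detail) * np.maximum(np.abs(detail) - threshold, 0.0)
    out = np.empty_like(y)
    out[0::2] = (approx + detail) / s2     # inverse Haar transform
    out[1::2] = (approx - detail) / s2
    return out

rng = np.random.default_rng(7)
grid = np.linspace(-1, 1, 512)
x = np.exp(-(grid / 0.2) ** 2)             # smooth peak
sigma = 0.05
y = x + rng.normal(0, sigma, x.size)
denoised = haar_denoise(y, sigma * np.sqrt(2 * np.log(y.size)))
```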
Median filtering
Median filtering replaces each point with the median (not the mean) of itself and its neighbors. Unlike mean-based smoothing, the median is robust to outliers. A single cosmic ray spike surrounded by normal values is completely eliminated by the median filter, whereas a moving average would only reduce it.
Median filtering is typically used as a preprocessing step to remove spikes before applying a standard smoothing method. It is not a general-purpose smoother because it can flatten broad features and create a staircase effect on smooth curves.
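The contrast between median and mean behavior on a spike is easy to demonstrate with `scipy.signal.medfilt` (the spike height of 50 is an arbitrary illustrative value):

```python
import numpy as np
from scipy.signal import medfilt

rng = np.random.default_rng(8)
y = np.sin(np.linspace(0, np.pi, 200)) + rng.normal(0, 0.02, 200)
y[80] += 50.0                            # simulated cosmic-ray spike

despiked = medfilt(y, kernel_size=5)     # median: the spike vanishes
diluted = np.convolve(y, np.ones(5) / 5, mode="same")  # mean: spike spreads
```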
Measuring noise reduction
After applying any noise reduction method, you should verify that it worked as intended: noise was reduced without distorting the signal. Three approaches are standard.
Signal-to-noise ratio improvement
Measure the SNR before and after preprocessing. For a region of the spectrum where you know the signal should be approximately constant (e.g., a flat baseline region), compute:
\text{SNR improvement} = \frac{\mathrm{SNR}_{\mathrm{after}}}{\mathrm{SNR}_{\mathrm{before}}}
For random noise, the theoretical SNR improvement from a moving average with window w is \sqrt{w}. A window of 9 should improve SNR by a factor of 3. If you see much less improvement, the noise may not be random, or the window may be too small.
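The √w prediction can be checked on a simulated flat baseline (numpy assumed; pure noise stands in for a signal-free region):

```python
import numpy as np

rng = np.random.default_rng(9)
baseline = rng.normal(0, 1.0, 100_000)   # flat region: pure random noise

w = 9
smoothed = np.convolve(baseline, np.ones(w) / w, mode="valid")
improvement = baseline.std() / smoothed.std()
print(improvement)   # close to sqrt(9) = 3 for independent noise
```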
Residual analysis
Subtract the smoothed spectrum from the original:
\text{residual}_i = y_i - \hat{y}_i
Plot the residuals and inspect them:
Good smoothing: Residuals look like random noise — no visible patterns, approximately symmetric around zero, constant amplitude
Over-smoothing: Residuals show structured patterns (bumps that look like half a peak). You are removing signal.
Under-smoothing: Residuals still contain visible noise with higher amplitude than expected
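Beyond visual inspection, structure in the residuals can be quantified: white-noise residuals have near-zero lag-1 autocorrelation, while leftover peak shapes push it toward 1. A sketch (numpy and scipy assumed; the σ values and `lag1_autocorr` helper are illustrative) contrasting mild and excessive Gaussian smoothing:

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def lag1_autocorr(r):
    """Lag-1 autocorrelation: near 0 for noise-like residuals,
    near 1 when the residuals contain leftover signal structure."""
    r = r - r.mean()
    return np.dot(r[:-1], r[1:]) / np.dot(r, r)

rng = np.random.default_rng(10)
grid = np.linspace(-1, 1, 500)
x = np.exp(-(grid / 0.2) ** 2)                 # true peak
y = x + rng.normal(0, 0.05, grid.size)

mild = y - gaussian_filter1d(y, sigma=2)       # residuals, mild smoothing
over = y - gaussian_filter1d(y, sigma=50)      # residuals, over-smoothing
print(lag1_autocorr(mild), lag1_autocorr(over))
```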
Checking for signal distortion
Compare key spectral features before and after smoothing:
Peak positions: Did peaks shift? (They should not.)
Peak heights: Did peaks shrink? (Some reduction is expected with moving average and Gaussian; minimal with Savitzky-Golay and Whittaker.)
Peak widths: Did peaks broaden? (FWHM should remain approximately constant.)
Peak areas: Integration under peaks should be approximately preserved, even if heights change.
Resolution: Can you still resolve closely spaced peaks? Over-smoothing merges them.
Practical tips
Before smoothing
Diagnose your noise. Is it random (Gaussian), signal-dependent (shot noise), or systematic (baseline drift, fringes, spikes)? Smoothing is designed for random noise only.
Consider ensemble averaging first. If you can measure multiple scans, averaging is distortion-free and should be your first line of defense.
Remove spikes before smoothing. Cosmic ray spikes and outlier points should be detected and removed (e.g., by median filtering) before applying any smoothing method. A single spike can corrupt a large window of averaged values.
During smoothing
Start with mild smoothing and increase gradually. It is easier to add more smoothing than to undo over-smoothing.
Use the same preprocessing for all samples. Calibration, validation, and test sets must be smoothed with identical parameters. Inconsistent smoothing is a subtle but serious source of error.
Always compare before and after. Plot the original and smoothed spectra on the same axes. If you cannot see a difference, the smoothing is too mild. If the features look different, it may be too strong.
After smoothing
Check the residuals. Plot (original - smoothed). Residuals should look like random noise with no structure. If you see peak-shaped features in the residuals, you are removing signal.
Verify peak metrics. Compare peak positions, heights, and widths before and after. Significant changes indicate over-smoothing or an inappropriate method.
Document your parameters. Record the method, window size, polynomial order, or any other parameters used. Future users (including yourself six months from now) need to know exactly what was done.
Code implementation
The following example demonstrates how to calculate SNR and compare smoothing methods on a synthetic noisy spectrum. For detailed code for each individual method, see the dedicated articles linked above.
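A minimal sketch along those lines, assuming numpy and scipy (the synthetic two-peak spectrum and all parameter choices are illustrative; RMSE against the known truth stands in for SNR here, since the true signal is available in simulation):

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.signal import savgol_filter

# Synthetic spectrum: two Gaussian peaks on a flat baseline, plus noise
rng = np.random.default_rng(42)
wavelength = np.linspace(400, 800, 1000)
true = (1.0 * np.exp(-((wavelength - 550) / 15) ** 2)
        + 0.6 * np.exp(-((wavelength - 650) / 10) ** 2))
noisy = true + rng.normal(0, 0.05, wavelength.size)

def rmse(a, b):
    """Root-mean-square error between two spectra."""
    return np.sqrt(np.mean((a - b) ** 2))

smoothers = {
    "moving average": np.convolve(noisy, np.ones(9) / 9, mode="same"),
    "gaussian": gaussian_filter1d(noisy, sigma=2),
    "savitzky-golay": savgol_filter(noisy, window_length=11, polyorder=3),
}

print(f"noisy spectrum: RMSE = {rmse(noisy, true):.4f}")
for name, smoothed in smoothers.items():
    print(f"{name}: RMSE = {rmse(smoothed, true):.4f}")
```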
References
[1] Savitzky, A., & Golay, M. J. E. (1964). Smoothing and differentiation of data by simplified least squares procedures. Analytical Chemistry, 36(8), 1627-1639.
[2] Eilers, P. H. C. (2003). A perfect smoother. Analytical Chemistry, 75(14), 3631-3636.
[3] Rinnan, A., van den Berg, F., & Engelsen, S. B. (2009). Review of the most common pre-processing techniques for near-infrared spectra. TrAC Trends in Analytical Chemistry, 28(10), 1201-1222.
[4] Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27(3), 379-423.
[5] Wiener, N. (1949). Extrapolation, Interpolation, and Smoothing of Stationary Time Series. MIT Press.
[6] Mark, H., & Workman, J. (2007). Chemometrics in Spectroscopy. Academic Press.
[7] Brereton, R. G. (2003). Chemometrics: Data Analysis for the Laboratory and Chemical Plant. Wiley.
[8] Ingle, J. D., & Crouch, S. R. (1988). Spectrochemical Analysis. Prentice Hall.
[9] Donoho, D. L. (1995). De-noising by soft-thresholding. IEEE Transactions on Information Theory, 41(3), 613-627.
[10] Martens, H., & Naes, T. (1989). Multivariate Calibration. Wiley.