# Smoothing data using a Savitzky-Golay filter

## What this code do?

1. The generated function does some linear algebra to determine what values to insert into the desired method body, first calculating the interpolation coefficients C from M and N by constructing J, the Jacobian (a.k.a. design matrix) of the filter, and then extracting the first row of its pseudoinverse by doing a least-squares solve on the canonical basis vector e1. The interpolation coefficients are then spliced into the generated expression using the \$(C[k]) dollar-sign syntax for expression interpolation.
2. 2M conditional additions are also inserted into the abstract syntax tree in expr before the function body gets compiled. M determines the number of terms in the interpolating expression, and at the end points the expression must be truncated to avoid going out of bounds when indexing into the data. (expr.args[6].args[2].args[2] is the particular tree traversal that gets us to the appropriate place to insert new leaf nodes into the AST.)
3. Some simple type arithmetic is needed to determine To, the element type of the output vector. Not all input vectors have element types that are closed under the interpolation process, which require taking linear combinations with floating point coefficients. (The Savitzky-Golay coefficients are actually rational, but proving this to be true remains elusive…) Since the output type can be determined when the generated function is called, there is no need to do the type computation at run time; instead, it can be hoisted into the generated function as part of the code generation process.

## Inspecting exprs as they are constructed

`expr = quote # /Users/jiahao/savitzkygolay.jl, line 28:    n = size(data,1) # /Users/jiahao/savitzkygolay.jl, line 29:    smoothed = zeros(Float64,n) # /Users/jiahao/savitzkygolay.jl, line 30:    @inbounds for i = eachindex(smoothed)            if i — 2 ≥ 1 # /Users/jiahao/savitzkygolay.jl, line 39:                smoothed[i] += -0.08571428571428567 * data[i — 2]            end            if i — 1 ≥ 1 # /Users/jiahao/savitzkygolay.jl, line 39:                smoothed[i] += 0.34285714285714275 * data[i — 1]            end # /Users/jiahao/savitzkygolay.jl, line 31:            smoothed[i] += 0.4857142857142856 * data[i]            if i + 1 ≤ n # /Users/jiahao/savitzkygolay.jl, line 44:                smoothed[i] += 0.3428571428571427 * data[i + 1]            end            if i + 2 ≤ n # /Users/jiahao/savitzkygolay.jl, line 44:                smoothed[i] += -0.08571428571428563 * data[i + 2]            end    end # /Users/jiahao/savitzkygolay.jl, line 33:    smoothedendSavitzskyGolayFilter{2,3}([1:11]) = [0.9142857142857141,1.9999999999999996,3.0,3.9999999999999987,4.999999999999998,5.999999999999998,6.999999999999998,7.999999999999998,8.999999999999998,11.028571428571425,7.999999999999998]expr = quote # /Users/jiahao/savitzkygolay.jl, line 28:    n = size(data,1) # /Users/jiahao/savitzkygolay.jl, line 29:    smoothed = zeros(Float64,n) # /Users/jiahao/savitzkygolay.jl, line 30:    @inbounds for i = eachindex(smoothed)            if i — 1 ≥ 1 # /Users/jiahao/savitzkygolay.jl, line 39:                smoothed[i] += 0.3333333333333332 * data[i — 1]            end # /Users/jiahao/savitzkygolay.jl, line 31:            smoothed[i] += 0.3333333333333333 * data[i]            if i + 1 ≤ n # /Users/jiahao/savitzkygolay.jl, line 44:                smoothed[i] += 0.3333333333333334 * data[i + 1]            end    end # /Users/jiahao/savitzkygolay.jl, line 33:    smoothedendSavitzskyGolayFilter{1,1}([1:11]) = [1.0000000000000002,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,6.999999999999998]`
1. The filter reasonably reproduces the original [1:11;] data set with reasonable fidelity, except at the very end where it starts to clamp the data. This is a well known limitation of polynomial interpolation that results from truncation — there simply aren’t enough data points at the end points for a reliable interpolation using the same polynomial used for the bulk. A proper treatment of the end points is needed.
2. SavitzskyGolayFilter{M,1} reduces to a simple moving average. Thus one interpretation of the Savitzsky-Golay filter is that it generalizes the moving average to also preserve higher moments other than just the mean.

# Similar constructs in other languages

--

--

--

## More from Jiahao Chen [he/him] / 陳家豪 [他]

AI Research Director, JPMorgan AI Research, New York (HIRING US/UK). Also Research Scientist, MIT CSAIL; Problematic Developer of the Julia language.

Love podcasts or audiobooks? Learn on the go with our new app.

## Jiahao Chen [he/him] / 陳家豪 [他]

AI Research Director, JPMorgan AI Research, New York (HIRING US/UK). Also Research Scientist, MIT CSAIL; Problematic Developer of the Julia language.