Smoothing data with Julia’s @generated functions

Another use case for generated functions

Smoothing data using a Savitzky-Golay filter

Implementation of Savitzky-Golay filters in Julia using generated functions.

What this code do?

  1. The generated function does some linear algebra to determine what values to insert into the desired method body, first calculating the interpolation coefficients C from M and N by constructing J, the Jacobian (a.k.a. design matrix) of the filter, and then extracting the first row of its pseudoinverse by doing a least-squares solve on the canonical basis vector e1. The interpolation coefficients are then spliced into the generated expression using the $(C[k]) dollar-sign syntax for expression interpolation.
  2. 2M conditional additions are also inserted into the abstract syntax tree in expr before the function body gets compiled. M determines the number of terms in the interpolating expression, and at the end points the expression must be truncated to avoid going out of bounds when indexing into the data. (expr.args[6].args[2].args[2] is the particular tree traversal that gets us to the appropriate place to insert new leaf nodes into the AST.)
  3. Some simple type arithmetic is needed to determine To, the element type of the output vector. Not all input vectors have element types that are closed under the interpolation process, which require taking linear combinations with floating point coefficients. (The Savitzky-Golay coefficients are actually rational, but proving this to be true remains elusive…) Since the output type can be determined when the generated function is called, there is no need to do the type computation at run time; instead, it can be hoisted into the generated function as part of the code generation process.

Call overloading

Inspecting exprs as they are constructed

expr = quote # /Users/jiahao/savitzkygolay.jl, line 28:
n = size(data,1) # /Users/jiahao/savitzkygolay.jl, line 29:
smoothed = zeros(Float64,n) # /Users/jiahao/savitzkygolay.jl, line 30:
@inbounds for i = eachindex(smoothed)
if i — 2 ≥ 1 # /Users/jiahao/savitzkygolay.jl, line 39:
smoothed[i] += -0.08571428571428567 * data[i — 2]
end
if i — 1 ≥ 1 # /Users/jiahao/savitzkygolay.jl, line 39:
smoothed[i] += 0.34285714285714275 * data[i — 1]
end # /Users/jiahao/savitzkygolay.jl, line 31:
smoothed[i] += 0.4857142857142856 * data[i]
if i + 1 ≤ n # /Users/jiahao/savitzkygolay.jl, line 44:
smoothed[i] += 0.3428571428571427 * data[i + 1]
end
if i + 2 ≤ n # /Users/jiahao/savitzkygolay.jl, line 44:
smoothed[i] += -0.08571428571428563 * data[i + 2]
end
end # /Users/jiahao/savitzkygolay.jl, line 33:
smoothed
end
SavitzskyGolayFilter{2,3}([1:11]) = [0.9142857142857141,1.9999999999999996,3.0,3.9999999999999987,4.99999999999
9998,5.999999999999998,6.999999999999998,7.999999999999998,8.999999999999998,11.028571428571425,7.9999999999999
98]
expr = quote # /Users/jiahao/savitzkygolay.jl, line 28:
n = size(data,1) # /Users/jiahao/savitzkygolay.jl, line 29:
smoothed = zeros(Float64,n) # /Users/jiahao/savitzkygolay.jl, line 30:
@inbounds for i = eachindex(smoothed)
if i — 1 ≥ 1 # /Users/jiahao/savitzkygolay.jl, line 39:
smoothed[i] += 0.3333333333333332 * data[i — 1]
end # /Users/jiahao/savitzkygolay.jl, line 31:
smoothed[i] += 0.3333333333333333 * data[i]
if i + 1 ≤ n # /Users/jiahao/savitzkygolay.jl, line 44:
smoothed[i] += 0.3333333333333334 * data[i + 1]
end
end # /Users/jiahao/savitzkygolay.jl, line 33:
smoothed
end
SavitzskyGolayFilter{1,1}([1:11]) = [1.0000000000000002,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,6.999999999999998]
  1. The filter reasonably reproduces the original [1:11;] data set with reasonable fidelity, except at the very end where it starts to clamp the data. This is a well known limitation of polynomial interpolation that results from truncation — there simply aren’t enough data points at the end points for a reliable interpolation using the same polynomial used for the bulk. A proper treatment of the end points is needed.
  2. SavitzskyGolayFilter{M,1} reduces to a simple moving average. Thus one interpretation of the Savitzsky-Golay filter is that it generalizes the moving average to also preserve higher moments other than just the mean.

Similar constructs in other languages

Multistaged programming (MSP)

Parameteric polymorphism of method families

--

--

--

AI Research Director, JPMorgan AI Research, New York (HIRING US/UK). Also Research Scientist, MIT CSAIL; Problematic Developer of the Julia language.

Love podcasts or audiobooks? Learn on the go with our new app.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Jiahao Chen [he/him] / 陳家豪 [他]

Jiahao Chen [he/him] / 陳家豪 [他]

AI Research Director, JPMorgan AI Research, New York (HIRING US/UK). Also Research Scientist, MIT CSAIL; Problematic Developer of the Julia language.

More from Medium

Apache Spark : Getting Less Data While Performing Comparison Operators.

Azure Data Factory Basic Concepts

Glicko rating system: A new method of evaluating sprinting performance

Installing Pyspark in M1 Mac