Heatmap Plots: Spatial Interpolation of Sensor Data

Rana Basheer
EdyzaIoT
Published in
7 min readOct 12, 2020

Heatmap like the one shown below for a 2000 sq.ft. grow facility is generated from 63 Edyza environmental sensors that are deployed around the grow room. Each white dot in the image below represents the position of a sensor collecting Barometric Pressure, Relative Humidity, and Temperature.

Humidity heatmap

Heatmaps are a powerful visualization tool that helps infer values of a parameter of interest from locations where a sensor could not be placed due to practical reasons. Additionally, overlaying spatial information with sensor data helps to show trends within the grow room, understand the interplay of the sensed parameters with physical structures, etc. E.g., in the above plot, the regions near the walls have higher humidity than the middle canopy area, which could cause condensation build-up on walls when the temperature falls below the dew point.

To create a smooth plot like the one above would require filling values (spatial interpolation) at gaps in sensor locations. There are several methods at disposal [1, 2, 3, 4], but we will only explore the two most commonly used techniques, Inverse Distance Weighting (IDW) [2] and Kriging [4]. They fall under two extremes of computational complexity, with IDW being a computationally easier solution that only needs current sensor data for its calculation. In contrast, Kriging is a statistical inference method that requires manipulating large matrices, analyzing historical data for noise estimation, etc. which can be computationally complex. However, Kriging's main advantage is that the solution it gives is the Best Linear Unbiased Estimate (BLUE) [4] for filling data, which is mathematically the optimal solution under certain assumptions, which I will get into in detail later. IDW, on the other hand, has no statistical guarantees on accuracy, and it was developed as a data smoothing technique for computer graphics [2].

The primary objective is to show how we could derive a new interpolation model that provides Kriging quality statistical guarantees but has the computational simplicity of IDW, thereby making it the best of the two worlds.

IDW Vs. Kriging

The similarity in IDW and Kriging is that they both use a linear model to arrive at the sensor data Yfor an unmeasured location l=(x,y,z) where x, y, and z are the Cartesian coordinates. If X= [X₁, X₂,…Xᵢ,..Xₙ] represent a vector of values from sensors located at positions marked I ∈ {1,2, …, n} then mathematically Yₗ can be expressed as

Yₗ = aₗ₁X₁+aₗ₂X₂+⋅⋅⋅⋅+aₗₙXₙ = ∑ⁿᵢ aₗᵢXᵢ

where aₗ=[aₗ₁,aₗ₂,…aₗᵢ,…,aₗₙ] are coefficients representing each measured sensor data's weight in estimating the final sensor value at an unmeasured location l.

The difference between IDW and Kriging arises when IDW makes a deterministic assumption about the coefficients aₗᵢ based on how far that unmeasured point l is from sensor i. In contrast, Kriging derives these coefficients by posing the above linear model as a least-squares problem where the coefficients aₗᵢ are the values that will result in the lowest estimation error in Yₗ.

To put things mathematically, for IDW, the coefficient aₗᵢ = M/rᵖₗᵢ where rₗᵢ is the radial distance from unmeasured location l to the sensor at position i, p is the power value that determines how fast the coefficient tapers off with distance, and M is the normalization factor that ensures ∑ⁿᵢ aₗᵢ=1. Therefore under IDW

Yₗ = M∑ⁿᵢ Xᵢ/rᵖₗᵢ

The above equation is very simple and straightforward. The only inputs needed to compute Yₗ at an unmeasured location l are the current sensor data vector X from the deployed sensors and their location. However, as mentioned earlier, this solution has no mathematical guarantees for accuracy.

Kriging formulates the problem as least-squares optimization with a constraint, i.e., the linear model has an estimation error (ϵ) term at the end, Yₗ = ∑ⁿᵢ aₗᵢXᵢ+ϵ and the constraint is ∑ⁿᵢ aₗᵢ=1 to result in unbiased estimates. The solution that will result in the lowest estimation error (ϵ) is [5]

(XX) is the covariance matrix between sensor data, (XYₗ) is the covariance vector between sensor data and a possible sensor value from the unmeasured location l, and λ is the Lagrange multiplier used for constrained optimization. Since we don’t have an actual sensor at the unmeasured location, (XYₗ) value must be realized from a covariance/distance model, which would look similar in any standard Kriging textbook to the graph below.

Standard Covariance/Distance Model Used For Kriging

In the above model, points that are close to each other have very similar sensor values (high covariance), and as the distance between an unmeasured point and a sensor increases, their similarity drops(low covariance). Additionally, the above model is directionless. The only factor that affects the covariance is the separation between the unmeasured point and the sensor and not the direction of that point from the sensor. However, when we looked at the indoor growth data, we quickly realized that the above classical Kriging model is completely useless.

Lights Control Everything in Indoor Grows

The primary driver that changes temperature or humidity in a grow-room is the lights. These lights are operated on a schedule, and they turn on and off at the same time. Consequently, all the sensed values will show very high covariance even if placed at two diagonal ends of a grow room. The below plot shows the highly correlated temperature change over a 24 hour period when the lights go through their daily cycle.

Highly correlated temperature variation

The highly correlated sensor data possess a unique challenge to the least-square solution. The covariance matrix (XX) is close to being singular, resulting in (XX)⁻¹ being non-invertible or absurdly huge value. Additionally, we discovered that (XᵗYₗ) does not have a smooth uniform variation within the grow. Our data showed that the covariance between sensor values not only changed unevenly with distance but also had a robust directional component. This necessitated us to explore alternate models for describing the observed covariance change in an indoor grows.

IDW for Covariance Modeling in Indoor Grows

To model the uneven covariance in an indoor grow, we decided to use a smoothing function like IDW. As explained earlier, smooth functions help fill gaps without making any assumption about the filled data. In essence, we kicked the can one step down by using a smoothing function for the second moment (covariance) rather than the first moment (mean), as is done in regular IDW. Therefore, the covariance between sensor Xᵢ and Yₗ (an unmeasured location) can be derived from our new model as

Cov(XᵢYₗ) = M[Cov(X₁,Xᵢ)/rᵖ₁ₗ+Cov(X₂,Xᵢ)/rᵖ₂ₗ+⋅ ⋅ ⋅ + Cov(Xₙ,Xᵢ)/rᵖₙₗ]

Cov(XᵢYₗ) =M∑ ⱼⁿCov(Xⱼ, Xᵢ), ∀ j∈ (1⋅ ⋅ ⋅ n)

where M is the normalization factor, rⱼₗ is the radial distance between sensor j∈{1,2,⋅⋅⋅,n} and unmeasured location l and p is the power factor that determines how fast the covariance tapers off with distance such that if rⱼₗ =0 then Cov(XᵢYₗ)=Cov(Xᵢ, Xⱼ) since Yₗ=Xⱼ.

The above covariance model in a grow room, where the sensor data are highly correlated, gives rise to an exciting solution for coefficients aₗᵢ that is needed for estimating Yₗ from the linear model presented at the start. The solution was computationally found out to be aₗᵢ = M/rᵖₗᵢ . In short, assuming an IDW model for the covariance/distance model for kriging under highly correlated sensor data results in an IDW style solution for coefficients that estimates Yₗ.

aₗᵢ = M/rᵖₗᵢ is the Best Linear Unbiased Estimator (BLUE) under an IDW covariance model

Having a covariance/distance model now enables us to come up with boundaries for our estimation error. The estimation error variance in Yₗ is given by

where σ²=Cov(X²ᵢ) is the variance of the sensor data, which is assumed to be more or less the same for all the sensors as this is determined by the accuracy of sensor modules used in our wireless sensors solution.

An exploded view of our wireless Edyza Environment sensor is shown below. You can read up on the sensor specification here [EZ-SEV101]

References

  1. W. R. Franklin. Triangulated irregular network to approximate digital terrain, Section 2.3, Research Interests. Technical report, Electrical, Computer, and Systems Engineering Dept., Rensselaer Polytechnic Institute, Troy, NY, 1994. Manuscript and code available on ftp://ftp.cs.rpi.edu/pub/franklin/
  2. Shepard, Donald (1968). “A two-dimensional interpolation function for irregularly-spaced data.” Proceedings of the 1968 ACM National Conference. pp. 517–524.
  3. Lusting, L. K., 1969. Trend-surface analysis of the Basin and Range Province, and some geomorphic implications, U.S. Geol. Survey Prof. Paper 500-D, 70p
  4. Krige, D.G, A statistical approach to some mine valuations and allied problems at the Witwatersrand, Master’s thesis of the University of Witwatersrand, 1951
  5. Bailey, Trevor, and Gatrell, Anthony (1995). “Ordinary Kriging.” Interactive spatial data analysis. Longman Scientific & Technical, New York, Ch.5.5

--

--