Least Squares Optimised Fit Using Python: A Basic Guide

Praveen Jayasuriya
Published in Analytics Vidhya
4 min read · Feb 15, 2021

Model vs data

How do we choose a reasonable starting point when modeling data? In statistical inference this question matters a great deal, because we typically begin an analysis with a fairly simple model that represents the system or process with reasonable accuracy. That model can then be passed to nested sampling, or an equivalent method, to obtain a posterior pdf or an estimate.

Often, when building complex models from data, it is useful to start with a least-squares optimized model, fitted to the experimental data, as a baseline.

I will consider a fairly straightforward example: instrument data that appears to fit a mixture model comprising three Gaussian distributions and a single chi-squared distribution. The choice of these distributions is specific to the domain being considered; for your own dataset, the mixture can be built from whatever functions are relevant.
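
To make the idea of such a mixture model concrete, here is a minimal sketch of how three Gaussians plus one chi-squared component could be written and fitted with SciPy's `least_squares`. Everything specific in it is an assumption made for illustration: the function names `mixture` and `residuals`, the parameter layout, the bounds, and above all the synthetic data, which is only a stand-in and not the instrument data discussed in this article.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.stats import chi2, norm


def mixture(x, params):
    """Three weighted Gaussians plus one weighted chi-squared pdf.

    Hypothetical parameter layout (not taken from the article):
    (amplitude, mean, sigma) for each Gaussian, then the chi-squared
    amplitude and its degrees of freedom.
    """
    a1, mu1, s1, a2, mu2, s2, a3, mu3, s3, a4, df = params
    return (
        a1 * norm.pdf(x, mu1, s1)
        + a2 * norm.pdf(x, mu2, s2)
        + a3 * norm.pdf(x, mu3, s3)
        + a4 * chi2.pdf(x, df)
    )


def residuals(params, x, y):
    """Difference between model and data; least_squares minimises its squared sum."""
    return mixture(x, params) - y


# Synthetic stand-in data (assumed values, NOT the article's instrument data).
rng = np.random.default_rng(0)
x = np.linspace(0.01, 3.0, 200)
true_params = np.array([0.3, 0.5, 0.10, 0.4, 1.0, 0.15, 0.2, 1.4, 0.10, 0.5, 2.5])
y = mixture(x, true_params) + rng.normal(0.0, 0.01, x.size)

# A rough initial guess and loose bounds keep sigmas and amplitudes physical.
x0 = np.array([0.25, 0.45, 0.12, 0.35, 1.05, 0.12, 0.25, 1.35, 0.12, 0.4, 3.0])
lower = [0.0, 0.0, 1e-3, 0.0, 0.0, 1e-3, 0.0, 0.0, 1e-3, 0.0, 1.0]
upper = [np.inf, 3.0, 1.0, np.inf, 3.0, 1.0, np.inf, 3.0, 1.0, np.inf, 10.0]

fit = least_squares(residuals, x0, args=(x, y), bounds=(lower, upper))

# Compare the root-mean-square residual before and after optimisation.
initial_rms = np.sqrt(np.mean(residuals(x0, x, y) ** 2))
final_rms = np.sqrt(np.mean(fit.fun ** 2))
print("fitted parameters:", np.round(fit.x, 3))
print("RMS residual: initial", initial_rms, "final", final_rms)
```

The fitted parameter vector `fit.x` is the kind of baseline that could then seed a more involved analysis. With a mixture like this the result can depend noticeably on the initial guess, so it is worth trying a few starting points.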

The instrument data looks like this:

Instrument data

The plot looks quite asymmetric with a long tail beyond 1.5. The black vertical lines along the x-axis are a rug plot…
