Data Science One on One — Part 5: Population Regression Function and Error Term

Nov 21, 2021

“There will be deviations from the expected value of the dependent variable called error terms, which represent the effect of independent variables not included in the population regression function.”

Population Regression Function

Assuming that the 30 observations from the previous example represent the population of hedge funds that are in the same class (i.e., have the same basic investment strategy), their relationship can provide a population regression function.

Such a function consists of parameters called regression coefficients. The regression equation (or function) includes an intercept term and one slope coefficient for each independent variable.

For this simple two-variable case, the function is:

E(return | lockup period) = B0 + B1×(lockup period)

Or more generally:

E(Yi | Xi) = B0 + B1×(Xi)

In the equation, B0 is the intercept coefficient, which is the expected value of Y (here, the return) when X = 0. B1 is the slope coefficient, which is the expected change in Y for a one-unit change in X.

In this example, each additional year of lockup raises a hedge fund's expected return by B1.
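To make the roles of the intercept and slope concrete, here is a minimal sketch of the population regression function as a plain Python function. The coefficient values (B0 = 2.0, B1 = 1.5) and the function name `expected_return` are hypothetical and chosen only for illustration; they are not taken from the hedge fund data.

```python
def expected_return(lockup_years, b0=2.0, b1=1.5):
    """E(return | lockup period) = B0 + B1 * (lockup period).

    b0 and b1 are hypothetical population coefficients.
    """
    return b0 + b1 * lockup_years


# Intercept: the expected return when the lockup period is zero.
at_zero = expected_return(0)

# Slope: the change in expected return for one additional year of lockup.
one_more_year = expected_return(6) - expected_return(5)

print("E(return | lockup = 0):", at_zero)
print("Change per extra year of lockup:", one_more_year)
```

Under these assumed values, the first quantity equals B0 and the second equals B1, which is exactly how the two coefficients are interpreted in the text.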

The Error Term

There is a dispersion of Y-values around each conditional expected value. The difference between each Y and its corresponding conditional expectation (i.e., the line that fits the data) is the error term or noise component denoted as εi.

εi = Yi - E(Yi | Xi)

The deviation from the expected value is the result of factors other than the included X-variable. One way to break down the equation is to say that

E(Yi | Xi) = B0 + B1 × Xi

is the deterministic or systematic component, and εi is the nonsystematic or random component. The error term provides another way of expressing the population regression function:

Yi = B0 + B1 × Xi + εi
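The split into a systematic component and a random component can be sketched with simulated data. Everything in this example is an assumption made for illustration: the coefficients, the normally distributed noise, and the 30 simulated lockup periods are not the article's actual hedge fund observations.

```python
import numpy as np

rng = np.random.default_rng(0)

B0, B1 = 2.0, 1.5  # hypothetical population coefficients
n = 30             # same count as the example population, but simulated

# Simulated lockup periods (1 to 10 years) and random noise.
x = rng.integers(1, 11, size=n).astype(float)
eps = rng.normal(0.0, 2.0, size=n)

# Yi = B0 + B1 * Xi + eps_i
y = B0 + B1 * x + eps

# Systematic component: the conditional expectation E(Yi | Xi).
cond_mean = B0 + B1 * x

# Error term recovered by definition: eps_i = Yi - E(Yi | Xi).
recovered = y - cond_mean

print("Mean of the error terms:", recovered.mean())
```

Because the simulation uses known coefficients, subtracting the conditional expectation from each Y returns the noise that was added, up to floating-point rounding. With real data the population coefficients are unknown and have to be estimated from a sample.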

The error term represents effects from independent variables not included in the model.

In the case of the hedge fund example, εi is probably a function of the individual manager’s unique trading tactics and management activities within the style classification. Variables that might explain this error term are the number of positions and trades a manager makes over time.

Another variable might be the manager's years of experience. A data scientist may need to include several of these variables (e.g., trading style and experience) in the population regression function to reduce the error term by a noticeable amount. In practice, limiting an equation to the one or two independent variables with the most explanatory power is often the best choice.
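The effect of adding an omitted variable can be illustrated with a small simulation. This is a sketch under stated assumptions, not the article's data: returns are assumed to depend on lockup period and on a centered "experience" variable that is independent of lockup, and all coefficients and distributions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)

n = 10_000                  # large simulated population (hypothetical)
B0, B1, B2 = 2.0, 1.5, 0.8  # hypothetical coefficients

lockup = rng.uniform(1.0, 10.0, size=n)
# Experience as a deviation from its average (mean zero), independent of lockup.
experience = rng.normal(0.0, 3.0, size=n)
u = rng.normal(0.0, 1.0, size=n)  # remaining unexplained noise

ret = B0 + B1 * lockup + B2 * experience + u

# One-variable function: experience is left out, so it ends up in the error term.
eps_one = ret - (B0 + B1 * lockup)

# Two-variable function: experience is included, leaving only u.
eps_two = ret - (B0 + B1 * lockup + B2 * experience)

var_one = eps_one.var()
var_two = eps_two.var()

print("Error variance with lockup only:", var_one)
print("Error variance with lockup and experience:", var_two)
```

In this setup the one-variable error term equals B2 × experience + u, so its variance is larger than that of u alone. How much a given variable shrinks the error term in practice depends on its explanatory power, which is why adding weak variables brings little benefit.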

Summary

A population regression line indicates the expected value of a dependent variable conditional on one or more independent variables:

E(Yi | Xi) = B0 + B1 × Xi

The difference between an actual value of the dependent variable and its conditional expected value is the error term, or noise component, denoted εi:

εi = Yi - E(Yi | Xi)

About the Author

Roi Polanitzer, PDS, ADL, MLS, PDA, CPD, F.IL.A.V.F.A., FRM, is a data scientist with extensive experience in solving machine learning problems, such as: regression, classification, clustering, recommender systems, anomaly detection, text analytics & NLP, and image processing. Mr. Polanitzer is the Owner and Chief Data Scientist of Prediction Consultants — Advanced Analysis and Model Development, a data science firm headquartered in Rishon LeZion, Israel. He is also the Owner and Chief Appraiser of Intrinsic Value — Independent Business Appraisers, a business valuation firm that specializes in corporates, intangible assets and complex financial instruments valuation.

Over more than 16 years, he has performed data science projects such as: regression (e.g., house prices, customer lifetime value (CLV), and time-to-failure), classification (e.g., market targeting, customer churn), probability (e.g., spam filters, employee churn, fraud detection, loan default, and disease diagnostics), clustering (e.g., customer segmentation, and topic modeling), dimensionality reduction (e.g., p-values, itertools Combinations, principal components analysis, and autoencoders), recommender systems (e.g., products for a customer, and advertisements for a surfer), anomaly detection (e.g., supermarkets’ revenue and profits), text analytics (e.g., identifying market trends, web searches), NLP (e.g., sentiment analysis, cosine similarity, and text classification), image processing (e.g., image binary classification of dogs vs. cats, and image multiclass classification of digits in sign language), and signal processing (e.g., audio binary classification of males vs. females, and audio multiclass classification of urban sounds).

Mr. Polanitzer holds various professional designations, such as a global designation called “Financial Risk Manager” (FRM, which indicates that its holder is proficient in developing, implementing and validating statistical models and mathematical algorithms such as K-Means, SVM and KNN for credit risk measurement and management) from the Global Association of Risk Professionals (GARP), a designation called “Fellow Actuary” (F.IL.A.V.F.A., which indicates that its holder is proficient in developing, implementing and validating statistical models and mathematical algorithms such as GLM, RF and NN for determining premiums in general insurance) from the Israel Association of Valuators and Financial Actuaries (IAVFA), and a designation called “Certified Risk Manager” (CRM, which indicates that its holder is proficient in developing, implementing and validating statistical models and mathematical algorithms such as DT, NB and PCA for operational risk management) from the Israeli Association of Risk Managers (IARM).

Mr. Polanitzer studied actuarial science (i.e., implementation of statistical and data mining techniques for solving time-series analysis, dimensionality reduction, optimization and simulation problems) at the prestigious 250-hour training program of the University of Haifa, financial risk management (i.e., building statistical predictive and probabilistic models for solving regression, classification, clustering and anomaly detection) at the prestigious 250-hour training program of Ariel University, and machine learning and deep learning (i.e., building recommender systems and training neural networks for image processing and NLP) at the prestigious 500-hour training program of John Bryce College.

He has completed various professional training courses at John Bryce College, such as: “Introduction to Machine Learning, AI & Data Visualization for Managers and Architects”, “Professional training in Practical Machine Learning, AI & Deep Learning with Python for Algorithm Developers & Data Scientists”, “Azure Data Fundamentals: Relational Data, Non-Relational Data and Modern Data Warehouse Analytics in Azure”, and “Azure AI Fundamentals: Azure Tools for ML, Automated ML & Visual Tools for ML and Deep Learning”.

Mr. Polanitzer has also completed various professional training courses at the Professional Data Scientists’ Israel Association, such as: “Neural Networks and Deep Learning”, “Big Data and Cloud Services”, and “Natural Language Processing and Text Mining”.

Written by Roi Polanitzer