In this article, you’ll discover everything you need to know about Partial Least Squares. As the name indicates, Partial Least Squares is related to Ordinary Least Squares: the standard mathematical approach for fitting a Linear Regression.

The goal of Linear Regression is to model the dependence relationship between one dependent (target) variable and multiple independent (explanatory) variables. Ordinary Least Squares works great for this, as long as you meet the assumptions of Linear Regression.

In some domains, it may happen that you have a lot of independent variables in your model, of which many are correlated with other independent variables…

Multidimensional Scaling is a family of statistical methods that focus on creating mappings of items based on distance. Inside Multidimensional Scaling, there are methods for different types of data:

**Metric Multidimensional Scaling**, also called**Principal Coordinate Analysis**, is a subtype of Multidimensional Scaling that deals with*numerical distances*, in which there is*no measurement error*(you have exactly one distance measure for each pair of items).**Nonmetric Multidimensional Scaling**is a subcategory of Multidimensional Scaling that deals with*non-numerical distances between items*, in which there is*no measurement error*(you have exactly one distance measure for each pair of items).**…**

In this article, you will discover Principal Coordinate Analysis (PCoA), also known as **Metric Multidimensional Scaling (metric MDS)**. You’ll learn what Principal Coordinates Analysis is, when to use it, and how to implement it on a real example using Python and/or R.

Principal Coordinates Analysis is a statistical method that converts **data on distances between items** into **map-based visualization** of those items.

The generated mappings can be used for better **understanding which items are close to each other**, and which are different. It can also allow you to identify groups or clusters.

Before we get into the details, let’s first…

In this article, you will learn everything you need to know about Canonical Correlation Analysis. Canonical Correlation Analysis is a Multivariate Statistics technique that allows you to **analyze correlations between two datasets**.

Canonical Correlation Analysis can be used to model the correlations between two datasets in two ways:

Structural Equation Models are models that explain **relationships between measured variables and latent variables, **and **relationships between latent variables. **Latent variables are variables that, as humans, we understand as a concept, but that *cannot be measured directly*.

A great example of a latent variable that cannot really be measured directly is *Intelligence. W*e have plenty of school exams, IQ tests, psych tests to measure a concept like intelligence, but they always come down to:

- measuring a score on Multiple Choice questionnaires in multiple domains
- converting those scores into an estimate of Intelligence

So you will need a model to **convert…**

Imagine that you are interested in tracking the temperature throughout the day in your vegetable garden. You measure the temperature exactly every hour, and you end up with the following 24 temperature measurements:

Linear regression is a statistical model that allows to explain a dependent variable **y** based on variation in one or multiple independent variables (denoted **x**). It does this based on linear relationships between the independent and dependent variables.

In this article, I will quickly go over the linear regression model and I will cover the five assumptions that you need to check when doing a linear regression. I will cover theory and implementations in both R and Python.

Let’s start by describing a common use case for linear regression. We will be looking at a model in which we explain…

PCA, short for Principal Component Analysis, and Factor Analysis, are two statistical methods that are often covered together in classes on Multivariate Statistics.

In this article, you will discover the mathematical and practical differences between the two methods.

Multivariate Statistics is a group of statistical methods that focus on studying multiple variables together while focusing on the variation that those variables have in common.

Multivariate Statistics deals with the treatment of data sets with a large number of dimensions.

Its goals are therefore different from supervised modeling, but also different from segmentation and clustering models.

There are many models in…

In this article, I will explain how to add a basic security level to APIs that have been made using AWS API Gateway. This will be done through adding tokens.

Tokens are codes that you need to send with an API request, and that work more or less like a password. If your token allows you to access the data, the API will send you the data. Else it will send you an error.

To follow alongwith the example, you can use the AWS API Gateway API made in this previous article, which send back a randomly generated password. …

In this article, I will give a detailed overview of waiting line models. I will discuss when and how to use waiting line models from a business standpoint. In the second part, I will go in-depth into multiple specific queuing theory models, that can be used for specific waiting lines, as well as other applications of queueing theory.

Waiting line models are mathematical models used to study waiting lines. Another name for the domain is queuing theory.

Waiting lines can be set up in many ways. In a theme park ride, you generally have one line. In the supermarket, you…

Data Scientist — Machine Learning — R, Python, AWS, SQL