# ContextBase Predictive Analytics

## John Akwei, ECMp ERMp Data Scientist

### Predictive Analytics Example 1: Linear Regression

Simple linear regression predicts the value of a response variable from a single explanatory variable by fitting a straight line to the observed data.

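```r
# `revenue` is assumed to be loaded already, with the columns
# `market_potential` and `price_index`.
revenue <- data.frame(revenue)
model <- lm(market_potential ~ price_index, revenue)
cat("The Intercept =", model$coefficients[1])
## The Intercept = 15.21788

# Predict the market potential for a new price index value.
test <- data.frame(price_index = 4.57592)
result <- predict(model, test)
```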

### Example 1 — Linear Regression Conclusion:

```r
cat("For a Price Index of ", test$price_index,
    ", the predicted Market Potential = ", round(result, 2), ".", sep="")
## For a Price Index of 4.57592, the predicted Market Potential = 13.03.
```
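Because the `revenue` data is not included in this post, the same workflow can be sketched with R's built-in `cars` dataset. This is an assumption made purely for illustration; it is not the data used above. The sketch fits a one-variable model and checks that `predict()` returns the intercept plus the slope times the new value.

```r
# Illustrative only: uses the built-in `cars` data, not the `revenue` data above.
fit <- lm(dist ~ speed, data = cars)

intercept <- unname(coef(fit)[1])
slope     <- unname(coef(fit)[2])

# Predict stopping distance for a hypothetical new speed value.
new_data  <- data.frame(speed = 15)
predicted <- unname(predict(fit, new_data))

# A simple linear regression prediction is intercept + slope * x.
manual <- intercept + slope * new_data$speed

cat("predict() and the manual formula agree:",
    isTRUE(all.equal(predicted, manual)), "\n")
```

Computing the prediction by hand is only a sanity check; in practice `predict()` is the less error-prone route because it handles the coefficient bookkeeping.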

### Predictive Analytics Example 2: Logistic Regression

Logistic regression predicts the probability of a binary (yes or no) outcome from an explanatory variable. For example, it can model the probability of winning a congressional election as a function of campaign expenditures.

How does the amount of money spent on a campaign affect the probability that the candidate will win the election?

```r
Expenditures <- c(1000000, 1100000, 1200000, 1300000, 1400000,
                  1500000, 1600000, 1700000, 1800000, 1900000,
                  2000000, 2100000, 2200000, 2300000, 2400000,
                  2000000, 2100000, 2200000, 2300000, 2400000)
ElectionResult <- c(0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
                    1, 1, 1, 1, 1)
CampaignCosts <- data.frame(Expenditures, ElectionResult)
```

The logistic regression analysis gives the following output:

```r
model <- glm(ElectionResult ~., family=binomial(link='logit'), data=CampaignCosts)
model$coefficients
##   (Intercept)  Expenditures 
## -7.615054e+00  4.098080e-06
```

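The positive Expenditures coefficient indicates that higher campaign expenditures are associated with a higher probability of winning the election. (The coefficients alone do not establish statistical significance; that requires the standard errors and p-values reported by `summary(model)`.) The output provides the coefficients Intercept = -7.615054e+00 and Expenditures = 4.098080e-06. These coefficients are entered into the logistic regression equation to estimate the probability of winning the election: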

Probability of winning election = 1/(1+exp(-(-7.615054e+00+4.098080e-06*CampaignExpenses)))

For a Candidate that has \$1,600,000 in expenditures:

```r
CampaignExpenses <- 1600000
ProbabilityOfWinningElection <- 1/(1+exp(-(-7.615054e+00+4.098080e-06*CampaignExpenses)))
cat("Probability of winning Election = 1/(1+exp(-(-7.615054e+00+4.098080e-06*",
    CampaignExpenses, "))) = ", round(ProbabilityOfWinningElection, 2), ".", sep="")
## Probability of winning Election = 1/(1+exp(-(-7.615054e+00+4.098080e-06*1600000))) = 0.26.
```

For a Candidate that has \$2,100,000 in expenditures:

```r
CampaignExpenses <- 2100000
ProbabilityOfWinningElection <- 1/(1+exp(-(-7.615054e+00+4.098080e-06*CampaignExpenses)))
cat("Probability of winning Election = 1/(1+exp(-(-7.615054e+00+4.098080e-06*",
    CampaignExpenses, "))) = ", round(ProbabilityOfWinningElection, 2), ".", sep="")
## Probability of winning Election = 1/(1+exp(-(-7.615054e+00+4.098080e-06*2100000))) = 0.73.
```
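The two calculations above repeat the same formula, so it can be wrapped in a small helper. The function name `win_probability` is hypothetical and introduced only for this sketch. The coefficients are copied from the output above, and the result is compared with base R's `plogis()`, which computes the same logistic function.

```r
# Coefficients copied from the glm output above.
b0 <- -7.615054e+00
b1 <- 4.098080e-06

# Hypothetical helper: logistic function applied to the linear predictor.
win_probability <- function(expenses) {
  1 / (1 + exp(-(b0 + b1 * expenses)))
}

# The function is vectorized, so several budgets can be evaluated at once.
probs <- round(win_probability(c(1600000, 2100000)), 2)
print(probs)  # 0.26 0.73, matching the two results above

# plogis() gives the same probability for the same linear predictor.
same_as_plogis <- isTRUE(all.equal(win_probability(1600000),
                                   plogis(b0 + b1 * 1600000)))
```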

### Example 2 — Logistic Regression Conclusion:

```r
ElectionWinTable <- data.frame(
  column1 = c(1100000, 1400000, 1700000, 1900000, 2300000),
  column2 = c(round(1/(1+exp(-(-7.615054e+00+4.098080e-06*1100000))), 2),
              round(1/(1+exp(-(-7.615054e+00+4.098080e-06*1400000))), 2),
              round(1/(1+exp(-(-7.615054e+00+4.098080e-06*1700000))), 2),
              round(1/(1+exp(-(-7.615054e+00+4.098080e-06*1900000))), 2),
              round(1/(1+exp(-(-7.615054e+00+4.098080e-06*2300000))), 2)))
names(ElectionWinTable) <- c("Campaign Expenses", "Probability of Winning Election")
```
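As an alternative to typing the coefficients into the formula by hand, the fitted model can produce the probabilities directly. The sketch below refits the model on the same data, then uses `predict()` with `type = "response"`. It also prints the coefficient table from `summary()`, which includes the p-values needed to judge significance. The names `fit` and `new_campaigns` are assumptions made for this example.

```r
# Same data as above, written with seq() for brevity.
Expenditures <- c(seq(1000000, 2400000, by = 100000),
                  seq(2000000, 2400000, by = 100000))
ElectionResult <- c(0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
                    1, 1, 1, 1, 1)
CampaignCosts <- data.frame(Expenditures, ElectionResult)

fit <- glm(ElectionResult ~ Expenditures,
           family = binomial(link = "logit"), data = CampaignCosts)

# Estimates, standard errors, z values, and p-values ("Pr(>|z|)" column).
coef_table <- summary(fit)$coefficients
print(coef_table)

# Predicted win probabilities for the five budgets in the table above.
new_campaigns <- data.frame(
  Expenditures = c(1100000, 1400000, 1700000, 1900000, 2300000))
new_campaigns$Probability <- round(
  predict(fit, newdata = new_campaigns, type = "response"), 2)
print(new_campaigns)
```

Using `predict()` keeps the probabilities in sync with the model if the data changes, whereas hard-coded coefficients have to be updated manually.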

### Predictive Analytics Example 3: Multiple Regression

Multiple regression predicts the value of a response variable from the values of several explanatory variables. The example below uses R's built-in `state.x77` dataset.

```r
input <- data.frame(state.x77[,1:4])
colnames(input) <- c("Population", "Income", "Illiteracy", "Life_Exp")

# Create the relationship model.
model <- lm(Life_Exp~Population+Income+Illiteracy, data=input)

# Show the model.
print(model)
## 
## Call:
## lm(formula = Life_Exp ~ Population + Income + Illiteracy, data = input)
## 
## Coefficients:
## (Intercept)   Population       Income   Illiteracy  
##   7.120e+01   -1.024e-05    2.477e-04   -1.179e+00

a <- coef(model)[1]
cat("The Multiple Regression Intercept = ", a, ".", sep="")
## The Multiple Regression Intercept = 71.2023.

# Collect the slope coefficients in a one-row table.
XPopulation <- coef(model)[2]
XIncome <- coef(model)[3]
XIlliteracy <- coef(model)[4]
modelCoef <- data.frame(XPopulation, XIncome, XIlliteracy)
colnames(modelCoef) <- c("Population", "Income", "Illiteracy")
row.names(modelCoef) <- c("Coefficients")
```

### Multiple Regression Conclusion:

```r
popl <- 3100
Incm <- 5348
Illt <- 1.1

# Intercept plus each coefficient multiplied by its input value.
Y <- a + popl * XPopulation + Incm * XIncome + Illt * XIlliteracy
cat("For a City where Population = ", popl, ", Income = ", Incm,
    ", and Illiteracy = ", Illt, ",\nthe predicted Life Expectancy is: ",
    round(Y, 2), ".", sep="")
## For a City where Population = 3100, Income = 5348, and Illiteracy = 1.1,
## the predicted Life Expectancy is: 71.2.
```
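The manual calculation above can be cross-checked against `predict()`. The sketch below refits the same model on `state.x77` and confirms that both approaches give the same value. The names `fit` and `new_area` are assumptions made for this example.

```r
input <- data.frame(state.x77[, 1:4])
colnames(input) <- c("Population", "Income", "Illiteracy", "Life_Exp")
fit <- lm(Life_Exp ~ Population + Income + Illiteracy, data = input)

# New observation with the same inputs used above.
new_area  <- data.frame(Population = 3100, Income = 5348, Illiteracy = 1.1)
predicted <- unname(predict(fit, newdata = new_area))

# Manual version: intercept plus each coefficient times its input value.
manual <- sum(coef(fit) * c(1, 3100, 5348, 1.1))

cat("Predicted Life Expectancy:", round(predicted, 2), "\n")
cat("Matches manual calculation:", isTRUE(all.equal(predicted, manual)), "\n")
```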

In conclusion to ContextBase Predictive Analytics Example 3, the variables “Population”, “Income”, and “Illiteracy” were used to predict the “Life Expectancy” of an area comparable to a US state. In the `state.x77` dataset, Population is recorded in thousands, Income is per capita income in dollars, and Illiteracy is a percentage of the population. For a Population of 3100 (3.1 million people), a per capita Income of 5348, and an Illiteracy rate of 1.1 percent, the model predicts a Life Expectancy of 71.2 years.
