Data Science 101: Is Python better than R?

Enoch Kan

Thank you for sharing.

I recommend that you use `data.table::fread` instead of `read.csv` in R.

I also think your R bootstrap code can be sped up significantly by vectorizing it, as below.

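    boot <- function(n) {
      # Relies on fit.mod, errors, yhat, and x from the article's code.
      b1 <- numeric(n)
      b1[1] <- coef(fit.mod)[2]  # slope from the original fit
      # Each column is one resample of the residuals, drawn with replacement.
      samples <- replicate(n - 1, sample(errors, replace = TRUE))
      yboots <- samples + yhat   # yhat is recycled down each column
      # lm() accepts a matrix response and fits every column in one call;
      # row 2 of the coefficient matrix holds the slopes.
      b1[2:n] <- coef(lm(yboots ~ x))[2, ]
      return(b1)
    }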

The equivalent Python code takes 37 seconds to run, while the original R code takes 80+ seconds. My modified R code takes just 0.8 seconds.

Don't underestimate R’s potential before you have vectorized the computation.
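
For anyone who wants to try the vectorized approach without the article's data, here is a minimal self-contained sketch. The simulated data and the helper name `boot_slopes` are my own assumptions for illustration and do not come from the original post. It only shows the general idea of a residual bootstrap fitted in a single `lm()` call.

```r
# Minimal sketch: residual bootstrap of a regression slope, vectorized.
# The simulated data and the name boot_slopes are illustrative assumptions.
set.seed(1)
x <- rnorm(200)
y <- 1 + 2 * x + rnorm(200)

boot_slopes <- function(x, y, n) {
  fit <- lm(y ~ x)
  yhat <- fitted(fit)
  errors <- residuals(fit)
  # One column per resample of the residuals, drawn with replacement.
  samples <- replicate(n, sample(errors, replace = TRUE))
  # yhat is recycled down each column, giving n bootstrap responses.
  yboots <- samples + yhat
  # A matrix response makes lm() fit every column in a single call.
  coef(lm(yboots ~ x))[2, ]
}

slopes <- boot_slopes(x, y, 1000)
length(slopes)  # one slope per bootstrap sample
sd(slopes)      # bootstrap estimate of the slope's standard error
```

The speedup comes from calling `lm()` once on a matrix of responses instead of once per bootstrap sample, so the design matrix for `x` is only decomposed a single time.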