DataScience for Developers: Build your first predictive model with R
David Salgado

How does this scale up?

I worked for some years at a company that did predictive modeling, basically using a highly sophisticated simultaneous-equation solver that was originally proprietary and written by MIT profs. Later, as support for that solver dwindled, they moved to a commercial solution.

The thing is, those models had up to 2 million variables and half a million equations, and could still be optimized in an hour or two on modern hardware (and that is without any parallelism).

Would there even be a way to model such a thing and solve it with any of the methods suggested here? If so, I’m sure there are gotchas in a model that large. What would they be?