Comment of the Day: Captain Domestic: “I’m biased as a MathWorks employee, but you may want to look into MATLAB…

…It is really strong in the kinds of data analysis and plotting that econ students need to do. MATLAB has a pretty non-programmer friendly editor and model that helps new users.

Jeff Dutky: “By all means steer your students away from spreadsheets…

…and toward more traditional programming languages (both R and MATLAB count as traditional programming languages here) that can be better documented, tested, and verified than a spreadsheet can. Building spreadsheets is akin to programming in assembly language: too little abstraction to be comprehensible on anything but toy problems.

Jorgensen: “Spreadsheets are a wonderful tool for simple mathematical and financial analysis…

I use spreadsheets regularly for small (<1,000 records) simple ad hoc flat file databases — and for that too they are a wonderful tool. The problems seem to come when people who are not really mathematically sophisticated build models in a spreadsheet that then grow in an undisciplined fashion to something that is unmanageable. I think that anyone going into economics or business should know how to use a spreadsheet but they need to be told of its limits and the need to move to more sophisticated tools for larger, more complicated, models.
JP Morgan lost $2 billion in the “London Whale” incident because of a programming error in a spreadsheet that had grown too large and complex.

Phil Koop: “It may be that you ought to have your students…

…do problem sets in R instead of Excel; but Lisa Pollack’s observation, fine as it is, has no bearing on the matter.
I develop software for a living. Software that does abstract statistical modeling, complex data transformation and presentation. I don’t use R for this but I sure as heck don’t use Excel. Nevertheless, I use Excel all the time for small side calculations as I work. Sometimes I might use a tool like R or Matlab to prototype a model. R can also be useful for vetting a model by duplicating a statistical algorithm cheaply. But often Excel is the best tool for model vetting. One reason for this is that it is often the best way to communicate an algorithm or clearly demonstrate one. This is only partly because of lingua franca; although systems implemented in spreadsheets are generally opaque, a spreadsheet is often the most transparent way to implement a single clearly defined algorithm.
So whether R or Excel is the best tool depends on the nature of the problems you set. Of course, if your goal is for your students to learn R, problems for which R is a poor tool may nevertheless be a good means to that end, just as a “hello world” program is a good way of learning certain structural aspects of a programming system, even though not the best way of printing the words “hello world.” (August 12, 2015 at 02:42 PM)

Best Practices: “One of the standard practices in a good software development life cycle…

…is unit testing. There are unit testing frameworks for both R and Excel. Students should be expected to use these to show that their solutions produce correct outputs for a standard set of inputs relevant to the problem, regardless of the language used. See: https://en.wikipedia.org/wiki/Unit_testing and http://c2.com/cgi/wiki?TestDrivenDevelopment. Unit testing framework for R: http://sourceforge.net/projects/runit/ or Excel: http://www.sourcecodeonline.com/code/vba_tic_tac_toe-3.html
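As a rough illustration of what "correct outputs for a standard set of inputs" could look like in practice, here is a minimal sketch in Python (not one of the frameworks linked above) using the standard library's unittest module. The function present_value and its test inputs are hypothetical, chosen only to stand in for a typical problem-set calculation.

```python
import unittest


def present_value(cash_flows, rate):
    """Discount cash flows received at t = 0, 1, 2, ... back to t = 0.

    Hypothetical problem-set function, used only to have something to test.
    """
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


class PresentValueTest(unittest.TestCase):
    """Standard inputs with answers that can be checked by hand."""

    def test_zero_rate_is_plain_sum(self):
        self.assertAlmostEqual(present_value([100, 100, 100], 0.0), 300.0)

    def test_one_period_discount(self):
        # 110 received one period from now at 10% is worth 100 today.
        self.assertAlmostEqual(present_value([0, 110], 0.10), 100.0)

    def test_empty_stream_is_zero(self):
        self.assertEqual(present_value([], 0.05), 0)
```

Saved to a file, the tests can be run with `python -m unittest <filename>`. The same inputs and expected answers could be handed to students working in R or Excel, so the check does not depend on the tool.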

A: “That anything of importance depends on an Excel spreadsheet…

…calculation is really scary. (And do you really want to share Excel spreadsheets with macros enabled? May be a new route for spreading disasters.) — After programming in C++ I found MatLab easy to use (but it is commercial — requires you to pay). My daughter’s Biology (college) class had students draw some graphs with R, so that must be easy enough. — But I am surprised nobody mentioned Python https://www.python.org/ which comes with various scientific/plotting packages http://scipy.org/ and there is a nice (matLab-like) IDE called Spyder included e.g. in the ‘Anaconda’ distribution https://store.continuum.io/cshop/anaconda/ I had thought that students in quantitative majors nowadays all know Python. (I even remember Prof. DeLong mentioning it in some post.)

Derrida Derider: “I’d defend Excel…

…It’s got quite good auditing, testing and documenting tools these days — BUT YOU HAVE TO USE THEM. Also it can handle much more complex and larger data than a few years ago. Sure if they go on to do more Econ or other quantitative subjects they’ll need other tools, but it is an amazingly flexible and versatile thing. Plus for good or ill they will definitely encounter it in their future workplace. But do tell your students you’ll require evidence they’ve structured, tested and documented their work.

Ryan: “I tend to use Stata…

…but we have a site license. I think R or Matlab is good in theory, and certainly good for an advanced audience. The question in my mind is whether you can list familiarity with the software as a prereq. If so, go for it. If not, how much time are you willing to part with to support and train up to some competency? Or, how many TAs do you have?

gb: “There are certain things that Excel does really, really well…

…The foremost being allowing you to see what is happening at every step of your code much more easily than you can in other languages — that is the flip side of being hard to replicate. There are still things that I will do at a toy level (for a very small subset of my data) in Excel before coding in Stata so that I can check that my code is doing what I want it to do. And there are types of errors that are simply easier to catch in Excel than in more complex programs. E.g. in writing destringing code to get cities and states/countries out of a locational field in a data set, my friend/coauthor/classmate unknowingly attributed all entries for the British West Indies to Indiana, because both had abbreviations ending IND.
And this after we’d been very careful to make sure that Virginia and West Virginia were properly separated (data from before modern postal abbreviations). It was only because I had spent enough time looking at the data in Excel that we were able to catch that particular error… which was small but far from trivial in the scope of the data.
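The kind of error described above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the abbreviations, the "CITY, REGION" field layout, and the function names are hypothetical, since the comment does not give the actual data or the Stata code. The first function matches on how the field ends, with West Virginia deliberately checked before Virginia; the second looks up the whole region string and returns None for anything it does not recognize.

```python
# Hypothetical pre-postal abbreviations; the real data set's codes are not given.
SUFFIX_RULES = [
    ("W VA", "West Virginia"),  # checked first so it is not mistaken for "VA"
    ("VA", "Virginia"),
    ("IND", "Indiana"),
]

KNOWN_REGIONS = {
    "W VA": "West Virginia",
    "VA": "Virginia",
    "IND": "Indiana",
    "BR W IND": "British West Indies",
}


def region_by_suffix(location):
    """Error-prone rule: accept the first abbreviation the field ends with."""
    text = location.strip().upper()
    for abbrev, name in SUFFIX_RULES:
        if text.endswith(abbrev):
            return name
    return None


def region_by_exact_match(location):
    """Safer rule: look up everything after the last comma as a whole string."""
    _, _, tail = location.rpartition(",")
    return KNOWN_REGIONS.get(tail.strip().upper())  # None means "review by hand"


for raw in ["FORT WAYNE, IND", "KINGSTON, BR W IND", "WHEELING, W VA"]:
    print(raw, "|", region_by_suffix(raw), "|", region_by_exact_match(raw))
```

Printing the raw field next to both results is the scripted equivalent of scanning the column in Excel: the Kingston row shows Indiana under the suffix rule, which is the sort of mismatch that is easy to spot by eye and easy to miss in a log of summary statistics.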
I very rarely make errors when using Excel — not because I am inherently less prone to error than others, but because I build in exhaustive error-check mechanisms whenever possible. These days I mostly use Excel to organize data (and find OCR errors when copying/pasting from PDFs — it’s very good for that kind of thing) before importing to Stata, but working in Excel did teach me to visualize a number of things and to think through special cases. On multiple occasions I have seen other people who never worked in Excel make very sloppy errors in their Stata code that would be much more obviously wrong in Excel. I think that previous use of Excel helps me to avoid those errors when writing in Stata or Matlab myself.
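One way to carry a spreadsheet-style check cell over into code is a small reconciliation function that fails loudly when the parts do not add up. The sketch below is only an illustration of the idea: the function name check_total, the tolerance, and the national-accounts figures are all made up.

```python
def check_total(components, reported_total, tolerance=1e-6):
    """Return the discrepancy; raise if the parts do not add up to the total.

    Plays the role of a check cell that should always read zero.
    """
    discrepancy = sum(components.values()) - reported_total
    if abs(discrepancy) > tolerance:
        raise ValueError(
            f"components miss the reported total by {discrepancy:+.4f}"
        )
    return discrepancy


# Made-up figures, purely for illustration: C + I + G + NX should equal Y.
gdp_components = {"C": 68.0, "I": 17.0, "G": 18.0, "NX": -3.0}
check_total(gdp_components, reported_total=100.0)  # passes silently
```

Calling the same function with a total that does not reconcile, for example reported_total=101.0, raises a ValueError instead of letting the mismatch flow into later calculations.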
Yes, I almost exclusively use Stata for research, precisely because it is so easy to replicate and share documentation. But I think that early knowledge of Excel is a very good starting point if not a good ending point if your goal is to write precise, fully thought out code.
In terms of teaching, I’d suggest two things:
(1) When using Excel, teach students to organize and clearly label their spreadsheets (I give them models to download and work with for in-class exercises in intermediate macro specifically to give them models of organization), and build in checks whenever possible. At least half the problem with Excel is that people label poorly and then don’t put in checks.
(2) Use Stata and Matlab in upper division classes, but not for lower division classes. More of the students in upper div classes are going into research, anyway, and they will have an easier time understanding Stata and Matlab given previous experience with Excel. A lot of the students in lower div classes are not going into research, and Excel will be their hammer and their nail. In this case it is a major social good to teach them proper formatting and error checking in spreadsheets — they are going to be using them no matter what, so let us teach our students to use them correctly.
(2b) Stop teaching programs like gretl (used in Econ 140 at UCB) just because they’re free. It’s worth $30 in course fees to have students buy Small Stata for 6 mos. and teach them a program they might actually use (ditto for Matlab — Stata has been more appropriate to what I was teaching but the same principle applies). Or if you are going to teach using free software, for the love of all that is holy use R, which also has a wide audience. When I have taught with Stata in upper div classes, a lot of students expressed frustration that they hadn’t used Stata in their initial metrics classes, and they clearly saw gretl as a waste of time.