Stephen Fancsali
1 min read · Sep 30, 2016


You raise important questions about the general problem of inferring causal relationships using observable measures as proxies (e.g., IQ tests) for unobserved phenomena (e.g., “real intelligence”), or, more generally, when you’ve constructed/encoded/engineered some B’ to represent B (per your example).

This is perhaps an underappreciated topic, but there has been some work on the issue. The general question is: what conditions need to hold between something like a psychometric scale (B’) and the underlying phenomenon of interest (B) to preserve the conditional independence relationships used for causal inference when B “blocks” or “screens off” two other variables (but B’ may not)? I scratched the surface of this and related topics about engineering/constructing features from complex, raw datasets in my graduate work at Carnegie Mellon (shameless self-promotion, my dissertation can be found here: http://repository.cmu.edu/cgi/viewcontent.cgi?article=1398&context=dissertations).

Chapter 2 (Section 2) summarizes some early work I did on psychometric scales, traditional approaches to measuring the reliability of those scales, and how all of that relates to causal inference (and the preservation of conditional independence). Other parts of the work may be of interest, too, given that you’ve raised a lot of the questions that got me interested in this very topic.
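
To make the concern concrete, here is a minimal simulation sketch (my own illustration, not taken from the dissertation; variable names and coefficients are arbitrary): in a chain A -> B -> C, conditioning on B renders A and C independent, but conditioning on a noisy proxy B’ of B does not.

```python
# Minimal sketch (assumed linear-Gaussian chain, arbitrary coefficients):
# B screens off A from C, but a noisy proxy B' of B generally does not.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

A = rng.normal(size=n)
B = 0.8 * A + rng.normal(size=n)             # A -> B
C = 0.8 * B + rng.normal(size=n)             # B -> C, so A is independent of C given B
B_proxy = B + rng.normal(scale=1.0, size=n)  # B' = noisy measurement of B

def partial_corr(x, y, z):
    """Correlation of x and y after linearly regressing out z."""
    Z = np.column_stack([np.ones_like(z), z])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]

print(partial_corr(A, C, B))        # ~0: conditioning on B screens off A from C
print(partial_corr(A, C, B_proxy))  # clearly nonzero: the proxy B' does not
```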

Another option for psychometric and other settings is to explicitly model latent variables within the causal graph and learn structure among the latent variables; this raises important issues about the measurement model that relates observed measures to latent variables. For work on this, see http://www.jmlr.org/papers/volume7/silva06a/silva06a.pdf
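
As a rough illustration of that measurement-model setup (a toy data-generating sketch of my own, not the algorithm from the Silva et al. paper): the latent variables carry the causal structure, each is observed only through several noisy indicators, and structure-learning methods for this setting try to recover the latent-level edges from the indicators alone.

```python
# Toy sketch (my own names and coefficients): two latents with an edge L1 -> L2,
# each measured by three noisy indicators; a structure learner would see only `data`.
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

L1 = rng.normal(size=n)
L2 = 0.7 * L1 + rng.normal(size=n)   # structural model among the latent variables

def indicators(latent, loadings, noise=0.5):
    """Measurement model: each observed column loads on the latent plus noise."""
    return np.column_stack(
        [lam * latent + rng.normal(scale=noise, size=latent.size) for lam in loadings]
    )

X = indicators(L1, [0.9, 0.8, 0.7])  # observed measures of L1
Y = indicators(L2, [0.9, 0.8, 0.7])  # observed measures of L2
data = np.column_stack([X, Y])       # the only input available to structure learning
```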

There is other work out there, but hopefully these pointers are a helpful start.

Keep up the good work!
