Katherine Chaffey
Psyc 406–2016
Feb 2, 2016


Assessing the validity of diagnostic tools: what is being tested?

The reliability of many psychiatric diagnoses, and of the tests used to make them, has been widely discussed; this theoretical critique, however, seems strangely disconnected from how the measures themselves are assessed. Many of the proposed solutions overlook or minimize one of the problems with this 'reliability': one of the only ways to evaluate the validity of a psychological test is to compare it against previously existing, and potentially flawed, measures.

For example, in Abramowitz et al.'s assessment of the Dimensional Obsessive-Compulsive Scale (DOCS), the authors described a desire to build a better diagnostic tool: one that reduced certain kinds of over-specificity in the questions while reshaping the variables around what their review deemed most significant. Broadly, the steps in designing such a measure seem logical: initial tests are built to capture various aspects of the DSM-5's diagnostic criteria, and once data on how these tests perform are compiled, the tests can be refined and new ones created.
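To make the "dimensional" part of this concrete, here is a minimal Python sketch of how such a scale gets scored. The four dimension labels follow the published DOCS subscales, but the item-to-dimension mapping, the responses, and the scoring function are hypothetical stand-ins, not the actual DOCS scoring procedure.

```python
# Sketch of dimensional scoring for a DOCS-style questionnaire.
# Dimension names follow the published DOCS subscales; the item
# assignments and the respondent's ratings are fabricated.

DIMENSIONS = {
    "contamination": [0, 1, 2, 3, 4],
    "responsibility_for_harm": [5, 6, 7, 8, 9],
    "unacceptable_thoughts": [10, 11, 12, 13, 14],
    "symmetry_completeness": [15, 16, 17, 18, 19],
}

def score(responses):
    """Sum the 0-4 item ratings within each symptom dimension."""
    return {dim: sum(responses[i] for i in items)
            for dim, items in DIMENSIONS.items()}

# One (fabricated) respondent: twenty ratings on a 0-4 scale.
responses = [3, 2, 4, 3, 2, 0, 1, 0, 0, 1,
             1, 0, 2, 1, 0, 0, 1, 0, 0, 0]
print(score(responses))  # per-dimension totals, e.g. contamination = 14
```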

However, when evaluating the validity of this test, they chose to compare it to previous tests and see how well it correlated with them. Ideally there would be some overlap, but isn't the key evaluating the differences? Why should the only method of assessing validity depend on tests already deemed unreliable enough to justify building a new self-report questionnaire in the first place?
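The validation step being criticized here is, in practice, a simple correlation of total scores. A minimal sketch, with fabricated score vectors, shows why a high correlation proves less than it appears to:

```python
import numpy as np

# Convergent-validity check: correlate totals on a new scale with
# totals on an established measure. Both vectors are fabricated.
rng = np.random.default_rng(0)
old_scale = rng.integers(0, 40, size=100)           # established measure
new_scale = old_scale + rng.normal(0, 5, size=100)  # new scale tracking it

r = np.corrcoef(old_scale, new_scale)[0, 1]
print(f"convergent validity r = {r:.2f}")
# A high r is read as evidence of validity, yet it only shows agreement
# with the older, potentially flawed, measure. The items on which the
# two scales *disagree* never get examined.
```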

This once again returns to a fundamental issue with most psychiatric diagnoses: they are symptom-based, but the symptom clusters are frequently up for debate. For now, the test most "accurate" relative to the DSM-5 would be one that takes every listed criterion into account. Yet it is difficult to verify the relationship between questionnaire results and the disorders themselves while the symptomatology of a disorder is still being worked out.

The current system seems to run on a degree of circular logic. An established scale may be highly concordant with itself, and therefore more reliable, yet drift from the diagnostic symptomatology; meanwhile, a test whose questions map neatly onto the DSM-5 would be more compatible with the initial diagnosis, but less internally consistent and potentially less useful in treatment. The danger with the former is regress: a test validated by its concordance with a previous test, which was itself validated by concordance with still earlier tests, and so on. In the end, one could develop a measure far from the initial goal, though such a measure might still help clarify the symptomatology if it narrows certain aspects further than the official diagnosis does.
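Internal consistency, the quality a self-concordant scale maximizes, is conventionally quantified with Cronbach's alpha. A minimal sketch, using fabricated response data, shows what "more concordant with itself" means numerically:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, n_items) response matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()  # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)    # variance of total scores
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Fabricated data: five items answered by eight respondents.
responses = np.array([
    [3, 3, 2, 3, 3],
    [1, 1, 0, 1, 2],
    [4, 3, 4, 4, 3],
    [2, 2, 1, 2, 2],
    [0, 1, 0, 0, 1],
    [3, 4, 3, 3, 4],
    [2, 1, 2, 2, 1],
    [4, 4, 3, 4, 4],
])
print(f"alpha = {cronbach_alpha(responses):.2f}")
# High alpha means the items move together. Nothing in the number says
# the items track the disorder rather than each other.
```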

If variables prove discordant, they can be omitted from the test even though they are classified as part of the disorder, allowing the diagnosis to evolve. But if one's main method of assessing the external validity of a newly constructed measure is comparison with previous diagnostic tools, one must look more carefully at the key question: what is really being tested?
