A Lot of People, Myself Included, Have Been Misreading the Single Biggest Published Study on Childhood Gender Dysphoria Desistance and Persistence — It Offers Stronger Evidence for Desistance Than We Thought

Mar 28, 2018

(Update, 3/28/2018: Changed headline from “Everyone” to “A Lot Of People” — on Twitter, some folks pointed out that I phrased that too strongly since there, of course, had been some people who carefully read the study in a way I didn’t at first, such as James Cantor here. URL is locked as-is unfortunately. I also made a few minor next-day tweaks to make certain language hedgier and less certain, but nothing that changes the gist and conclusions of this post.)

This is going to be a very nerdy post that will be indecipherable to anyone not at least knee-deep in the debate over childhood gender dysphoria persistence and desistance, so I’ll just jump right to the point: Just about everyone, myself included, has been misreading the single biggest published study on this subject in an important way.

In 2013, a team led by Thomas Steensma of the famed Center of Expertise on Gender Dysphoria at the VU University Medical Center Amsterdam, aka “The Dutch Clinic,” published a followup study of 127 patients who had attended the clinic as children. In the study, published in the Journal of the American Academy of Child and Adolescent Psychiatry, the authors explain that “As the Amsterdam clinic is the only gender identity service in the Netherlands where psychological and medical treatment is offered to adolescents with GD [gender dysphoria], we assumed that for the 80 adolescents (56 boys and 24 girls), who did not return to the clinic, that their GD had desisted, and that they no longer had a desire for gender reassignment.”

A plain interpretation of this sentence suggests the authors hadn’t been able to check with the 80 kids in question as to how gender dysphoric they were at followup—why else would they say they assumed their dysphoria had desisted? Writing on New York’s website, I echoed this interpretation and argued that “[t]his isn’t a bulletproof assumption, of course — maybe some of those patients moved to another country, or something — but every research article involves approximations, and it would be hard to come up with a storyline in which this group had enough persisters in it to nudge the overall numbers all that much.”

Others, particularly those who believe the concept of desistance poses a threat to trans kids and therefore must be undermined, disagreed. They used that assumption made by Steensma and his team as justification to both discredit this study and to hack away at the entire idea — an understandable strategy given that this appears to be both the largest and most recent study on the subject ever published (Devita Singh’s 2012 dissertation, while a very useful piece of a complicated puzzle that actually has a larger sample size, was never published).

Here’s Zack Ford, for example, in a ThinkProgress article with the headline “The pernicious junk science stalking trans kids” (the stalker in question is desistance):

A 2013 follow-up study led by Steensma, the most recent in the desistance pantheon, made the same assumption when 80 of its 127 participants did not return to the Amsterdam clinic.
“You don’t see that in any study,” [Colt] Keo-Meier said of drawing conclusions about participants who don’t actually complete the study. “I don’t have any idea how that got published! You just drop those people from your total percentages.”
[Michele] Angello and [Alisa] Bowman explain why this assumption is so misguided:

They might have switched doctors, moved, or worse, committed suicide. Also, it’s common for transgender people to express their true gender, face an abundance of ridicule and harassment, and then repress it.

It’s the equivalent of a dentist who assumes that if patients stop coming back, that means that they’re no longer getting cavities.

Other bloggers and writers followed suit — there’s a segment of participants in the youth-dysphoria debate for whom this study is viewed as completely worthless as a result of that admittedly sizable-sounding 80-person assumption.

The problem is, Angello, Bowman, Keo-Meier, Ford, and, unfortunately, myself… we’re all wrong. Completely wrong. Steensma and his colleagues never simply assumed those 80 kids had desisted — they got in touch with most of them, and, true to that ‘assumption,’ they weren’t dysphoric.

Now, we could be forgiven for thinking they simply assumed those 80 kids were desisters, since that paragraph above really is written in a confusing way. But if you read the study closely (always read the study closely!), it’s clear this isn’t what happened. Here’s what’s in the very next paragraph: “All 47 persisters participated in the study. Of the 80 desisters, 46 adolescents sent back the questionnaires (57.5%) and 6 (7.5%) adolescents refused to participate, but allowed their parents to fill out the parent questionnaires. Twenty-eight adolescents were classified as nonresponders: 12 (15%) did not send back the questionnaires despite follow-up contacts, another 12 (15.0%) were untraceable. In 4 cases (5.0%), the adolescents and the parents indicated that the GD from the past remitted, but these individuals refused to participate.”

One paragraph they’re saying they assumed what happened to these kids, the next they’re saying they got data from them. Whuzzah? This all gets less confusing if you just take “persisters,” in this usage, to mean “kids who kept coming to the clinic” and “desisters” to mean “kids who stopped coming to the clinic.” It is confusing phrasing but in this sentence, and elsewhere in the paper, that’s basically what they mean. So what they’re saying here is that they were able to get in touch with 52 of the adolescents or their parents and get followup data from them — including, quite usefully for our purposes, measures of their gender dysphoria. Plus, four others didn’t want to participate but did say that their or their kids’ dysphoria had “remitted” — or desisted, if you like.
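For anyone who wants to check the arithmetic in that quoted paragraph, here is a minimal Python sketch. The counts come straight from the quote; the dictionary labels and variable names are my own shorthand, not the study's terminology.

```python
# Counts quoted from the study's participation paragraph.
# All are out of the 80 kids who did not return to the clinic.
# The labels below are illustrative shorthand, not the study's wording.
TOTAL_DID_NOT_RETURN = 80

groups = {
    "adolescent_questionnaires": 46,   # adolescents sent back questionnaires
    "parent_only_questionnaires": 6,   # adolescent declined, parents filled out forms
    "no_reply": 12,                    # no questionnaires despite follow-up contacts
    "untraceable": 12,                 # could not be located
    "remitted_but_declined": 4,        # said GD had remitted, refused to participate
}


def share(count, total=TOTAL_DID_NOT_RETURN):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {label: share(count) for label, count in groups.items()}

# Kids with followup questionnaire data (their own or their parents').
with_followup_data = (
    groups["adolescent_questionnaires"] + groups["parent_only_questionnaires"]
)

# The study's "nonresponders" category.
nonresponders = (
    groups["no_reply"] + groups["untraceable"] + groups["remitted_but_declined"]
)

# Nonresponders who gave no information at all.
silent = nonresponders - groups["remitted_but_declined"]

for label, pct in shares.items():
    print(f"{label}: {groups[label]} ({pct}%)")
print("with followup data:", with_followup_data)
print("nonresponders:", nonresponders, "of whom silent:", silent)
```

The five groups add up to the full 80, and the percentages match the ones printed in the study, which is a decent sanity check that the paragraph really is describing one sample split five ways.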

This was a dumb error on my part, to report that the researchers simply assumed those 80 kids had desisted — I think I took other people’s critiques at face value rather than read the study as closely as I should have. Those critiques, while understandable in light of that confusing sentence, were false.

Hell, there’s even a chart that makes it clear the authors heard back from all but 28 of their participants, one way or another:

As they say on Twitter, I’ll try to do better next time.

Anyway, so what did the 52 kids who stopped showing up at the clinic and their parents report to the clinicians once they were contacted?

Here we go:

“Adolescents’ reports of GD and body image were compared across persisters and desisters (Table 4), and showed that persisters reported more GD than desisters in the mean total scores of both the GIIAA and the UGDS [two measures of gender dysphoria]. Clinically, for the GIIAA, scores of less than 3 indicate GD; 87.2% of the persisters met the criterion compared to 0% of the desisters. For the UGDS, scores of more than 40.0 indicate GD (Steensma, Kreukels, Jurgensen, Thyen, de Vries and Cohen-Kettenis, unpublished material, 2013); 97.9% of the persisters met the criterion compared to 2.2% of the desisters (1 bisexual, natal girl).” [bolding mine]

None of the kids who stopped coming to the clinic and whom the authors were able to reach had clinically significant gender dysphoria at followup, as judged by one measure (and just one did, as judged by the other). Zero for 52, or zero for 56 if you include those four partial-responders.

Now, it’s important to note that this group was less dysphoric at intake. Of the total 80 kids in the sample who stopped coming, 39.3% of boys and 58.3% of girls met the criteria for what used to be called Gender Identity Disorder, or GID. That’s way lower than the corresponding percentages for the kids who stayed in touch with the clinic as they grew up — 91.3% and 95.% percent, respectively. This shouldn’t surprise us! It makes perfect sense that the more gender dysphoric a kid is, the more likely they are to maintain regular contact with a gender identity clinic and to seek out its services. The kids who are there because Dad freaked out and overreacted when little — [does a Google search for the most common Dutch boys’ names] — Daan put on Mom’s dress once, but is a happy and healthy and non-dysphoric kid, aren’t going to keep coming.

It clarifies things greatly to realize so many of us done goofed on this. It paints a very different picture of this study. Particularly the “responders” column, which accounts for more than half of the 80 didn’t-come-back kids: about half of the boys and two-thirds of the girls were dysphoric enough to meet GID criteria at intake, and either none or one of them had clinically significant gender dysphoria at followup. (The “parents” column isn’t as interesting because none of the six kids met the GID criteria — sure, none of them had gender dysphoria at followup, but they’re not true desisters.)

As for the 24 nonresponders who didn’t provide any information (remember that 4 did just pipe up to say “nope, not dysphoric”), could some still be dysphoric? I… guess so? But given that the kids who didn’t come back but who did respond had about the same rates of meeting the clinical threshold for GID as the nonresponders, and none or one of them had clinically significant GD at followup, why would you assume that? What the clinicians proposed seems like a much safer assumption: They stopped coming because they are no longer suffering from gender dysphoria. Even if a few of the nonresponders still have gender dysphoria, it won’t affect the stats that much, anyway.
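To put a rough number on "won't affect the stats that much," here is a small what-if sketch in Python. It uses only the counts quoted from the study (127 kids total, 80 who didn't come back, 24 silent nonresponders). The function name and the hypothetical values of `k` are mine, purely for illustration, and nothing here is an estimate of how many nonresponders actually remained dysphoric.

```python
# What-if arithmetic: how the overall share of kids counted as desisters
# would move if some number of the silent nonresponders were reclassified.
# The three constants come from the study as quoted in this post;
# the k values tried below are hypothetical.
TOTAL_COHORT = 127
DID_NOT_RETURN = 80
SILENT_NONRESPONDERS = 24


def desistance_rate(hypothetical_persisters):
    """Percent of the cohort counted as desisters if this many silent
    nonresponders were instead counted as persisters."""
    if not 0 <= hypothetical_persisters <= SILENT_NONRESPONDERS:
        raise ValueError("must be between 0 and the number of silent nonresponders")
    desisters = DID_NOT_RETURN - hypothetical_persisters
    return round(100 * desisters / TOTAL_COHORT, 1)


for k in (0, 3, 6, 12, 24):
    print(f"{k:>2} reclassified -> {desistance_rate(k)}% counted as desisters")
```

Under these assumptions, reclassifying a few nonresponders moves the headline figure by a couple of percentage points. It takes reclassifying every single silent nonresponder, which nothing in the responder data suggests, to pull the figure below half.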

Overall, I do think that commonly cited 80% figure might be too high an estimate for the rate of desistance from true childhood gender dysphoria. And of course the desistance-deniers will come back and claim that the GID criteria were so loose you can’t assume any of the kids who met them were gender dysphoric, which I think is silly given how those criteria actually read, but which is convenient since the only long-term studies we have on this are from the GID era.

Set aside all the noise, though, and the usual caveats about not overextrapolating from one study: The key question is whether for a significant percentage of kids, gender dysphoria abates in time. All the available evidence we have, limited though it may be, suggests it does, and reinterpreting Steensma (2013) in a more accurate light makes the case a bit stronger. WPATH and the American Psychological Association and the Substance Abuse and Mental Health Services Administration and the Endocrine Society all recognize that desistance occurs, and I have never spoken with a clinician who believes it to be a full-stop myth. There is good reason for this expert consensus.

All that said, it’s worth keeping in mind that this selfsame study also provides the intriguing nugget that strength of gender dysphoria at time of assessment predicted persistence in the long run. Which is one reason among many why the concept of desistance, on its own, should never be used as a justification for taking or recommending one course of action or another with a given gender dysphoric kid — as numerous clinicians have told me in my reporting on this endlessly fascinating and fraught subject, it’s vital to take an individualized approach, to get to know a kid and see where they’re at and what the sources of their distress are. Desistance is just one part of the puzzle. We shouldn’t ignore it, though, and it makes no sense to claim it’s a ‘myth.’

Written by Jesse Singal