Barbra Streisand meets Data Colada: The story of what happened post-publication after I revisited a very popular study in bilingualism research

Evelina Leivada
15 min read · Nov 6, 2023


Disclaimer: In recent days, we have seen scholars suing other scholars, demanding no less than $25M (25 million; let that number sink in!) in damages, for expressing concerns about irregularities in the data, omissions, cherry-picking, and possible tampering. To be clear, I make no accusations against anyone, and I can only speak confidently about my own work. Now, if you are still tempted to settle any possible disagreement you may have with me in court, I feel obligated to disclose that one of my favorite childhood movies was A Few Good Men. As a child, I used to pose in front of the mirror, adopting Jack Nicholson’s stern expression as Colonel Jessup: “Do you want the truth? You can’t handle the truth. Son, I have a greater responsibility than you can possibly fathom.” So, if you sue me, know that I’m experienced, and I’ll likely deliver all my A Few Good Men lines in court in a very convincing way.

I first came across Data Colada this summer. Their first Data Falsificada post had just made Twitter/X explode, shaking the scientific community. Three more posts followed. In these posts, Uri Simonsohn, Leif Nelson, and Joe Simmons present and analyze evidence of possible data tampering in four retracted papers co-authored by award-winning Harvard Business School professor Francesca Gino (now on administrative leave, according to her Harvard webpage). This case and the type of evidence they adduced to support their claims made a huge impression on me; I started browsing all the older Data Colada posts, until I reached their very first one.

As fate would have it, around the same time, I was also compiling notes on the topic of publication biases in bilingualism research. My starting point was a very well-known and well-cited paper in the field, which I have also cited repeatedly in previous work: Cognitive advantage in bilingualism: An example of publication bias? (by Angela de Bruin, Barbara Treccani & Sergio Della Sala, published in Psychological Science, 2015, 26, 99–107). This paper claims that the notion of a bilingual advantage in executive control may stem from a publication bias that favors studies finding evidence in support of the bilingual advantage hypothesis.

To test this hypothesis, the three authors followed a clever strategy: they looked at conference abstracts from 1999 to 2012 on the topic of bilingualism and executive control and then determined which of these abstracts were subsequently published. According to their findings, studies with results fully supporting the bilingual advantage hypothesis were the most likely to be published, followed by studies with mixed results, while studies that challenged the bilingual advantage hypothesis were the least likely to be published. A funnel plot asymmetry provided additional evidence for the existence of said publication bias. Compelling, isn’t it? I never failed to cite this paper in my relevant publications.

Unsurprisingly, the paper generated a lot of interest: it was picked up by major news sites, policy documents, and Wikipedia pages, while its impact on the scientific community was no smaller (more than 635 citations according to Google Scholar, and a total of 11,389 views and downloads according to the journal’s metrics).

While browsing the Data Colada records, I kept thinking about the classification system that de Bruin, Treccani, & Della Sala (henceforth, BTS) employed. It seemed to me that it was not entirely fair to group (as BTS did) negative and null results into one mega-category called ‘results challenging the bilingual advantage hypothesis’ and then juxtapose the publication rates of this category with those of the category of positive results. Simply put, obtaining a result indistinguishable from zero (i.e., no difference between monolinguals and bilinguals in a given task) is not the same as obtaining a negative result (i.e., a bilingual disadvantage; monolinguals doing better than bilinguals in a given task). Then why group them together?

Despite being puzzled, I realized that this classification preference, although ‘biased’ in my eyes, may be perfectly reasonable for other scholars. I set the BTS paper aside and continued reading Data Colada. After all, I didn’t have much to say about the gist of the BTS paper: yes, if we altered the classification system, the main finding regarding the publication bias against results challenging the bilingual advantage hypothesis was weakened, but why alter it? Just because I think we should? And to what effect? The bias was there: the funnel plot asymmetry provided robust evidence for it. That was enough for me.

That’s when I came across Data Colada post [58], written by Uri Simonsohn. It bears the title “The Funnel Plot is Invalid Because of This Crazy Assumption: r(n,d)=0”. I started reading the post and quickly realized that it touches upon the very work I was reading at the time: the BTS paper. In short, the post takes issue with a critical assumption of BTS. In Simonsohn’s words, “the authors of this 2014 Psych Science paper concluded that publication bias is present in this literature in part based on how asymmetric the above funnel plot is (and in part on their analysis of publication outcomes of conference abstracts). The problem is that the predicted symmetry hinges on an assumption about how sample size is set: that there is no relationship between the effect size being studied, d, and the sample size used to study it, n. […] The assumption is false if researchers use larger samples to investigate effects that are harder to detect […]. It is also false if researchers simply adjust sample size of future studies based on how compelling the results were in past studies.”
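To see his point concretely, I later sketched a quick simulation for myself. The snippet below is purely illustrative (it is mine, not from the Data Colada post, and every number in it is hypothetical): it generates studies in which researchers run larger samples when they expect smaller effects, so that r(n, d) < 0, and in which every single study gets “published”:

```python
import numpy as np

rng = np.random.default_rng(42)
n_studies = 500

# Hypothetical scenario: researchers anticipate the effect size they
# are chasing and run LARGER samples for SMALLER expected effects
# (as a power analysis would tell them to), so r(n, d) < 0.
true_d = rng.uniform(0.1, 0.8, n_studies)
n_per_group = np.clip(np.round(100 / true_d**2), 10, 2000).astype(int)

# Simulate each two-group study and compute the observed Cohen's d.
obs_d = np.empty(n_studies)
for i in range(n_studies):
    g1 = rng.normal(true_d[i], 1.0, n_per_group[i])  # e.g., bilinguals
    g2 = rng.normal(0.0, 1.0, n_per_group[i])        # e.g., monolinguals
    pooled_sd = np.sqrt((g1.var(ddof=1) + g2.var(ddof=1)) / 2)
    obs_d[i] = (g1.mean() - g2.mean()) / pooled_sd

# Approximate standard error of d, as plotted on a funnel plot's y-axis.
se = np.sqrt(2 / n_per_group + obs_d**2 / (4 * n_per_group))

# No publication bias was simulated: all 500 studies are "published".
print("Correlation between SE and observed d:",
      round(float(np.corrcoef(se, obs_d)[0, 1]), 2))
```

No publication bias is built in anywhere, yet the standard error and the observed effect size come out positively correlated: small-SE (large-n) studies cluster at small effects, large-SE (small-n) studies at large ones. That is precisely the lopsided funnel that asymmetry tests read as evidence of bias.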

So, the funnel plot asymmetry may not be decisive evidence for a publication bias against challenging results after all? I re-opened my notes on the BTS paper. So far, I had a classification system that didn’t seem sound to me and a funnel plot asymmetry that possibly did not offer evidence for a publication bias, despite the way it was presented in the paper. While such an asymmetry can indeed be caused by a publication bias, it can also arise for many other reasons, including pure chance. What’s the evidence for the publication bias, then?

I turned my attention to the dataset BTS used, which they make available as Supplementary Material. I was looking at the abstracts they had classified as non-published, when one title caught my attention. I was under the vague impression that I knew this work. But how could I know it, if I hadn’t attended that conference and the abstract hadn’t been published? I googled the title, and right in the first result, the published article appeared, with a title quite similar to that of the abstract. I checked the dataset again: indeed, this abstract was classified as non-published. Well, there must be an explanation for this, I thought. Perhaps it was published after the BTS cut-off point for inclusion in the dataset? I checked the dates, and it sure was. Matter closed.

Since I had evidence in front of me that some of the “non-published” abstracts were eventually published later on, I decided to check how many did make it into the published record. It was a fruitful search: one after another, the “non-published” abstracts turned up published outcomes. I didn’t even need to dig deep; usually the publications showed up among the top results Google returned. I decided to update the original BTS dataset. Once I was done, the result was surprising: BTS had excluded from their dataset all conference proceedings, book chapters, dissertations, etc. Some journal publications were missing as well. While BTS do say in the paper that only journal publications were considered, they do not say what happens if all the excluded published outputs are added to the dataset. Long story short, if all of them are included, the key effect regarding the publication bias against challenging results disappears, even if the BTS classification system is kept! While several of these published outputs indeed fall outside the BTS cut-off point in 2014, many of them were published within the period of time targeted by BTS, so in principle they could have been included in the dataset. Again, if we include only those within-the-cut-off outputs, the alleged publication bias disappears.

Taking stock:

1. The classification system looked non-optimal to me (and if we change it, the key finding of the BTS paper disappears).

2. The dataset excluded certain types of published outputs without justification (and if we add them, the key finding of the BTS paper disappears).

3. If we are to trust Simonsohn’s analysis, the funnel plot asymmetry does not provide evidence for the key finding either.

I remember thinking: Is there any evidence left in the BTS paper to support the claim that a publication bias hampers the publishability of results challenging the bilingual advantage hypothesis? I read and reread the paper, and I couldn’t find any. That’s a very well-known and well-cited paper, I thought; if its main finding is not as robust as it is portrayed to be, we should probably correct the record somehow.

I decided to write a paper summarizing my main findings. Around the time I was finishing it, a very esteemed and trusted colleague of mine drew my attention to a Special Issue planned by Behavioral Sciences. The title of the Special Issue is Cognitive and Linguistic Aspects of the Multilingual Advantage. How fitting, I thought. I submitted my paper there. It was published under the title A Classification Bias and an Exclusion Bias Jointly Overinflated the Estimation of Publication Biases in Bilingualism Research, after being read by four reviewers and going through two rounds of review (I chose to make the full peer-review history and materials open).

Incidentally, the paper was published during the last days of my pregnancy. As I always do with new publications, I linked the article to my website and made some brief reference to it on my social media (Facebook and X/Twitter), but that was it. I didn’t have time to send it to colleagues or discuss it with anyone. I knew that the BTS paper had been extensively featured in the media, but I didn’t have the bandwidth to even think about promoting my work in similar venues. I just saw it off and went to give birth, without returning to open it again.

I was just back from the hospital with a newborn in my arms when, exactly one month after the day my paper was published, the journal contacted me. I have seen many things in my years in academia, but I had never received such an unusual email before.

The email started by congratulating me profusely. “The reception of your paper has been truly remarkable, with it being viewed over 1,000 times in a remarkably short period,” the assistant editor wrote.

Oh, that’s nice, I thought. I paused and went to open the article. Close to 1,100 views. Ok. I resumed reading.

“This accomplishment is certainly praiseworthy, and we extend our heartfelt congratulations to you for this achievement.” Outlook got excited and added balloons and ribbons to the word ‘congratulations’; an explosion of happy colors and shapes appeared on my screen.

Thank you, you’re very kind. Oh wait, did I win a prize or something for the most read article of the month? Is this why you’re writing?! I continued reading, now with a certain excitement too.

“However, it is important to note that, based on the decision made by the Editorial Board, we would like to propose the removal of your manuscript from the Special Issue titled ‘Cognitive and Linguistic Aspects of the Multilingual Advantage’.”

Huh?! I read that again. And again. And again. Where did this plot twist come from?

“We deeply regret any inconvenience this may cause and understand that you may have been looking forward to its inclusion. Please accept our sincerest apologies for this situation.”

But what situation? What exactly is the problem? Am I still groggy from the epidural and that’s why I don’t understand anything?

“It’s crucial to emphasize that there is no difference in accessibility between regular and special issues; the primary distinction lies in the fact that special issue papers are grouped together on the special issue website post-publication, making it easier for readers to access related content.”

So, there is no difference in accessibility, except that there is one, because, unlike regular papers, Special Issues “make it easier for readers to access related content”?

“We kindly request your feedback on this matter. Your decision in this regard is important to us, and we are open to your preferences.”

My decision on what? Am I being asked to voluntarily remove my paper from a Special Issue after its relevance has been independently assessed by reviewers and handling editors? Also, didn’t you say that a decision had already been made by the Editorial Board? What is left for me to decide?

“We genuinely appreciate your understanding and cooperation, and we apologize for any disruption this may have caused. We eagerly await your response.”

Although I am on maternity leave, I didn’t want to leave them eagerly waiting. I wrote back saying that I would be grateful if they could help me understand the Editorial Board’s reasons for wanting to remove my paper from the Special Issue. On what grounds is this requested post-publication, after the article has been reviewed and assessed for relevance to this specific Special Issue?

After replying, I went back and reread the email. Its first paragraph sang my praises, making explicit reference to how many views the article had gathered in such a short period of time. The second paragraph starts with ‘however’, and ‘however’ connects two contrasting ideas: “However, it is important to note that, based on the decision made by the Editorial Board, we would like to propose the removal of your manuscript from the Special Issue…”. What is being contrasted here? What is the connection between the two paragraphs? Who requested the removal of my article from the Special Issue? Sure, the Editorial Board “made the decision”, but prompted by whom?

Five days have passed since I replied to them. They haven’t answered me back. In my previous experience with this journal, their average response time was less than four hours. I do not have any explanation as to what the issue at stake might be. All I can think of is that Special Issues concentrate on a particular theme and are hence linked to increased visibility on the given topic. It’s not me saying this; the journal does: “Behavioral Sciences runs special issues to create collections of papers on specific topics. The aim is to build a community of authors and readers to discuss the latest research and develop new ideas and research directions. […] Papers published in a Special Issue will be collected together on a dedicated page of the journal website.” Could it be that my paper, by virtue of criticizing the findings of a very highly regarded study, has created a stir, such that its overall accessibility (i.e., “the remarkable reception”, as the journal called it in the email) should somehow be managed? After all, they do highlight in their email that Special Issues make it easier for readers to access related content. I suppose it logically follows that removing a paper from a Special Issue makes it somewhat less easy for readers interested in a specific topic to access this related content.

When I got this “your work has been viewed a lot, now can we please remove it from the Special Issue?” email, I instantly remembered this story by Zoé Ziani, which I had read some days earlier.

During her PhD, Zoé started having suspicions about the validity of the effect reported in one of Gino’s famous and highly cited papers. As she says, she didn’t suspect fraud, only cherry-picking. When she shared her concerns, her supervisor dismissed them. Zoé kept digging. And her committee members kept trying to keep her criticism of Gino’s work under wraps: “After the defense, two members of the committee made it clear they would not sign off on my dissertation until I removed all traces of my criticism of CGK 2014 [Casciaro, Gino, Kouchaki; the paper in question]. Neither commented on the content of my criticism. Instead, one committee member implied that a criticism is fundamentally incompatible with the professional norms of academic research. She wrote that “academic research is like a conversation at a cocktail party”, and that my criticism was akin to me “storming in and shouting ‘you suck’ when you should be saying ‘I hear where you’re coming from but have you considered X’”. The other committee member called my criticism “inflammatory,” and lambasted me for adopting what he called a “self-righteous posture” that was “not appropriate”.”

Zoé eventually caved in, submitting a “censored” version of the dissertation, but she was determined to make the cut material publicly available later.

I can’t help wondering how many Zoés exist out there. What happens when you criticize a well-cited paper that is written by prominent scholars with impeccable track records, very prestigious grants, and important publications? If such criticisms stir the waters, is it natural that supervisors, committee members, editorial boards, or senior colleagues with some form of power try to keep them under control? If yes, how is the published record supposed to be corrected?

Let us be very clear: here we are talking about scientific results. The issue at stake is not one’s view or interpretation of the results, but the results themselves. If I cherry-pick the data I enter into an analysis, it is very likely that I will find a false result and thus report a false fact. This has important repercussions for scientific progress. Charles Darwin understood this well, which is why he chose the following lines to open the epilogue of one of the most influential books of all time:

“Many of the views which have been advanced are highly speculative, and some no doubt will prove erroneous; but I have in every case given the reasons which have led me to one view rather than to another. […] False facts are highly injurious to the progress of science, for they often endure long; but false views, if supported by some evidence, do little harm, for every one takes a salutary pleasure in proving their falseness: and when this is done, one path towards error is closed and the road to truth is often at the same time opened.” (Darwin’s ‘The Descent of Man’, p. 375; emphasis added).

Speaking of the road to truth, it is very possible that, walking down this path, one obtains surprising or unexpected findings. What do you do with them? As Ziani’s case made amply clear, the incentives to investigate and call out results that are not robust are non-existent. As she puts it, “the opposite is true: If you find something fishy in a paper, your mentor, colleagues, and friends will most likely suggest that you keep quiet and move on (or as I have learned the hard way, they might even try to bully you into silence). If you are crazy enough to ignore this advice, you are facing a Sisyphean task: Emailing authors to share data (which they do not want to), begging universities to investigate (which they do not want to), convincing journals to retract (which they do not want to), waiting months or years for them to share their findings with the public (if it ever happens)…”.

I think (but maybe it’s wishful thinking) that times are changing. The immense solidarity the scientific community showed the Data Colada authors, upon hearing the news of the massive lawsuit brought against them, proves this.

Speaking of lawsuits, in 2003, Barbra Streisand sued a photographer for $50M, claiming violation of privacy. The issue at stake was a specific picture of her Malibu property. Interestingly, this picture had been downloaded only a handful of times prior to the lawsuit. After the attempt to censor it, the views skyrocketed in less than a month. Streisand’s intention to control the picture’s accessibility backfired to the point that the Streisand Effect was named after her. It is very likely that none of this would have happened if she had simply let the picture exist out there, lost and never found in a sea of information. Probably we wouldn’t even know about it today. Probably you wouldn’t know about my paper in Behavioral Sciences either, if it weren’t for this post, which entirely owes its existence to the journal’s decision to alter the paper’s status post-publication, for reasons that remain unknown to me.

In relation to my work, while I still do not know the journal’s motive, the decision to remove my paper from the Special Issue is very likely to succeed. Frankly, I do not know how to stop it. Here is the thing, though: I cannot stop it, but I can reverse the consequences of such an action.

If the journal alleges a matter of relevance to the Special Issue, that relevance has already been assessed by the reviewers. What happens to the independence of the peer-review process if editorial boards intervene post-publication and re-assess relevance? Are we burying it somewhere? And anyway, how is a paper on the publication bias behind the bilingual advantage not a fit for a Special Issue on the Cognitive and Linguistic Aspects of the Multilingual Advantage?

If it is a matter of accessibility and visibility, sure, you can remove the article from the Special Issue, and I cannot stop you. But as Barbra would tell you while enjoying a nice piña colada in her Malibu mansion, this is not the best decision ever.

The road to truth has never been an easy one. Colonel Jessup knows.

Photo by Stefan Steinbauer on Unsplash
