Crap spotted at bioRxiv
Following my post it appears bioRxiv is trying to silence the author whose work I criticize. To make sure he has a voice I am sharing the link to his blog: https://sanchakblog.wordpress.com/
An excellent in-depth overview of biology preprints recently appeared online, and one of the concerns noted for preprints was that “they’ll wind up filling the internet with crap”. Right on cue, I’ve spotted some crap at bioRxiv.
To be sure, there is crap science everywhere you look, peer-reviewed or not, and I’ve noticed crap at preprint servers before. So then why am I highlighting this crap? This crap hit close to home because it is an attempt at post-publication peer review (PPPR), which I am trying to make a career out of. By performing PPPR on this attempt at PPPR, I hope to show the scientific community that crap preprints are not something to be afraid of as there is a simple way to deal with them — read them and review them.
I also don’t want the tone police like Susan Fiske to have any ammunition against us methodological terrorists. One criticism of public PPPR, whether it is via blogs or preprints, is that it is being done by amateurs instead of experts, and as a result is error-prone. By calling out this attempt at PPPR I want to make it clear that poor quality PPPR will not be tolerated and will be criticized as harshly as poorly conducted original research.
There are actually 3 preprints all criticizing the same article, all by the same author, Sandeep Chakraborty. They are here, here, and here. These 3 preprints are some of the worst work I have ever come across, and that is saying something since literally all I do is read bad work.
I will thoroughly rebut these articles, starting with the easiest-to-identify issues and ending with a detailed, point-by-point list of the errors made by the author. I will then discuss how you should perform PPPR and how bioRxiv should deal with pseudoscience and problem authors.
It doesn’t matter if you don’t have a high school diploma; just looking at these preprints should have raised immediate red flags.
Let me show you the meat of the first preprint:
That’s it, one paragraph. The paper the preprint is supposedly “taking down” sequenced hundreds of samples and trained machine learning classifiers with hundreds of genes. And yet Chakraborty thinks looking at the expression of a few genes is enough to show the paper is worthless? That shouldn’t pass the smell test.
The other two preprints are similarly brief, or actually even more so if you only consider unique text:
No, I didn’t accidentally post the same image twice, check the DOIs.
Look, I don’t really care about self-plagiarism, but when there is more recycled text than text in the results section, that’s just ridiculous. These 3 preprints clearly should have been 1 preprint. And even then, the preprint would have been ridiculously short. Here is an example of the appropriate length for a critique. Posting multiple, nearly identical manuscripts makes them annoying to review and comment on.
And there’s something I’ve never seen before. The author submitted a revision to the first preprint, but changed the title.
Again, check the DOIs. Who does this? This is closer to a completely different manuscript than a revision. Why is this allowed? Does Google Scholar handle this?
Another concerning aspect of the revision is that the table and 4 figures are missing. Why were they removed? Did the author realize he had made a mistake in the first version? If so, why is he tweeting at everyone about how sure he is of his results?
I can’t believe these preprints passed the screening process at bioRxiv, but they did, so here we are.
Anyone with a background in biology should be able to see the main criticism of the author doesn’t make any sense. I’ll just copy from his conclusion so I’m not accused of misrepresenting anything:
This raises serious doubts on using TEP as a possible ‘liquid biopsy’ candidate. Essentially, it refutes the hypothesis that platelets carry enough RNA-seq from tumors to make it viable as a diagnostic method.
Basically, the author seems to be under the impression that tumor-educated blood platelets (TEPs) are picking up tumor RNA, and that it is this RNA which is being used as biomarkers. The author rejects the viability of this method when he can’t find known tumor biomarkers in the provided RNA-seq data, and thus comes to the conclusion that the study is fatally flawed.
However, the study is not using a few RNA biomarkers from platelets, it is using hundreds of different genes. The study only depends on the fact that the platelet RNA is somehow perturbed, it doesn’t really matter how. As a result, the author’s main criticism is less than worthless.
The devil’s in the details
Although these preprints were highly unprofessional and were based on what I like to call “fuzzy science”, it is still possible that Chakraborty made some important points somewhere, somehow. I’ll show that this is unlikely: the preprints contain numerous errors and misunderstandings, which underscores that the author is unqualified to be critiquing the study. This is compounded by the fact that he made little to no effort to understand the paper or perform meaningful analyses.
In the first paper, Chakraborty provides a table with MET expression counts of some of the NSCLC platelet libraries:
Chakraborty compares the counts to a data set from an unrelated paper for a solid tumor. He seems to find the differences in expression concerning, presumably because in the original study some NSCLC patients were identified to have MET-overexpression, specifically 8 patients:
However, the MET-overexpression in this table is not referring to the expression in the platelets as Chakraborty assumes, but rather to the expression in the solid tumors. The methods make this clear:
Tumor tissues of patients were analyzed for the presence of genetic alterations by tissue DNA sequencing, including next-generation sequencing SNaPShot, assessing 39 genes over 152 exons with an average sequencing coverage of >500, including KRAS, EGFR, and PIK3CA (Dias-Santagata et al., 2010).
Assessment of MET overexpression in non-small cell lung cancer FFPE slides was performed by immunohistochemistry
I don’t notice anything in the study that suggests platelets should contain these tumor biomarkers, and in fact the paper suggests the exact opposite, namely that platelets contain low levels of these markers:
We selected platelet samples of patients with distinct therapy-guiding markers confirmed in matching tumor tissue. Although the platelet mRNA profiles contained undetectable or low levels of these mutant biomarkers…
As a result, the low expression counts of MET that Chakraborty listed in his table are entirely consistent with everything written in the original study, and are not concerning in the slightest.
Really we could stop here, but we’re not going to, because bad science pisses me off.
As mentioned above, in the revision the table was removed. The text was altered to state:
Both MET-overexpression and EGFR mutations use FFPE, which are solutions ‘designed to meet the challenges of analyzing degraded or limited genomic material’ (https://www.illumina.com/science/education/ffpe-sample-analysis.html).
It appears Chakraborty now realizes that the MET-overexpression mentioned in the study did not refer to the platelet data. However, given the link to Illumina, a sequencing company, and the mention of “genomic material”, he still doesn’t seem to realize that immunohistochemistry was used.
Since Chakraborty is no longer providing a table of MET expression, he altered the conclusion a bit. Let’s go through it line by line.
This study raises serious doubts on using TEP as a possible ‘liquid biopsy’ candidate. Essentially, it refutes the hypothesis that platelets carry enough RNA-seq from tumors to make it viable as a diagnostic method. This has been vaguely worded in the TEP-study — ‘contained undetectable or low levels of these mutant biomarkers’,
Instead of providing his own evidence for low expression of MET, it appears he is now just relying on the words of the original authors, and realizes now the platelets are meant to have low levels of these markers.
suggesting that other mRNA (”surrogate signatures”) might encode enough information for cancer diagnostics. Here, it is shown in details that most of the 1072 discriminator genes make no sense.
This is just delusional. He discussed the counts for a couple of genes. How is that showing “most of the 1072 discriminator genes make no sense”?
The onus lies on the authors of the study to show at least one gene that is differentially regulated in proximity to tumor cells to prove some sort of biological relevance.
The fact that machine learning can use the expression of hundreds of genes to create a classifier in no way suggests there will be a single gene with known biological relevance that is differentially regulated. The naivety of the author is on full display here.
With this knowledge of the first preprint, the second preprint is fairly humorous. The author searches for an explanation of the unusual MET overexpression in some tumor-educated platelet RNA-seq samples. But, as we know, the MET overexpression just refers to the immunohistochemistry of the solid tumors. In fact, every sample Chakraborty looked at in his first preprint had low MET expression. Shouldn’t he confirm the phenomenon exists before trying to find an explanation for said phenomenon? He is chasing ghosts. You should go read his just-so story, it’s great.
The third preprint goes over something called a Kappa statistic. I don’t really know what this is, or if the author’s concerns are valid. Even if his concerns are valid, it doesn’t really matter since they only seem to concern the classification of EGFR mutant status, which is a minor part of the study.
Perhaps Chakraborty will not agree with any of the above, so I am going to cover a blatant error by him which shows the carelessness of his critique and his limited understanding of the paper. And in doing so I’m going to show how post-publication peer review is performed by a professional.
Chakraborty writes in the first preprint revision:
‘Overexpression of MET protein in tumor tissue relative to adjacent normal tissues occurs in 25–75% of NSCLC and is associated with poor prognosis’…However, in the TEP-study only 13% (8 out of 60) are MET+, which seems too low (Table S1).
I looked at Table S1, there are only 24 NSCLC samples that underwent immunohistochemistry (indicated by the listing of either MET+ or MET WT). As a result, the percentage of MET overexpression is 8 out of 24, or 33%, which falls in Chakraborty’s desired range of 25–75%.
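The arithmetic is easy to check for yourself. Here is a minimal Python sketch using only the counts quoted above (8 MET+ samples, 24 samples that underwent immunohistochemistry, 60 samples as the preprint's denominator, and the 25–75% range). The function and variable names are my own, made up for illustration.

```python
def incidence_percent(positives, assessed):
    """Return positives as a percentage of the samples actually assessed."""
    if assessed <= 0:
        raise ValueError("assessed must be positive")
    return 100.0 * positives / assessed

# Counts quoted in the text; the names are illustrative only.
MET_POSITIVE = 8
ASSESSED_BY_IHC = 24    # samples listed as MET+ or MET WT in Table S1
PREPRINT_DENOM = 60     # denominator used in the preprint

LOW, HIGH = 25.0, 75.0  # range cited in the preprint

for label, denom in [("8/24", ASSESSED_BY_IHC), ("8/60", PREPRINT_DENOM)]:
    pct = incidence_percent(MET_POSITIVE, denom)
    in_range = LOW <= pct <= HIGH
    print(f"{label}: {pct:.0f}% (within 25-75%: {in_range})")
```

The only thing that changes between the two results is the denominator, which is the whole point: incidence should be computed over the samples that were actually tested.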
Okay, so you might remember Table 1 above from the original study, which seems to suggest the incidence of MET overexpression is 8 out of 60, as Chakraborty claims. This is what separates amateurs from professionals. When confronted with conflicting information you need to look for ways to resolve the discrepancy.
Let’s first repost Table 1 so we are on the same page:
In addition to this table, Figure 3 provides data for these mutations. Let’s take a look:
What we’re interested in is the “Actual Class” KRAS mutant, which is 26 samples, with 60 total. Going back to Table 1, there are 15+11=26 KRAS mutants. So far so good.
Let’s do another:
Figure 3E claims there are 21 EGFR mutants, with 60 total, and Table 1 claims there are 14+7=21 mutants, great.
Now let’s get to the money shot:
I was expecting to see 8 mutants, with 24 total, but there are 8 mutants with 23 total. Is this a typo?
To make matters worse, the percentages in Table 1 imply a denominator of 60, rather than 24 or 23. That is explainable if Table 1 is not showing incidence, but is showing presence, which is exactly what it is labeled as.
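This kind of cross-check between a table and a figure is mechanical enough to write down. The sketch below is only an illustration: the dictionary layout and the function name are hypothetical, and the numbers are just the ones I quoted above from Table 1, Table S1, and Figure 3.

```python
# Counts as quoted in the text; the layout and names are hypothetical,
# chosen only for this illustration.
table_counts = {
    "KRAS": {"mutant": 15 + 11, "total": 60},  # Table 1
    "EGFR": {"mutant": 14 + 7, "total": 60},   # Table 1
    "MET": {"mutant": 8, "total": 24},         # Table S1 (MET+ or MET WT)
}
figure_counts = {
    "KRAS": {"mutant": 26, "total": 60},  # Figure 3
    "EGFR": {"mutant": 21, "total": 60},  # Figure 3E
    "MET": {"mutant": 8, "total": 23},    # Figure 3F
}

def find_discrepancies(table, figure):
    """Return (marker, field, table_value, figure_value) for every mismatch."""
    mismatches = []
    for marker in table:
        for field in ("mutant", "total"):
            if table[marker][field] != figure[marker][field]:
                mismatches.append(
                    (marker, field, table[marker][field], figure[marker][field])
                )
    return mismatches

for marker, field, t, f in find_discrepancies(table_counts, figure_counts):
    print(f"{marker} {field}: table says {t}, figure says {f}")
```

With these numbers the only mismatch reported is the MET total, 24 versus 23.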
So how are we going to figure out this 24 vs 23 problem?
Email the authors.
That’s right, just ask the authors what’s up.
I got a prompt response that they had received the status of one of the MET WT samples after they had already made the figure, which is why the sample size is 1 less.
As a result, if Chakraborty had carefully read the paper he would have realized the incidence was 8 out of 24 instead of the 8 out of 60 he reported, an incontrovertible error.
Not only does he not grasp the basic facts of the paper he is critiquing, but he clearly lacks knowledge of the related biology and machine learning techniques. If he were a graduate student covering this paper for journal club I would be snickering in the back of the room. His critiques amount to nothing more than pseudoscience.
How post-publication peer review should be performed
Why am I so upset? Just put yourself in the shoes of the original authors, it’s not hard.
I constantly get emails from people who think they found a problem with OncoLnc. Most of these questions are stupid, but I quickly respond and try to explain to them why they are seeing what they are seeing.
But instead of emailing me imagine if these people went and posted 3 preprints on bioRxiv about how everything in OncoLnc is wrong. And then a bunch of people on Twitter tweeted about the preprints without even reading them. And then once I responded to the preprints people on Twitter criticized my response without ever reading the preprints, my original paper, or my response.
I would be pissed.
When did scientists become sheep? Can you read the work before commenting on it please?
With that said, post-publication peer review is extremely important and should be done in a public setting. But one thing people have to remember about public scientific arguments is that they are basically a game of poker, as Andrew Gelman brilliantly describes. But instead of money, scientific reputations are at stake.
When someone takes a position they are putting their reputation on the line. As a result, publicly posted critiques should be done as carefully as possible. If Chakraborty was concerned about this paper here’s what he should have done: replicated the study.
All the sequencing data is publicly available.
If he really believes the method is flawed then he shouldn’t be able to generate a classifier to differentiate healthy and cancerous samples. Or if he thinks the authors’ model suffers from overfitting and is not robust, then he should have shown the classifier doesn’t work on independent data sets.
But he didn’t do any of that. He had absolutely no comprehension of the study, and instead of emailing the authors his concerns and getting some clarity, he posted 3 short preprints that are in desperate need of a copy editor. He then went on Twitter to try and sound the alarms about the study. He basically committed a scientific crime.
I imagine some people might claim it’s too much to ask that critics reproduce the results of a study. But if you don’t have the knowledge and skill to replicate the analyses of the original authors, you are not qualified to criticize the work. And if you aren’t going to put in the effort then you don’t deserve to attack someone’s scientific reputation.
Going back to the poker analogy, I guess I have quite a few chips, as I’ve criticized a paper that was subsequently retracted, and am responsible for the Wansink investigation which is one of the larger scientific scandals in the world right now. I don’t know how many chips Chakraborty has since I don’t know if he’s been in any scientific disputes before. I suppose we’re both all-in, but I’m holding the nuts.
What should bioRxiv do?
Whether bioRxiv likes it or not, it is a brand, just like Nature and Cell are brands.
So when people hear a critique has been posted to bioRxiv, they can’t help but assume it is legitimate:
Obviously it is wrong to assume papers in specific journals are correct. The brand people should be looking at is the author, not the journal. If Brian Wansink publishes a Nature paper tomorrow everyone should assume it’s complete bullshit.
As a result, if you aren’t familiar with the author then you should be skeptical regardless of where the paper is published.
The fact that bioRxiv is not a journal is actually a positive, as it allows anyone to get exposure for their work. But just because bioRxiv doesn’t pass judgement on the science of papers doesn’t mean they can’t have standards.
Can we make sure preprints have more than one paragraph of results? Can we make sure people don’t post duplicate articles? Is that so hard?
I understand bioRxiv can’t prevent pseudoscience from getting past their screening process, but the question then becomes: what does bioRxiv do once it’s discovered a paper is pseudoscience?
Should they retract the paper? The way bioRxiv currently retracts papers is to scrub their existence from the web, leaving only a broken link behind. If that is how they are going to continue to handle retractions that isn’t the answer. If all the pseudoscience of an author is deleted from the web how will we know they are a pseudo-scientist? I would rather the articles remain up as a type of scarlet letter.
So then what do they do? I think they need to make it easy to see that an author has a history of shit posting, in the words of Reddit. At a minimum this would mean expressions of concern would be clearly posted on the shit posts, and ideally the author’s account would also be flagged. Perhaps the author’s future preprints should go through a more extensive screening process.
Does that sound extreme? I don’t know, Chakraborty has 9 preprints at bioRxiv, 3 of which have been confirmed to be shit posts. What are the chances the other 6 are also shit posts? What are the chances he will shit post again?
Look, I’m all for democratizing science and giving everyone a voice, so I don’t mind letting people shit post to their heart’s content, but we also need to flag these so they aren’t mistaken for actual research.
So what does bioRxiv do then? They clearly need some editors to handle scientific disputes, specifically accusations of shit posting. Will that cost money? Of course, but what are they doing with that Chan Zuckerberg funding?
Currently it appears that John Inglis is handling all scientific disputes, and is only retracting papers in extreme circumstances. As the number of preprints continues to increase, with the inevitable concurrent increase in shit posts, they’ll need more hands on deck. And if they don’t start flagging shit posts, preprints might become synonymous with pseudoscience.
P.S. This was meant as a PPPR of the bioRxiv preprints, not of the original study. I do not know if the Cancer Cell paper is correct, but I have no reason to believe it is wrong. The article appears carefully written and the authors promptly cleared up a discrepancy I had identified between Figure 3F and Table S1. The authors also should be commended for replying to the bioRxiv critiques despite the fact they are basically just gibberish. I’m sure the authors must have been flabbergasted that critiques of this quality were allowed to be posted, and it is experiences like this that could engender negative connotations about preprints.
To determine if the analyses in the paper are correct I would have to perform an independent replication, but I already have a backlog of data to analyze and this paper isn’t directly related to my research, so I don’t really care if it is correct or not. Hopefully some of the 92 people who have cited the work on Google Scholar have confirmed the method to be reliable.
I wish amateurs would leave PPPR to the professionals, we already have our hands full. We don’t need to be cleaning up your mess as well.