Sometimes Inaccuracy Helps

KAM
APBioNET Bioinformatics
1 min read · Mar 6, 2021

Photo by Casey Horner on Unsplash

I am revisiting something interesting that someone shared back in 2010 under the same title as this post. It was based on a Nature article (Ref.).

“…, we noticed errors in the software source code used to create the initial BLOSUM family of matrices…The result of these errors is that the BLOSUM matrices — BLOSUM62, BLOSUM50, etc. — are quite different from the matrices that should have been calculated using the algorithm described by Henikoff and Henikoff.”

“This case is noteworthy for three reasons: first, the BLOSUM matrices are ubiquitous in computational biology; second, these errors have gone unnoticed for 15 years; and third, the ‘incorrect’ matrices perform better than the ‘intended’ matrices.”

“Surprisingly, ‘fixing’ the matrices does not improve performance”

This has stuck with me to this day. I just can’t imagine the repercussions for biology if the errors had significantly affected homology search results and their interpretation. Imagine re-evaluating all the work done over the 15 years during which the errors went undetected.

As the authors state,

“software errors are quite common and nothing special”,

but it is the wide use of BLOSUM matrices, especially in BLAST searches, that made this so nerve-wracking.

What a relief!

©2022 KAM
