Python Package Arnie for Interacting with RNA Structures

Abish Pius
Computational Biology Papers
3 min read · Dec 9, 2023

Installing Arnie: Your Gateway to RNA Interaction

Arnie is a utility library that gives you a single Python interface to several RNA secondary structure prediction packages. Installing it takes one pip command. The leading ! in the line below runs it as a shell command from a Jupyter or Colab notebook cell; drop the ! if you are typing it into a terminal.

!pip install arnie

Arnie does not predict structures on its own; it wraps external predictors, so the next step is to install one. EternaFold is a prediction package trained on sequences gathered through the citizen science game Eterna. To install it from the bioconda channel, run:

!conda config --set auto_update_conda false
!conda install -c bioconda eternafold --yes

With EternaFold successfully installed, you’ve laid the groundwork for predicting RNA secondary structures.
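One step the commands above do not cover is telling Arnie where each predictor lives. As far as I understand from the Arnie README, the library reads a plain-text file of `name: path` lines and finds that file through the `ARNIEFILE` environment variable. The sketch below writes such a file. The helper `write_arnie_file`, the key names, and the EternaFold path are assumptions for illustration, so check them against the README and your own install before relying on them.

```python
import os
import tempfile


def write_arnie_file(path, locations):
    """Write 'name: path' lines, one per entry, and return the text written."""
    text = "".join(f"{name}: {location}\n" for name, location in locations.items())
    with open(path, "w") as handle:
        handle.write(text)
    return text


# Placeholder locations: replace with the directories on your machine.
locations = {
    "eternafold": "/path/to/eternafold/bin",  # assumed key name and path
    "TMP": tempfile.gettempdir(),             # assumed key for a scratch directory
}

arnie_file = os.path.join(tempfile.gettempdir(), "arnie_file.txt")
written = write_arnie_file(arnie_file, locations)

# Safest to set this before importing arnie so the library can find the file.
os.environ["ARNIEFILE"] = arnie_file
print(written)
```

If a later call to mfe complains that it cannot find a package, this file is the first thing to check.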

Making Predictions: The Power of Minimum Free Energy

Let’s put this to the test by predicting the secondary structure of a hammerhead ribozyme sequence. Arnie’s mfe function returns the Minimum Free Energy (MFE) structure, the single fold the chosen package scores as most stable, in the widely used "dot-bracket" notation.

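from arnie.mfe import mfe

# Hammerhead ribozyme sequence, written in the RNA alphabet (A, C, G, U)
sequence = "CGCUGUCUGUACUUGUAUCAGUACACUGACGAGUCCCUAAAGGACGAAACAGCG"
# Ask Arnie for the minimum free energy structure, using EternaFold as the predictor
structure = mfe(sequence, package="eternafold")
print(structure)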

The result is printed in dot-bracket notation: one character per base, where “.” marks an unpaired base and each “(” is paired with its matching “)” further along the sequence.
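To make the notation concrete, here is a minimal sketch that turns a dot-bracket string into a list of paired positions by matching brackets with a stack. The function name `dot_bracket_to_pairs` and the toy sequence are my own for illustration and are not part of Arnie. The sketch handles only “.”, “(” and “)”, so pseudoknot symbols such as “[” are rejected.

```python
def dot_bracket_to_pairs(structure):
    """Return sorted (i, j) index pairs (0-based) for a dot-bracket string."""
    stack = []
    pairs = []
    for index, symbol in enumerate(structure):
        if symbol == "(":
            # Remember the opening position until its partner shows up.
            stack.append(index)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {index}")
            pairs.append((stack.pop(), index))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {index}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


# Toy hairpin: three base pairs closing a three-base loop.
toy_sequence = "GGGAAACCC"
toy_structure = "(((...)))"
for i, j in dot_bracket_to_pairs(toy_structure):
    print(f"{toy_sequence[i]}{i + 1} pairs with {toy_sequence[j]}{j + 1}")
```

The same function should work on the structure string returned by mfe, as long as it contains no pseudoknot characters.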

Visualization: Bringing RNA Structures to Life

To enhance the visual interpretation of RNA structures, let’s introduce a visualization tool called draw_rna. This Das Lab tool allows you to plot RNA structures in 2D, offering a more intuitive representation.

!pip install draw_rna
from draw_rna.ipynb_draw import draw_struct
draw_struct(sequence, structure)

Now, instead of deciphering dot-bracket notation, you can effortlessly interpret and communicate RNA structures in a visually appealing manner.

Beyond Minimum Free Energy: Exploring Base Pair Probability

Arnie doesn’t stop at Minimum Free Energy predictions. It can also generate a ‘Base Pair Probability’ matrix. For a sequence of N bases this is an N × N matrix in which entry (i, j) is the predicted probability that base i pairs with base j, so it summarizes the whole range of structures the sequence might adopt rather than a single fold.

from arnie.bpps import bpps

# Keep the returned matrix so it can be inspected or plotted later
bpp_matrix = bpps(sequence, package="eternafold")

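Where the MFE structure commits to one answer, the matrix shows how confident the prediction is: probabilities near 1 indicate pairs that appear in almost every plausible structure, while many small values spread across a row indicate a base whose partner is uncertain.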
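Two common things to read off such a matrix are the confidently predicted pairs and the probability that each base stays unpaired. The sketch below does both on a small hand-made matrix. It assumes the matrix is square and symmetric with one row per base, which is what I would expect bpps to return; the helper names and the numbers are made up for illustration and are not EternaFold output.

```python
def unpaired_probabilities(bpp):
    """For each base, return 1 minus the summed probability of all its pairings."""
    return [1.0 - sum(row) for row in bpp]


def likely_pairs(bpp, threshold=0.5):
    """List (i, j, p) with i < j whose pairing probability is at least threshold."""
    n = len(bpp)
    return [
        (i, j, bpp[i][j])
        for i in range(n)
        for j in range(i + 1, n)  # upper triangle only, so each pair appears once
        if bpp[i][j] >= threshold
    ]


# Made-up symmetric 4 x 4 matrix for illustration only.
toy_bpp = [
    [0.0, 0.0, 0.1, 0.8],
    [0.0, 0.0, 0.0, 0.1],
    [0.1, 0.0, 0.0, 0.0],
    [0.8, 0.1, 0.0, 0.0],
]

print(likely_pairs(toy_bpp))
print([round(p, 2) for p in unpaired_probabilities(toy_bpp)])
```

Both helpers use only indexing and row iteration, so they should accept a NumPy array as well as nested lists.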

Together, Arnie, EternaFold, and draw_rna give computational biologists a compact toolkit for RNA secondary structure: predict a Minimum Free Energy structure, draw it in 2D, and compute base pair probabilities to see how certain the prediction is. Use this guide as a starting point for installing the tools and running your first predictions.

  • Parts of this article were written using Generative AI
