A Hot Analysis Of The Invasive European Fire Ant

Earlier last year, one of my professors contacted me to see if I would be interested in analyzing fire ant mitochondrial genes. Since there was a bioinformatics component involved, this project was right up my alley.


I always like to start any project by understanding the why. In this case, “why am I analyzing fire ant mitochondrial genes?” Without getting into the specifics (you’ll have to read the full paper for that), the European Fire Ant is an invasive species here in North America. The main goal of the research was to try to understand the mechanism behind its dominance. One of the hypotheses was that the infestation in British Columbia was a single, genetically similar super colony. In other words, all the European Fire Ants in BC were part of one very large ant colony as opposed to several smaller ant colonies.

European Fire Ant Distribution in BC — 2015

My role was to compare the cytochrome b (CYTB) and cytochrome c oxidase subunit I (COI) mitochondrial gene sequences from local ant species to those of the European Fire Ant to see how they differed. Since the Fire Ant DNA was collected from various parts of BC, I would be able to see any homology within the Fire Ant population with respect to those two sequences. If homology was present, this would provide support for the genetically similar super colony hypothesis.


I was provided with two FASTA files, each containing ant DNA sequences for either the CYTB gene or the COI gene. Using the NCBI database, I obtained Formica CYTB and COI mitochondrial sequences and added them to the FASTA files. The Formica sequences served as a baseline against which to measure differences in homology.
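To give a feel for what "adding the Formica sequences to the FASTA files" involves, here is a minimal Python sketch. This is not how I did it for the project; the record names and the short sequences below are made up purely for illustration, and a real workflow would read the files from disk.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def write_fasta(records, width=60):
    """Format (header, sequence) pairs as FASTA text, wrapping sequence lines."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"


# Hypothetical toy data: two field samples plus one reference sequence.
samples = ">ant_sample_1\nATGGCA\nTTAGGC\n>ant_sample_2\nATGGCATTAGGA\n"
reference = ">Formica_reference\nATGACATTCGGA\n"

# Append the reference record to the sample records, as described above.
combined = read_fasta(samples) + read_fasta(reference)
print(write_fasta(combined))
```

Note that the sequence for ant_sample_1 is split over two lines in the input; the parser joins those lines back into one sequence.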

Formica CYTB Sequence from NCBI Database

I first used several R packages to try to generate some preliminary trees, but this approach did not work very well. I ended up using a program called MEGA to generate the phylogenetic trees that I needed.

MEGA for Analysis

The first step before creating the trees for the CYTB and COI genes was to align the sequences in the FASTA files. MEGA7 has several tools for sequence alignment, such as Clustal and MUSCLE. For this analysis I used MUSCLE with the default settings (Fig A).

Fig A — Sequences aligned using MUSCLE

After alignment, the evolutionary distances between the sequences need to be determined. MEGA features a number of substitution models for estimating these distances, such as the Jukes-Cantor model, and it can use Maximum Likelihood Estimation to score how well each model fits the data. Based on the sequences present, MEGA ranks the candidate models so that the best-fitting one can be chosen.
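To make "evolutionary distance" a little more concrete, here is a minimal Python sketch of the Jukes-Cantor correction, the simplest of these models. It takes the observed proportion of differing sites p and corrects it for multiple substitutions at the same site: d = -3/4 ln(1 - 4p/3). The function names and the two short sequences are made up for illustration, and this is only a sketch of the formula, not a reproduction of MEGA's implementation.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, skipping any column that contains a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def jukes_cantor(seq_a, seq_b):
    """Jukes-Cantor distance: d = -3/4 * ln(1 - 4p/3)."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        # The logarithm is undefined once p reaches 3/4.
        raise ValueError("sequences too divergent for the Jukes-Cantor correction")
    return -0.75 * math.log(1 - 4 * p / 3)


# Hypothetical aligned fragments that differ at 2 of 10 sites.
a = "ATGGCATTAG"
b = "ATGACATTCG"
print(p_distance(a, b), jukes_cantor(a, b))
```

The corrected distance comes out slightly larger than the raw proportion of differences, because the model assumes some substitutions are hidden by later changes at the same site.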

Fig B — A list of the best possible models generated by MEGA

The Tamura 3-parameter model was selected for both the CYTB and COI sequences (Fig B). This model was used to generate the estimated distances for each of the genes.
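For the curious, here is a rough Python sketch of a pairwise Tamura 3-parameter distance. As I understand the published formula, it separates transitions (proportion P) from transversions (proportion Q) and accounts for GC content: d = -h ln(1 - P/h - Q) - (1/2)(1 - h) ln(1 - 2Q), with h = θ1 + θ2 - 2θ1θ2, where θ1 and θ2 are the GC contents of the two sequences. The function name and toy sequences are my own, and MEGA's handling of gaps and ambiguous bases may well differ from the simple filtering done here.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def tamura_3_parameter(seq_a, seq_b):
    """Pairwise Tamura 3-parameter distance for two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns where both sequences have an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    n = len(pairs)
    if n == 0:
        raise ValueError("no comparable sites")

    differences = sum(1 for a, b in pairs if a != b)
    # A transition stays within purines (A<->G) or within pyrimidines (C<->T).
    transitions = sum(1 for a, b in pairs
                      if a != b and ({a, b} <= PURINES or {a, b} <= PYRIMIDINES))
    P = transitions / n
    Q = (differences - transitions) / n

    gc_a = sum(1 for a, _ in pairs if a in "GC") / n
    gc_b = sum(1 for _, b in pairs if b in "GC") / n
    h = gc_a + gc_b - 2 * gc_a * gc_b
    if h == 0:
        raise ValueError("GC content makes the distance undefined")

    arg1 = 1 - P / h - Q
    arg2 = 1 - 2 * Q
    if arg1 <= 0 or arg2 <= 0:
        raise ValueError("sequences too divergent for this model")
    return -h * math.log(arg1) - 0.5 * (1 - h) * math.log(arg2)


# Hypothetical aligned fragments: one transition (G/A) and one transversion (A/C).
a = "ATGGCATTAG"
b = "ATGACATTCG"
print(tamura_3_parameter(a, b))
```

When both sequences have a GC content of exactly 50%, h equals 1/2 and the expression reduces to the Kimura 2-parameter distance, which is a handy sanity check.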

The last step before visualizing the trees was to determine their reliability. In other words, how much can we trust the evolutionary relationships that were inferred from our model? Bootstrap testing lets us estimate how reliable each node of the tree is: the alignment is resampled many times, a tree is rebuilt each time, and we count how often each node reappears. Bootstrap testing was done for 100 iterations.
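The resampling step is easier to picture with a small example. The Python sketch below draws alignment columns at random with replacement to build each pseudo-alignment, then reports how often a grouping shows up. In a real analysis a full tree is rebuilt from every replicate; here a simple mismatch count stands in for that, and the alignment, the names, and the grouping test are all made up for illustration.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement to build one replicate."""
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all sequences must have the same aligned length")
    # The same column indices are used for every sequence, so columns stay intact.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][i] for i in columns) for name in names}


def bootstrap_support(alignment, grouping_test, replicates=100, seed=0):
    """Percentage of replicates in which grouping_test(replicate) is True."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(replicates)
               if grouping_test(bootstrap_alignment(alignment, rng)))
    return 100 * hits / replicates


def mismatches(seq_a, seq_b):
    """Number of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(seq_a, seq_b))


def ants_group_together(aln):
    """Stand-in for tree building: is ant_1 closer to ant_2 than to formica?"""
    return mismatches(aln["ant_1"], aln["ant_2"]) < mismatches(aln["ant_1"], aln["formica"])


# Hypothetical toy alignment.
toy = {
    "ant_1": "ATGGCATTAGGC",
    "ant_2": "ATGGCATTAGGA",
    "formica": "ATGACATTCGGA",
}
print(bootstrap_support(toy, ants_group_together, replicates=100, seed=0))
```

The number printed is a percentage, which is the same scale as the bootstrap values that appear on the tree nodes.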


We finally get to the exciting part, which is visualizing the phylogenetic trees! From Fig C we can see that the end result looks all over the place. Note that the bootstrap values are present on the trees as well. In general, a node with a bootstrap value over 70 (that is, one recovered in more than 70 of the 100 bootstrap iterations) can be considered reliable.

Fig C— The COI phylogenetic tree with bootstrap values

By rooting each tree on the Formica sequences (that is, treating Formica as the common ancestor at the first node), the ant sequences diverge from that root according to their estimated distances. The end result for both trees can be seen in Fig D.

Fig D - CYTB and COI trees featuring bootstrap values with respect to Formica

To prevent intentional or unintentional bias, I was not told which sequences came from European Fire Ants and which came from the native BC ants. Rest assured, there will be an upcoming article about that along with a discussion of the rest of the paper. Thanks for reading!