Step-by-Step Guide: Introduction to Phylogenetic Analysis with MEGA

Author: Dr. Itunuoluwa Isewon

Email: itunu.isewon@covenantuniversity.edu.ng

You have been provided with two FASTA files: Aspergillus18S.fasta and Yeast18S.fasta.

📥 Dataset: Download the file for Aspergillus18S.fasta here

Download the file for Yeast18S.fasta here

Step 1: Import sequences into MEGA

Launch MEGA X.
File → Open a File/Session… → select Aspergillus18S.fasta.
When prompted, choose Align.
MEGA opens the Alignment Explorer showing your unaligned sequences.
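
Before importing, it can help to confirm that the FASTA file parses cleanly and contains only nucleotide characters. The sketch below is a minimal Python check and is not part of MEGA. The function names read_fasta and summarize are hypothetical, and the demo uses made-up toy sequences rather than the real contents of Aspergillus18S.fasta.

```python
from io import StringIO


def read_fasta(handle):
    """Parse FASTA text into {header: sequence}. Duplicate headers overwrite."""
    records = {}
    header = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def summarize(records):
    """Return (header, length, number of unexpected characters) per record."""
    allowed = set("ACGTUN-")  # assumption: plain nucleotide alphabet plus gaps
    return [
        (name, len(seq), sum(ch not in allowed for ch in seq))
        for name, seq in records.items()
    ]


# Toy input for illustration only. For the real file, use something like:
#     with open("Aspergillus18S.fasta") as fh:
#         records = read_fasta(fh)
toy = StringIO(">seq1 toy\nACGT\nACGT\n>seq2 toy\nacgtnn\n")
for row in summarize(read_fasta(toy)):
    print(row)
```

A nonzero count in the third column would suggest protein sequence or stray characters worth fixing before alignment.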

Step 2: Multiple Sequence Alignment (ClustalW)

In Alignment Explorer:

Align → Align by ClustalW.
Click Options and set/confirm the parameters below, then Compute.

After alignment:

• Scroll through the alignment. 18S has conserved stems and variable loops, so expect gaps mostly in the variable regions.

• If the first or last ~10–30 bases are gappy, select them and use Edit → Delete Selected Sites (or Mask) so they don’t add noise.

Save the alignment: Data → Export Alignment → MEGA format (.meg), and also export a FASTA copy for your records.
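
The end-trimming idea above can be sketched in Python for an exported FASTA alignment. This is an illustration under stated assumptions, not what MEGA does internally: the function names and the 50% gap threshold are hypothetical choices, and the alignment is toy data.

```python
def gap_fraction(alignment, col):
    """Fraction of sequences with a gap ('-') in the given column."""
    return sum(seq[col] == "-" for seq in alignment.values()) / len(alignment)


def trim_gappy_ends(alignment, max_gap_fraction=0.5):
    """Drop leading and trailing columns whose gap fraction exceeds the threshold.

    Interior columns are left untouched, mirroring the advice to trim only
    the ragged start and end of the alignment.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    length = lengths.pop()

    start = 0
    while start < length and gap_fraction(alignment, start) > max_gap_fraction:
        start += 1
    end = length
    while end > start and gap_fraction(alignment, end - 1) > max_gap_fraction:
        end -= 1
    return {name: seq[start:end] for name, seq in alignment.items()}


# Toy alignment (made up): ragged at both ends, one interior gap.
toy = {
    "s1": "--ACGTAC-",
    "s2": "-GACGTACG",
    "s3": "--ACG-AC-",
}
for name, seq in trim_gappy_ends(toy).items():
    print(name, seq)
```

The interior gap in s3 survives the trim, since only the ends are treated as noise here.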

Step 3: Build a Neighbor-Joining (NJ) tree

Close Alignment Explorer (save when prompted).
In the main MEGA window, choose Phylogeny → Construct/Test Neighbor Joining Tree.
In the Analysis Preferences dialog, set:

Analysis

• Scope → All Selected Taxa (or choose a subset beforehand).

• Statistical Method → Neighbor-joining.

Phylogeny Test (support)

• Test of Phylogeny → Bootstrap method.

• No. of Bootstrap Replications → 1000.

Rule of thumb: ≥70% = moderate support, ≥90% = strong.
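
To see what a bootstrap replicate is, the sketch below resamples alignment columns with replacement, which is the core step the bootstrap repeats (1000 times with the setting above) before rebuilding a tree from each replicate. It is a simplified Python illustration, not MEGA's code. The function names are hypothetical, the alignment is toy data, and the "weak" label for values under 70% is an added assumption beyond the rule of thumb.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one pseudo-replicate by sampling columns with replacement."""
    names = list(alignment)
    length = len(alignment[names[0]])
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][c] for c in cols) for name in names}


def support_label(percent):
    """Map a bootstrap percentage to the rule-of-thumb categories."""
    if percent >= 90:
        return "strong"
    if percent >= 70:
        return "moderate"
    return "weak"  # assumption: label for anything under 70%


# Toy alignment (made up).
toy = {"s1": "ACGTACGT", "s2": "ACGTTCGT", "s3": "ACGAACGA"}
rng = random.Random(42)  # fixed seed so the demo is repeatable
replicate = bootstrap_alignment(toy, rng)
for name, seq in replicate.items():
    print(name, seq)
print(support_label(95), support_label(72), support_label(40))
```

Each replicate has the same length as the original alignment, but some columns appear more than once and others not at all. A node's bootstrap value is the percentage of replicate trees that recover it.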

Substitution Model (Distances)

• Substitutions Type → Nucleotide.

• Model/Method → Maximum Composite Likelihood (MCL).

Alternatives: Tamura-Nei (TN93) or Kimura 2-parameter (K2P); try these if you want a sensitivity analysis.

• Substitutions to Include → d: Transitions + Transversions (include both).
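
To make the transition/transversion distinction concrete, here is a small Python sketch of the Kimura 2-parameter distance, one of the alternatives listed above. It is not the Maximum Composite Likelihood method and not MEGA's implementation. It uses the standard K2P formula d = -0.5 ln(1 - 2P - Q) - 0.25 ln(1 - 2Q), where P and Q are the proportions of transitions and transversions. The function name and toy sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned sequences.

    Sites where either sequence has a gap or ambiguous base are skipped.
    math.log raises ValueError if the sequences are too divergent (saturated).
    """
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


# Toy pair (made up): one transition (A->G) in ten sites.
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
# Toy pair (made up): one transversion (A->C) in ten sites.
print(k2p_distance("AAAAAAAAAA", "CAAAAAAAAA"))
```

Choosing "Transitions + Transversions" corresponds to counting both kinds of change, as both P and Q do in the formula.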

Rates and Patterns

• Rates among Sites → Uniform Rates (appropriate for conserved 18S). If analyzing more variable loci (e.g., ITS), consider Gamma Distributed and set the Gamma Parameter (shape, α).

• Pattern among Lineages → Same (Homogeneous).
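
The effect of the rates setting can be illustrated with the simplest substitution model. The Python sketch below uses Jukes-Cantor, chosen only for brevity; it is not one of the models listed above and not MEGA's code. It compares the uniform-rate distance d = -(3/4) ln(1 - 4p/3) with the gamma-corrected form d = (3/4) α [(1 - 4p/3)^(-1/α) - 1], where p is the proportion of differing sites. The function names and the example values of p and α are hypothetical.

```python
import math


def jc_distance(p):
    """Jukes-Cantor distance assuming uniform rates among sites."""
    return -0.75 * math.log(1 - 4 * p / 3)


def jc_gamma_distance(p, alpha):
    """Jukes-Cantor distance with gamma-distributed rates (shape alpha)."""
    return 0.75 * alpha * ((1 - 4 * p / 3) ** (-1 / alpha) - 1)


p = 0.10  # toy value: 10% of sites differ
print("uniform:", jc_distance(p))
for alpha in (0.5, 1.0, 5.0):  # toy shape values
    print("gamma, alpha =", alpha, ":", jc_gamma_distance(p, alpha))
```

Smaller α means stronger rate variation and a larger correction. As α grows, the gamma distance approaches the uniform one, which is why uniform rates are a reasonable starting point for a conserved marker.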

Data Subset to Use

• Gaps/Missing Data → Pairwise deletion (keeps more sites; good for rRNA with localized gaps). Complete deletion is stricter but can remove lots of signal.

• Select Codon Positions → Tick Noncoding Sites (18S rRNA is not protein-coding, so the codon-position boxes have no effect).
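
The difference between pairwise and complete deletion is easier to see on a small example. The Python sketch below uses a simple p-distance (proportion of differing sites) instead of the substitution model chosen above, and it is not MEGA's code. The function names and toy alignment are hypothetical, and treating N as missing data is an assumption.

```python
MISSING = set("-?N")  # assumption: gaps, '?' and 'N' all count as missing


def shared_sites(a, b):
    """Site pairs where both sequences have a real base (pairwise deletion)."""
    return [(x, y) for x, y in zip(a, b) if x not in MISSING and y not in MISSING]


def p_distance(a, b):
    """Proportion of differing sites among those both sequences share."""
    pairs = shared_sites(a, b)
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def complete_deletion(alignment):
    """Remove every column in which any sequence has missing data."""
    seqs = list(alignment.values())
    keep = [i for i in range(len(seqs[0])) if all(s[i] not in MISSING for s in seqs)]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


# Toy alignment (made up): s2 lacks the end, s3 lacks the start.
toy = {"s1": "ACGTACGT", "s2": "ACGTAC--", "s3": "--GTACGA"}
strict = complete_deletion(toy)

print("pairwise deletion, s1 vs s3:", len(shared_sites(toy["s1"], toy["s3"])),
      "sites, p =", p_distance(toy["s1"], toy["s3"]))
print("complete deletion, s1 vs s3:", len(strict["s1"]),
      "sites, p =", p_distance(strict["s1"], strict["s3"]))
```

With pairwise deletion, s1 and s3 are compared over six sites and their one difference is counted. Complete deletion keeps only the four columns present in all three sequences, and the difference disappears.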

Click Compute to infer the NJ tree.
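
For readers curious about what happens after Compute, here is a compact Python sketch of the neighbor-joining algorithm on a distance matrix. It follows the textbook procedure: pick the pair minimizing Q, join it, update distances, and repeat. It is an illustration, not MEGA's implementation. The function names are hypothetical, the five-taxon distances are toy values, and the output is an unrooted Newick string without bootstrap values.

```python
def symmetric(pairs):
    """Expand {(a, b): distance} into a symmetric dict-of-dicts matrix."""
    matrix = {}
    for (a, b), value in pairs.items():
        matrix.setdefault(a, {})[b] = value
        matrix.setdefault(b, {})[a] = value
    return matrix


def neighbor_joining(matrix):
    """Return an unrooted NJ tree (Newick) from a complete distance matrix."""
    d = {a: dict(row) for a, row in matrix.items()}
    if len(d) < 3:
        raise ValueError("need at least three taxa")
    while len(d) > 3:
        n = len(d)
        totals = {a: sum(d[a][b] for b in d if b != a) for a in d}
        nodes = list(d)
        best = None
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                q = (n - 2) * d[a][b] - totals[a] - totals[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        # Branch lengths from the joined pair to their new parent node.
        len_a = 0.5 * d[a][b] + (totals[a] - totals[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        new = f"({a}:{len_a:.4f},{b}:{len_b:.4f})"
        # Distances from the new node to every remaining node.
        new_row = {c: 0.5 * (d[a][c] + d[b][c] - d[a][b])
                   for c in d if c not in (a, b)}
        del d[a], d[b]
        for c in d:
            del d[c][a], d[c][b]
            d[c][new] = new_row[c]
        d[new] = new_row
    # Three nodes left: resolve the final branch lengths directly.
    a, b, c = list(d)
    len_a = 0.5 * (d[a][b] + d[a][c] - d[b][c])
    len_b = 0.5 * (d[a][b] + d[b][c] - d[a][c])
    len_c = 0.5 * (d[a][c] + d[b][c] - d[a][b])
    return f"({a}:{len_a:.4f},{b}:{len_b:.4f},{c}:{len_c:.4f});"


# Toy distances between five hypothetical taxa.
toy = symmetric({
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
})
print(neighbor_joining(toy))
```

In the analysis above, the input distances would come from the chosen substitution model and gap treatment, and the whole procedure would be repeated for every bootstrap replicate.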

Step 4: Inspect, root, and annotate

• Rooting: If your file contains a clear outgroup (e.g., a non-Aspergillus fungus), Tree Explorer → Tree → Root on that taxon. If not, use Midpoint rooting.

• Show bootstrap values: In Tree Explorer, View → Display Option → Show → Bootstrap values.

• Tidy labels: Edit → Replace tip labels can shorten long headers; keeping accession + species works well.

• Collapse weak nodes: Optionally collapse branches with support <50–70% to simplify.
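
Collapsing a weak node means dissolving it so that its children attach directly to its parent, forming a polytomy. The Python sketch below shows this on a small hand-built tree. The dict-based tree structure, the function names, and the support values are all hypothetical, and branch lengths are ignored for simplicity.

```python
def collapse_weak(node, threshold=50):
    """Return a copy of the tree with internal nodes below threshold dissolved.

    Leaves are {"name": ...}; internal nodes are
    {"children": [...], "support": percent}. The root has no support value.
    """
    if "children" not in node:
        return node
    new_children = []
    for child in node["children"]:
        child = collapse_weak(child, threshold)
        if "children" in child and child.get("support", 100) < threshold:
            new_children.extend(child["children"])  # promote grandchildren
        else:
            new_children.append(child)
    return {**node, "children": new_children}


def to_newick(node):
    """Serialize the tree, writing support values as internal node labels."""
    if "children" not in node:
        return node["name"]
    inner = ",".join(to_newick(child) for child in node["children"])
    label = str(node["support"]) if "support" in node else ""
    return f"({inner}){label}"


# Toy tree (made up): the (B,C) grouping has only 40% support.
tree = {"children": [
    {"support": 95, "children": [
        {"name": "A"},
        {"support": 40, "children": [{"name": "B"}, {"name": "C"}]},
    ]},
    {"name": "D"},
]}
print(to_newick(tree) + ";")
print(to_newick(collapse_weak(tree, threshold=50)) + ";")
```

After collapsing at 50%, A, B, and C form an unresolved group under the well-supported node, which matches what the data can actually justify.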