Step-by-Step Guide: Introduction to Phylogenetic Analysis with MEGA
Author: Dr. Itunuoluwa Isewon
Email: itunu.isewon@covenantuniversity.edu.ng
You have been provided with two FASTA files: Aspergillus18S.fasta and Yeast18S.fasta.
📥 Dataset: Download the file for Aspergillus18S here
Download the file for Yeast18S.fasta here
Step 1: Import sequences into MEGA
- Launch MEGA X.
- File → Open a File/Session… → select Aspergillus18S.fasta.
- When prompted, choose Align.
- MEGA opens the Alignment Explorer showing your unaligned sequences.
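Before importing, it can help to confirm that the FASTA file contains the sequences you expect. The following is a minimal Python sketch for that check, run outside MEGA; it is not part of MEGA itself. The function names parse_fasta and summarize are hypothetical, and the set of "expected" characters (A, C, G, T, U, N) is an assumption, so IUPAC ambiguity codes such as R or Y will be counted as unexpected.

```python
from pathlib import Path

# Characters treated as ordinary nucleotides (an assumption of this sketch).
EXPECTED = set("ACGTUN")


def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def summarize(records):
    """Return (header, length, count of unexpected characters) per record."""
    return [
        (header, len(seq), sum(ch.upper() not in EXPECTED for ch in seq))
        for header, seq in records
    ]


if __name__ == "__main__":
    fasta = Path("Aspergillus18S.fasta")  # file name used in this guide
    if fasta.exists():
        for header, length, odd in summarize(parse_fasta(fasta.read_text())):
            print(f"{header[:40]:40s} length={length} unexpected={odd}")
```

Sequences that are much shorter than the others, or that contain many unexpected characters, are worth inspecting before alignment.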
Step 2: Multiple Sequence Alignment (ClustalW)
In Alignment Explorer:
- Align → Align by ClustalW.
- Click Options and set/confirm the parameters below, then Compute.
After alignment:
• Scroll through the alignment. 18S has conserved stems and variable loops, so expect gaps mostly in variable regions.
• If the first/last ~10–30 bases are gappy, select them and use Edit → Delete Selected Sites (or mask them) so they don't add noise.
Save the alignment: Data → Export Alignment → MEGA format (.meg) and also FASTA for records.
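The idea of trimming gappy alignment ends can be sketched in Python on the exported aligned FASTA sequences. This is an illustration only, not what MEGA does internally: the function names are hypothetical, and the 0.5 gap-fraction threshold is an assumed cut-off that you should adjust to your data.

```python
def gap_fractions(aligned):
    """Fraction of sequences that have a gap ('-') at each alignment column."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    n = len(aligned)
    return [sum(seq[i] == "-" for seq in aligned) / n for i in range(length)]


def trim_bounds(aligned, max_gap=0.5):
    """Return (start, end) such that seq[start:end] drops gappy end columns.

    A column counts as gappy when its gap fraction exceeds max_gap
    (0.5 is an assumed threshold, not a MEGA default).
    Only the two ends are trimmed; internal gappy columns are kept.
    """
    fractions = gap_fractions(aligned)
    start = 0
    while start < len(fractions) and fractions[start] > max_gap:
        start += 1
    end = len(fractions)
    while end > start and fractions[end - 1] > max_gap:
        end -= 1
    return start, end


if __name__ == "__main__":
    toy = ["--ACGTAC-", "-GACGTACG", "--ACG-ACG"]  # toy alignment, not real data
    start, end = trim_bounds(toy)
    print([seq[start:end] for seq in toy])
```

The returned bounds tell you which columns to select and delete in Alignment Explorer.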
Step 3: Build a Neighbor-Joining (NJ) tree
- Close Alignment Explorer (save when prompted).
- In the main MEGA window, choose Phylogeny → Construct/Test Neighbor Joining Tree.
- In the Analysis Preferences dialog, set:
Analysis
• Scope → All Selected Taxa (or choose a subset beforehand).
• Statistical Method → Neighbor-joining.
Phylogeny Test (support)
• Test of Phylogeny → Bootstrap method.
• No. of Bootstrap Replications → 1000.
Rule of thumb: ≥70% = moderate support, ≥90% = strong.
Substitution Model (Distances)
• Substitutions Type → Nucleotide.
• Model/Method → Maximum Composite Likelihood (MCL).
Alternatives: Tamura-Nei (TN93) or Kimura 2-parameter (K2P); try these if you want sensitivity analysis.
• Substitutions to Include → d: Transitions + Transversions (include both).
Rates and Patterns
• Rates among Sites → Uniform Rates (appropriate for conserved 18S). If analyzing more variable loci (e.g., ITS), consider Gamma Distributed; MEGA will estimate the shape (α).
• Pattern among Lineages → Same (Homogeneous).
Data Subset to Use
• Gaps/Missing Data → Pairwise deletion (keeps more sites; good for rRNA with localized gaps). Complete deletion is stricter but can remove lots of signal.
• Select Codon Positions → Tick Noncoding Sites (coding positions have no effect for 18S).
- Click Compute to infer the NJ tree.
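To see what the distance and gap-handling settings mean, here is a small Python sketch of the Kimura 2-parameter (K2P) distance, one of the alternative models listed above, under both pairwise and complete deletion. K2P is used here only because it has a simple closed form, d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitions and transversions. It is not MEGA's Maximum Composite Likelihood method, and the function names are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def normalize(seq):
    """Upper-case the sequence and treat RNA 'U' as 'T'."""
    return seq.upper().replace("U", "T")


def k2p_distance(seq1, seq2):
    """K2P distance between two aligned sequences with pairwise deletion.

    Only columns where BOTH sequences carry A/C/G/T are compared; gaps and
    ambiguous characters are skipped for this pair alone.
    """
    n = transitions = transversions = 0
    for a, b in zip(normalize(seq1), normalize(seq2)):
        if a not in BASES or b not in BASES:
            continue  # pairwise deletion: skip this site for this pair only
        n += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if n == 0:
        raise ValueError("no comparable sites between the two sequences")
    p, q = transitions / n, transversions / n
    # math.log raises ValueError if the pair is too divergent (saturated).
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


def complete_deletion(aligned):
    """Drop every column in which ANY sequence lacks an A/C/G/T."""
    rows = [normalize(seq) for seq in aligned]
    keep = [i for i in range(len(rows[0])) if all(r[i] in BASES for r in rows)]
    return ["".join(r[i] for i in keep) for r in rows]


if __name__ == "__main__":
    toy = ["ACGTACGTAC", "ACGTGCGT-C", "AC-TACGTAC"]  # toy data, not real 18S
    print("pairwise:", k2p_distance(toy[0], toy[1]))
    trimmed = complete_deletion(toy)
    print("complete:", k2p_distance(trimmed[0], trimmed[1]))
```

With pairwise deletion each pair of sequences keeps every site they share, whereas complete deletion removes a column for all pairs as soon as one sequence has a gap there, which is why it can discard much more of the alignment.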
Step 4: Inspect, root, and annotate
• Rooting: If your file contains a clear outgroup (e.g., a non-Aspergillus fungus), Tree Explorer → Tree → Root on that taxon. If not, use Midpoint rooting.
• Show bootstrap values: In Tree Explorer, View → Display Option → Show → Bootstrap values.
• Tidy labels: Edit → Replace tip labels can shorten long headers; keeping accession + species in each label works well.
• Collapse weak nodes: Optionally collapse branches with support <50–70% to simplify.
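If you export the tree in Newick format, weakly supported nodes can also be listed with a short Python sketch like the one below. It assumes that bootstrap support is written as a percentage label directly after each closing parenthesis (for example ")95:0.05"); check your exported file, because this layout is an assumption and not guaranteed for every export. The function names are hypothetical, and quoted taxon labels containing parentheses are not handled.

```python
import re

# Internal-node label: digits right after ')' and before ':', ',', ';' or ')'.
SUPPORT_RE = re.compile(r"\)(\d+(?:\.\d+)?)(?=[:,;)])")


def support_values(newick):
    """Return the support labels found on internal nodes, in file order."""
    return [float(value) for value in SUPPORT_RE.findall(newick)]


def classify(value):
    """Apply the rule of thumb: >=90 strong, >=70 moderate, otherwise weak."""
    if value >= 90:
        return "strong"
    if value >= 70:
        return "moderate"
    return "weak"


if __name__ == "__main__":
    # Toy tree with made-up taxa and supports, for illustration only.
    tree = "((A:0.1,B:0.2)95:0.05,(C:0.1,D:0.1)62:0.02,E:0.3);"
    for value in support_values(tree):
        print(value, classify(value))
```

Nodes reported as weak are the candidates for collapsing in Tree Explorer.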
Step 5: Export & document
• File → Export Current Tree (Newick) → Aspergillus18S_NJ.nwk.
• Image export: File → Export Image (PNG/PDF/SVG) at 300–600 dpi.
• Save your .meg project so you can reopen without re-aligning.
Now repeat this process using Yeast18S.fasta.