Exploring the Genomic Features of the C. illinoinensis Mitochondrial Genome
Genomic Features of the C. illinoinensis Mitochondrial Genome
The mitochondrial genome of C. illinoinensis was sequenced and deposited in GenBank (BioProject accession PRJNA824975). A high-quality assembly and annotation were obtained by combining second-generation (Illumina) and third-generation (Nanopore) sequencing.
Sequencing Results
- Illumina Sequencing:
  - Reads: 19,113,015
  - Minimum Coverage: 52x
  - Average Coverage: 355.4x
- Nanopore Sequencing:
  - Reads: 1,629,169
  - Average Read Length: 6,007 bp
  - N50: 10,756 bp
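N50 is the length at which reads of that length or longer account for at least half of all sequenced bases. A minimal sketch of the calculation follows; the function name and the read lengths are hypothetical and are not the study's data.

```python
def n50(lengths):
    """Return the N50 of a collection of read (or contig) lengths."""
    if not lengths:
        raise ValueError("no lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half the
        # total number of bases is the N50.
        if running >= half_total:
            return length


# Hypothetical read lengths in bp, for illustration only.
example_lengths = [2_000, 4_000, 6_000, 10_000, 12_000]
print(n50(example_lengths))
```

Because a few long reads carry a large share of the bases, N50 is usually well above the average read length, as it is for the Nanopore data reported here.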
The mitochondrial genome is circular, measuring 495,205 bp. Its nucleotide composition is:
- A: 27.32%
- T: 27.70%
- G: 22.52%
- C: 22.46%
- GC Content: 44.98%
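Base composition of this kind can be computed directly from the assembled sequence. The sketch below assumes the genome is available as a plain DNA string; the function name and the toy sequence are illustrative and do not come from the study.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, T, G and C plus GC content for a DNA string."""
    seq = seq.upper()
    # Ignore ambiguity codes such as N so percentages sum to 100.
    counts = Counter(base for base in seq if base in "ATGC")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/T/G/C bases")
    percent = {base: 100 * counts[base] / total for base in "ATGC"}
    percent["GC"] = percent["G"] + percent["C"]
    return percent


# Toy sequence, not taken from the pecan mitogenome.
comp = base_composition("ATGCGCATTA")
print({base: round(value, 2) for base, value in comp.items()})
```

The reported figures are internally consistent: G (22.52%) plus C (22.46%) gives the stated GC content of 44.98%.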
The genome contains 64 annotated genes:
- 37 protein-coding genes (PCGs)
- 24 tRNA genes
- 3 rRNA genes
Protein-Coding Genes
The 37 PCGs are categorized into 10 groups. All PCGs begin with the start codon ATG. The stop codons used are:
- TAA: 54.05% (20/37)
- TGA: 32.43% (12/37)
- TAG: 13.51% (5/37)
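A tally like the one above can be produced by reading the last three bases of each coding sequence. This is a minimal sketch that assumes the CDS records are available as DNA strings ending in their stop codon; the function name and the toy sequences are hypothetical.

```python
from collections import Counter

STOP_CODONS = {"TAA", "TGA", "TAG"}


def stop_codon_usage(cds_sequences):
    """Return {stop codon: (count, percent)} from the last codon of each CDS."""
    tally = Counter()
    for cds in cds_sequences:
        last = cds.upper()[-3:]
        if last in STOP_CODONS:
            tally[last] += 1
    total = sum(tally.values())
    return {codon: (n, round(100 * n / total, 2)) for codon, n in tally.items()}


# Toy coding sequences, for illustration only.
toy_cds = ["ATGAAATAA", "ATGCCCTGA", "ATGGGGTAA", "ATGTTTTAG"]
print(stop_codon_usage(toy_cds))
```

Applied to the reported counts, 20/37, 12/37 and 5/37 round to 54.05%, 32.43% and 13.51%.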
The genome includes 10 intron-containing genes:
- ccmFC, cox2, rps2, rps19, rps4: 1 intron each
- nad4: 3 introns
- nad1, nad2, nad5, nad7: 4 introns each
tRNA Gene Structures
The genome contains 24 distinct tRNA genes responsible for transporting 20 amino acids. All tRNAs can form cloverleaf secondary structures. Five tRNAs exhibit variable regions creating stem loops.
Consensus base analysis showed:
- The acceptor arm and D-arm predominantly contain G nucleotides.
- The anticodon arm is typically 5 bp long, and the anticodon loop has 7 nucleotides.
Repeat Sequence Analysis
A total of 447 dispersed repeats were identified, categorized as:
- Forward: 241 (53.91%)
- Palindromic: 201 (44.97%)
- Reverse: 2 (0.45%)
- Complementary: 3 (0.67%)
Most repeats are 30-39 bp long and are primarily concentrated in intergenic spacers.
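The four repeat classes differ in how the second copy relates to the first. The sketch below assumes the commonly used definitions (forward: identical; reverse: read backwards; complementary: complemented; palindromic: reverse complement). The source does not name the tool or the definitions it used, so treat this as an illustration rather than the authors' method.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def repeat_partners(unit):
    """Return the sequence a second copy must match for each repeat class."""
    unit = unit.upper()
    complement = unit.translate(COMPLEMENT)
    return {
        "forward": unit,                  # same sequence, same direction
        "reverse": unit[::-1],            # same strand, read backwards
        "complementary": complement,      # complemented, same direction
        "palindromic": complement[::-1],  # reverse complement
    }


def classify_pair(first, second):
    """Name the repeat class relating two sequences, or None if unrelated."""
    for name, partner in repeat_partners(first).items():
        if second.upper() == partner:
            return name
    return None


# Toy 5-bp unit; real dispersed repeats here are mostly 30-39 bp.
for copy in ("AACGT", "TGCAA", "TTGCA", "ACGTT"):
    print(copy, classify_pair("AACGT", copy))
```

A sequence that equals its own reverse complement would satisfy more than one class; this sketch simply reports the first match.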
SSR Analysis
432 simple sequence repeats (SSRs) were found:
- Monomers: 162 (37.50%)
- Dimers: 189 (43.75%)
- Trimers: 22 (5.09%)
- Tetramers: 54 (12.50%)
- Pentamers: 5 (1.16%)
Over 81% of SSRs are monomeric or dimeric.
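Perfect SSRs can be located with a back-referencing regular expression. The sketch below is a simplified illustration: the minimum copy numbers are assumed values (the source does not state its thresholds), and the function names are hypothetical.

```python
import re

# Assumed minimum copy numbers per motif length; the source does not state
# the thresholds it used, so these values are illustrative only.
MIN_COPIES = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3}


def is_periodic(motif):
    """True if the motif is itself a shorter unit repeated (e.g. 'AA', 'ATAT')."""
    return motif in (motif + motif)[1:-1]


def find_ssrs(seq, min_copies=MIN_COPIES):
    """Return a sorted list of (start, motif, copies) for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for size, copies in min_copies.items():
        # One motif of the given size followed by at least copies-1 repeats.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_periodic(motif):
                continue  # belongs to a shorter motif class
            hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)


# Toy sequence: ten A's, then "AT" x 6, flanked by unrelated bases.
toy = "GC" + "A" * 10 + "C" + "AT" * 6 + "GGC"
print(find_ssrs(toy))
```

Counting the hits by motif length gives the monomer, dimer, trimer, tetramer and pentamer totals; dedicated SSR tools additionally handle compound and imperfect repeats, which this sketch ignores.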
Codon Preference
The protein-coding genes of the mitochondrial genome encode a total of 9,960 amino acids. The most common amino acids are:
- Ser: 905 (9.09%)
- Leu: 834 (8.37%)
- Ile: 750 (7.30%)
The most frequently used codons include UUU (Phe), AUU (Ile), and UUC (Phe).
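Codon and amino acid frequencies of this kind come from walking each coding sequence in steps of three. The sketch below assumes the standard genetic code and in-frame CDS strings; the function name and the toy sequence are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code (assumed here), built in TCAG order.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_and_amino_acid_counts(cds_sequences):
    """Tally codons and encoded amino acids across CDSs, excluding stop codons."""
    codons, amino_acids = Counter(), Counter()
    for cds in cds_sequences:
        cds = cds.upper().replace("U", "T")
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            aa = CODON_TABLE.get(codon)
            if aa is None or aa == "*":
                continue  # skip ambiguous codons and stop codons
            codons[codon] += 1
            amino_acids[aa] += 1
    return codons, amino_acids


# Toy CDS: Met-Phe-Phe-Ile-stop.
codons, amino_acids = codon_and_amino_acid_counts(["ATGTTTTTCATTTAA"])
print(codons.most_common(3), amino_acids.most_common(3))
```

Dividing each amino acid count by the total gives percentages like those listed above.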
RNA Editing Sites Prediction
A total of 482 RNA editing sites were predicted across the 37 PCGs. Most fall at the second codon position. Notably, 69.29% of the RNA editing events change the encoded amino acid, and leucine is the most common residue after editing.
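The effect of an editing event depends on which codon position it hits. The sketch below assumes C-to-U editing, the predominant type in plant mitochondria (the source does not state the editing type), and the standard genetic code; the function name and example codons are hypothetical.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code (assumed here), written with RNA bases.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def edit_codon(codon, position):
    """Apply a C-to-U edit at a 1-based codon position.

    Returns (edited codon, amino acid before, amino acid after).
    """
    codon = codon.upper().replace("T", "U")
    index = position - 1
    if codon[index] != "C":
        raise ValueError("C-to-U editing requires a C at the chosen position")
    edited = codon[:index] + "U" + codon[index + 1:]
    return edited, CODON_TABLE[codon], CODON_TABLE[edited]


print(edit_codon("UCA", 2))  # second-position edit: Ser becomes Leu
print(edit_codon("CCC", 3))  # third-position edit: Pro stays Pro
```

Second-position edits always change the amino acid under the standard code, which is consistent with most predicted sites being non-synonymous, and edits such as UCA to UUA or CCA to CUA produce leucine.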
DNA Migration Analysis
The mitochondrial genome is 3.08 times as long as the chloroplast genome. Approximately 8.74% of the mitochondrial genome consists of fragments that migrated from the chloroplast genome.
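The two figures can be turned into approximate lengths with simple arithmetic. The values below are derived from the rounded numbers reported in this summary, so they are estimates rather than measured results; the variable names are illustrative.

```python
MITO_LENGTH_BP = 495_205    # reported mitochondrial genome length
LENGTH_RATIO = 3.08         # reported mitochondrial-to-chloroplast length ratio
MIGRATED_FRACTION = 0.0874  # reported share of chloroplast-derived sequence

# Both results inherit the rounding of the reported ratio and percentage.
implied_chloroplast_bp = MITO_LENGTH_BP / LENGTH_RATIO
implied_migrated_bp = MITO_LENGTH_BP * MIGRATED_FRACTION

print(f"Implied chloroplast genome length: ~{implied_chloroplast_bp:,.0f} bp")
print(f"Implied chloroplast-derived sequence: ~{implied_migrated_bp:,.0f} bp")
```

This works out to a chloroplast genome of roughly 161 kb and about 43 kb of chloroplast-derived sequence in the mitochondrial genome.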
Phylogenetic Analysis
A phylogenetic tree constructed from 36 conserved mitochondrial PCGs shows a close relationship between C. illinoinensis and Fagus sylvatica. All nodes have high bootstrap support, indicating a well-supported tree topology.
This information serves as a foundation for further studies on the phylogenetic relationships within Carya species.
