Algorithm and data structure
- LGRM uses two closely related genomic sequences.
- One is called the reference genome sequence, and the other is called the comparative control genome sequence.
- Roughly, the comparative control genome is mapped onto the reference genome by nucleotide sequence homology, the result is attached to the reference genome, and it is displayed as the LGR lane.
- The flow of analysis is as follows.
- Perform DotPlot analysis between the two genomes and determine the main homologous path.
- Run a homology search against the reference genome, using the comparative control genome as the query.
- For this homology search, NCBI BlastN is used.
- Perform a pairwise alignment for each segment on the homologous path obtained from the homology search.
- For each segment, record the sites where the comparative control genome sequence has insertions or deletions relative to the reference genome sequence.
- Register each segment as a special Feature Key (lgrm_segment) on the reference genome sequence.
- Register each Feature that lies within a segment region of the comparative control genome as an LGRM Feature on the reference genome.
- Reloading the saved LGRM sequence file (GenBank format)
- Check whether LGRM is specified in Feature Layout Style.
- If LGRM is specified, the sequence lane is displayed in LGRM format and the sequence on lgrm_segment is displayed in the comparison reference base lane.
- If there is an LGR Map lane, display the LGRM Features on the LGR Map lane.
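The step that records insertion and deletion sites per segment can be illustrated with a minimal sketch. It assumes the pairwise alignment is available as two gapped strings of equal length, with "-" as the gap character. The function name `record_indels` and the `(kind, ref_pos, length)` tuple layout are hypothetical; this document does not state how LGRM stores these sites internally.

```python
from typing import List, Tuple


def record_indels(ref_aln: str, cmp_aln: str,
                  ref_start: int = 1) -> List[Tuple[str, int, int]]:
    """Return (kind, ref_pos, length) for each gap run in a pairwise alignment.

    kind is "insertion" when the comparative control sequence carries extra
    bases (gap in the reference row) and "deletion" when it lacks bases
    (gap in the comparison row). Positions are 1-based reference coordinates.
    For an insertion, ref_pos is the reference base it follows; for a
    deletion, ref_pos is the first deleted reference base.
    """
    if len(ref_aln) != len(cmp_aln):
        raise ValueError("aligned rows must have equal length")
    events = []
    ref_pos = ref_start - 1  # last reference base consumed so far
    i, n = 0, len(ref_aln)
    while i < n:
        r, c = ref_aln[i], cmp_aln[i]
        if r == "-" and c != "-":
            # Run of columns present only in the comparison sequence.
            j = i
            while j < n and ref_aln[j] == "-" and cmp_aln[j] != "-":
                j += 1
            events.append(("insertion", ref_pos, j - i))
            i = j
        elif c == "-" and r != "-":
            # Run of columns present only in the reference sequence.
            j = i
            while j < n and cmp_aln[j] == "-" and ref_aln[j] != "-":
                j += 1
            events.append(("deletion", ref_pos + 1, j - i))
            ref_pos += j - i
            i = j
        else:
            # Aligned column (match or mismatch), or a gap in both rows.
            if r != "-":
                ref_pos += 1
            i += 1
    return events


# Example: a segment whose first aligned reference base is position 101.
indels = record_indels("ACG-TACGT", "ACGGTA--T", ref_start=101)
print(indels)  # [('insertion', 103, 1), ('deletion', 106, 2)]
```

Keeping every position in reference coordinates matches the idea that all results are attached to the reference genome, so the lane display never needs the comparison coordinates to place an indel.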
Data storage method
- All analysis results of LGRM are recorded in the Feature section on the reference genome.
- Regions of the comparative control genome that are homologous to the reference genome are recorded as features with a special Feature Key called lgrm_segment.
- Each feature that is present on the comparative control genome and lies on a nucleotide sequence region homologous to the reference genome is recorded as a feature on the reference genome. Special qualifiers such as /lgrm_order are added to the features that originate from the comparative control genome.
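The storage scheme above can be sketched as a small data model. Everything except the names lgrm_segment and lgrm_order is an assumption made for illustration: the `Feature` and `Segment` classes, the `lift_feature` helper, and the choice to store the segment's ordinal number in /lgrm_order are hypothetical, since this document does not state what value that qualifier holds. The coordinate shift also assumes an ungapped segment; a real mapping would have to account for the recorded insertions and deletions.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Feature:
    key: str                      # Feature Key, e.g. "CDS" or "lgrm_segment"
    start: int                    # 1-based, inclusive
    end: int                      # 1-based, inclusive
    qualifiers: Dict[str, str] = field(default_factory=dict)


@dataclass
class Segment:
    order: int                    # hypothetical ordinal of the segment
    ref_start: int                # segment span on the reference genome
    ref_end: int
    cmp_start: int                # segment span on the comparison genome
    cmp_end: int


def segment_feature(seg: Segment) -> Feature:
    """Represent a homologous segment as an lgrm_segment feature on the reference."""
    return Feature("lgrm_segment", seg.ref_start, seg.ref_end)


def lift_feature(feat: Feature, seg: Segment) -> Optional[Feature]:
    """Copy a comparison-genome feature onto reference coordinates.

    Returns None when the feature is not fully inside the segment.
    Assumes an ungapped segment, so a constant offset is enough.
    """
    if feat.start < seg.cmp_start or feat.end > seg.cmp_end:
        return None
    offset = seg.ref_start - seg.cmp_start
    quals = dict(feat.qualifiers)
    quals["lgrm_order"] = str(seg.order)  # placeholder value, see lead-in
    return Feature(feat.key, feat.start + offset, feat.end + offset, quals)


seg = Segment(order=1, ref_start=1000, ref_end=1999, cmp_start=200, cmp_end=1199)
cds = Feature("CDS", 300, 600, {"gene": "abc"})
reference_features = [segment_feature(seg), lift_feature(cds, seg)]
```

Because both the segment and the lifted features end up as ordinary feature records on the reference genome, saving the sequence in GenBank format keeps the whole analysis result in one file.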