Various sequences can be mapped onto one or more genomic base sequences and registered as features.
Sequence files that can be mapped:
- EST / cDNA base sequence file
- EST / cDNA base sequence file (mapped onto multiple genome sequences)
- Amino acid sequence file
- dbSNP file
- JSNP file
- ABI / SCF trace file
- Tiling Array probe sequence file
- PCR primer sequence
- Restriction enzyme recognition sequence
Feature mapping is the function of mapping exported feature data onto a genomic sequence and registering it as a feature of that genomic sequence.
The position to be mapped can be identified in two ways: from the position information of the feature, or by homology search using the base sequence of the feature.
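The second approach, locating a feature from its base sequence, can be illustrated with a minimal sketch. It uses exact string matching on both strands as a simplified stand-in for a real homology search; the function names are hypothetical and this is not how IMC itself performs the search.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def locate_by_sequence(genome: str, feature_seq: str) -> list:
    """Find every exact occurrence of feature_seq on either strand.

    Returns (start, end, strand) tuples with 1-based inclusive coordinates
    on the forward strand. A palindromic sequence is reported once per strand.
    """
    genome_upper = genome.upper()
    queries = (
        ("+", feature_seq.upper()),
        ("-", reverse_complement(feature_seq).upper()),
    )
    hits = []
    for strand, query in queries:
        pos = genome_upper.find(query)
        while pos != -1:
            hits.append((pos + 1, pos + len(query), strand))
            pos = genome_upper.find(query, pos + 1)
    return hits
```

Exact matching of this kind is only suitable for short, error-free sequences such as primers or recognition sites. Longer or divergent sequences need an alignment-based search.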
EST (cDNA) mapping uses homology between the sequences in an EST (cDNA) sequence file and the reference genome base sequence to identify the genomic position from which each EST is derived, and registers an EST (cDNA) feature at that position.
Depending on the EST library, local redundancy may be high, and many EST features may be mapped to the same site on the genome.
To draw such locally redundant EST mapping results on the feature map, a feature layout style called Pack Lane is available. (For details, please refer to FLS: Feature Layout Style, Pack Lane.)
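How Pack Lane is implemented is not described here. As a rough illustration of the idea, overlapping features can be distributed over as few lanes as possible with a greedy first-fit pass. The function name and the data layout below are assumptions made for this sketch.

```python
def pack_lanes(features: list) -> list:
    """Assign each (start, end) feature to the first lane where it fits.

    Coordinates are inclusive. Returns one lane index per feature,
    in the same order as the input list.
    """
    order = sorted(range(len(features)), key=lambda i: features[i])
    lane_ends = []  # end coordinate of the last feature placed in each lane
    assignment = [None] * len(features)
    for i in order:
        start, end = features[i]
        for lane, last_end in enumerate(lane_ends):
            if start > last_end:  # no overlap with the lane's last feature
                lane_ends[lane] = end
                assignment[i] = lane
                break
        else:
            # Every existing lane overlaps, so open a new one.
            lane_ends.append(end)
            assignment[i] = len(lane_ends) - 1
    return assignment
```

With many ESTs stacked on one locus, the number of lanes equals the maximum depth of overlap, so the drawing stays as compact as the data allows.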
There are two methods of EST mapping.
- Mapping onto the base sequence displayed in the current main feature map
- Designating multiple genome base sequences and mapping multiple specified EST sequence files onto them
Trace mapping is a function to map the trace waveform data from the capillary sequencer onto the genomic base sequence from which it is derived.
Using the trace waveform viewer, you can view the aligned trace waveforms.
SNP mapping is a function to map variation data from dbSNP or JSNP onto the corresponding genomic base sequence.
Amino acid sequence mapping maps an amino acid sequence onto the reference genome base sequence and registers it as a new feature.
It maps one or more specified amino acid sequences onto the reference genome currently displayed in the feature map and registers each hit entry as a new feature (such as mRNA).
For the mapping, the tBlastN algorithm is used.
Mapping results are classified into three types.
- The input amino acid sequence matches the reference genome base sequence perfectly
- The input amino acid sequence matches the reference genome base sequence incompletely
- The input amino acid sequence does not match at all
Results of the first two types can be registered as new features on the reference genome. For example, perfectly matching sequences can be registered as mRNA features and incompletely matching sequences as misc_RNA features.
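The three-way classification can be sketched as follows. The criteria used here (full-length alignment with 100% identity counts as a perfect match) and all names are assumptions made for illustration; the actual criteria used by IMC are not stated in this document. The feature keys follow the example above, written in the INSDC spelling misc_RNA.

```python
from typing import Optional

# Feature key registered for each registrable category (from the example above).
FEATURE_KEY = {"perfect": "mRNA", "incomplete": "misc_RNA"}


def classify_hit(query_len: int, aligned_len: int, identities: int) -> str:
    """Classify one amino acid query by its best alignment.

    query_len   -- length of the input amino acid sequence
    aligned_len -- number of query residues covered by the alignment
    identities  -- number of identical residues in the alignment
    """
    if aligned_len == 0:
        return "no_match"
    if aligned_len == query_len and identities == query_len:
        return "perfect"
    return "incomplete"


def feature_key_for(category: str) -> Optional[str]:
    """Return the feature key to register, or None if nothing is registered."""
    return FEATURE_KEY.get(category)
```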
The same Value can be set for any Qualifier of the newly registered features. In addition, sequential numbers can be registered in Qualifier / locus_id, starting from an arbitrary number, with a specified prefix and a specified number of digits.
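As an illustration of this numbering scheme, the sketch below generates zero-padded sequential identifiers from a prefix, a starting number, and a digit count. The function name and its defaults are hypothetical.

```python
def make_locus_ids(count: int, prefix: str, start: int = 1, digits: int = 4) -> list:
    """Return `count` sequential IDs such as PREFIX0001, PREFIX0002, ...

    Numbers wider than `digits` are not truncated, only left unpadded.
    """
    return [f"{prefix}{number:0{digits}d}" for number in range(start, start + count)]
```

For example, `make_locus_ids(3, "ABC_", start=10, digits=5)` yields `ABC_00010`, `ABC_00011`, and `ABC_00012`.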
For eukaryotic genomes with introns, the exon and intron regions are identified and the hit is registered as a feature.
You can limit the maximum base length of introns.
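One way to picture the effect of the intron length limit: exon hits on the same strand are chained into a single feature only while the gap between consecutive exons stays within the limit. The sketch below assumes non-overlapping exon hits with 1-based inclusive coordinates; the function and parameter names are hypothetical.

```python
def chain_exons(exons: list, max_intron: int) -> list:
    """Group (start, end) exon hits into chains separated by long gaps.

    A new chain is started whenever the intron between two consecutive
    exons would be longer than max_intron bases.
    """
    chains = []
    for start, end in sorted(exons):
        if chains:
            previous_end = chains[-1][-1][1]
            intron_length = start - previous_end - 1
            if intron_length <= max_intron:
                chains[-1].append((start, end))
                continue
        chains.append([(start, end)])
    return chains
```

A smaller limit keeps distant hits from being joined into one implausible gene model, at the cost of splitting genes that really do contain long introns.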
Before performing array analysis, the tiling array probes must be mapped onto the target genomic base sequence, and the genomic position and information of each probe registered as a feature.
Array expression data is obtained per probe, so when an expression file is loaded, each expression level is attached to the feature of the corresponding probe.
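The per-probe attachment can be sketched as a join on the probe ID. The two-column tab-delimited format (probe ID, expression value), the `expression` key, and the function name are assumptions made for this example and do not describe the actual expression file format read by IMC.

```python
import csv
import io


def attach_expression(probe_features: dict, expression_tsv: str) -> list:
    """Attach expression values to probe features that share the same ID.

    probe_features maps probe ID -> feature dict and is updated in place.
    Returns the probe IDs found in the file but not among the features.
    """
    unmatched = []
    reader = csv.reader(io.StringIO(expression_tsv), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        probe_id, value = row[0], float(row[1])
        feature = probe_features.get(probe_id)
        if feature is None:
            unmatched.append(probe_id)
        else:
            feature["expression"] = value
    return unmatched
```

Because the join key is the probe ID, probes that were never mapped as features cannot receive expression values, which is why probe mapping has to come first.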
In IMC, the Blast search results for each CDS feature identified on the genome sequence can be saved as qualifiers of that feature.
When the Blast search is run on an external server or similar, the Blast search result mapping function maps the results onto the current genome sequence and writes them to the qualifiers of each CDS feature.
To map Blast search results correctly, export the Feature Key Search results in advance and use them as the query sequences.
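The reason for using exported features as queries is that each result line then carries an ID that can be matched back to a CDS feature. The sketch below assumes the results are in the standard 12-column BLAST+ tabular format (query ID, subject ID, percent identity, ..., e-value, bit score) and that query IDs equal feature IDs. The qualifier name `blast_hit` and the function names are hypothetical, and the file format IMC actually accepts may differ.

```python
def best_hits(blast_tab: str) -> dict:
    """Return the highest bit score hit per query from BLAST tabular text."""
    best = {}
    for line in blast_tab.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        query = fields[0]
        hit = {
            "subject": fields[1],
            "identity": float(fields[2]),
            "evalue": float(fields[10]),
            "bitscore": float(fields[11]),
        }
        if query not in best or hit["bitscore"] > best[query]["bitscore"]:
            best[query] = hit
    return best


def write_qualifiers(cds_features: dict, hits: dict) -> list:
    """Write each hit to the CDS feature with the same ID.

    Returns the query IDs that have no matching feature.
    """
    unmatched = []
    for query, hit in hits.items():
        feature = cds_features.get(query)
        if feature is None:
            unmatched.append(query)
            continue
        feature.setdefault("qualifiers", {})["blast_hit"] = (
            f"{hit['subject']} identity={hit['identity']} evalue={hit['evalue']}"
        )
    return unmatched
```

Queries that come back in the unmatched list are the ones whose IDs did not originate from the exported features, which is the situation the export step is meant to avoid.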
This function maps reads from a next-generation sequencer (NGS) onto the reference genome sequence.
It is implemented in the following software.