Genome Survey Sequence
Efficient, Accurate, Fast
A genome survey, as the name indicates, is a rapid characterization of a genome in which a limited number of small-fragment libraries are sequenced at low depth. With the help of K-mer analysis, a genome survey provides estimates of genome size, heterozygosity, and repeat content, which are crucial in determining the sequencing strategy for whole-genome de novo sequencing.
Principle of Genome Survey
Initial characterization of a genome by genome survey
- Repetitive ratio
- GC content
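The K-mer approach to genome size estimation can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used for the service: the function names, the `min_depth` cutoff for discarding likely error K-mers, and the simple "total K-mers divided by peak depth" estimator are all assumptions made for the example. It also counts K-mers on one strand only, whereas production tools usually count canonical K-mers.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurring in a list of read strings (single strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_histogram(counts):
    """Map k-mer depth (multiplicity) to the number of distinct k-mers at that depth."""
    return Counter(counts.values())


def estimate_genome_size(histogram, min_depth=3):
    """Estimate genome size as total k-mers / depth of the main peak.

    min_depth is an assumed cutoff: k-mers seen fewer times are treated
    as sequencing errors and ignored.
    """
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    # The main peak is the depth shared by the largest number of distinct k-mers.
    peak_depth = max(usable, key=usable.get)
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth, peak_depth


# Synthetic histogram: an error spike at depth 1-2 and a main peak at depth 20.
hist = {1: 5000, 2: 300, 18: 100, 19: 400, 20: 1000, 21: 400, 22: 100}
size, peak = estimate_genome_size(hist)
print(size, peak)
```

In real data, a secondary peak at about half the main depth suggests heterozygosity, and a long tail at high depth suggests repeats; this sketch does not model either.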
Applications of Genome Survey
Providing a basic characterization of a genome and estimating the difficulty of genome assembly.
Guiding the design of library construction and sequencing strategy for large-scale de novo genome sequencing.
Revealing genomic differences between related species.
Genome Survey with Biomarker Technologies
Clear K-mer frequency statistics. Accurate estimation of genome size, heterozygosity, repetitive ratio, etc.
Over 1,000 genome surveys completed. Accumulated experience with over 300 species, covering forest species, marine organisms, animals, plants, etc.
Contributed to many high-impact genome publications.
1. What is de novo genome sequencing?
Answer: De novo genome sequencing refers to sequencing a novel genome without a reference. It enables the construction of genomes for novel species and the updating of existing reference genomes. The whole process includes DNA library construction, sequencing, read assembly, and annotation with bioinformatic tools.
2. What are the advantages of a TGS-based genome over an NGS-based genome?
Answer: Third-generation sequencing (TGS) is characterized by long reads with an average length of 10-15 kb, whereas next-generation sequencing (NGS) produces paired-end reads of 125-250 bp (PE125 to PE250). Assembling NGS reads can therefore be problematic, especially in repetitive sequences and heterozygous regions. Because long reads can span these complicated regions, TGS data largely improve the quality of genome assembly.
3. TGS is also known for its higher error rate. Is it still suitable for genome sequencing?
Answer: The known error rate refers to errors in base calling, which can be corrected by increasing sequencing depth. Data with 30x coverage can achieve single-base accuracy above 99.99%. Therefore, TGS data are fully suitable for genome assembly.
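Why depth corrects base-calling errors can be illustrated with a simple majority-vote model. This is a sketch under stated assumptions, not the correction method of any particular assembler: it assumes errors are independent between reads, uses a hypothetical per-read error rate of 15% (a figure not given above), and pessimistically treats every erroneous read as reporting the same wrong base, with ties counted as failures.

```python
from math import comb


def consensus_error(depth, per_read_error):
    """Probability that at least half of `depth` reads are wrong at one position.

    Simplified binomial model: errors are independent across reads, and a
    tie (exactly half the reads wrong) is counted as a consensus error.
    """
    threshold = (depth + 1) // 2  # smallest number of wrong reads that is >= depth / 2
    return sum(
        comb(depth, k) * per_read_error**k * (1 - per_read_error) ** (depth - k)
        for k in range(threshold, depth + 1)
    )


# Assumed per-read error rate of 15%; compare a few sequencing depths.
for depth in (1, 10, 30):
    print(depth, consensus_error(depth, 0.15))
```

Under these assumptions the consensus error at 30x falls below 1 in 10,000, which is consistent with the accuracy quoted above. Real error profiles are not perfectly random, so actual figures depend on the platform and the correction software.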
4. How to choose samples for genome sequencing?
Answer: Samples for genome sequencing should be taken from the same organism that was used for the genome survey. For plants, bud cultures or fresh, uncontaminated leaves are recommended. For animals, whole blood and viscera are recommended.