Datasets and codes for repeats annotation in genome
Main Author: | Zeng, Lu |
---|---|
Other Authors: | Kortschak, Dan, Raison, Joy, Bertozzi, Terry, Adelson, David |
Format: | Dataset |
Published: | Mendeley, 2018 |
Subjects: | |
Online Access: | https://data.mendeley.com/datasets/k88h5xnhcb |
Table of Contents:
- 1) Files beginning with 'all_retrovirus.' in 'report_run' are the data we used to identify retroviruses (see details in supplementary 1.5.3).
- 2) Files beginning with 'GB_TE.new' in 'GBTE_data' are the index files we used to identify reverse transcriptase and TE sequences from NCBI (see details in supplementary 1.5.3).
- 3) 'report_run' contains the code used to run reportsJ.pl (see details in supplementary 1.5.3).
- 4) Files beginning with 'sprot.' in 'report_run' are the index files we used to identify proteins (see details in supplementary 1.5.3).
- 5) 'Vertebrate_use.fa' contains the vertebrate repeat consensus sequences downloaded from Repbase; we used it as the CENSOR library (see details in supplementary 1.5.1).
- 6) 'our_known_reps_20130520' was used in the first CENSOR run (see details in supplementary 1.5.1).
- 7) 'RepBase20.04.fasta' was used in the last step of TE annotation; it contains the CENSOR TE references.