Author: Damian Kao; Alvina G. Lai; Evangelia Stamataki; Silvana Rosic; Nikolaos Konstantinides; Erin Jarvis; Alessia Di Donfrancesco; Natalia Pouchkina-Stantcheva; Marie Sémon; Marco Grillo; Heather Bruce; Suyash Kumar; Igor Siwanowicz; Andy Le; Andrew Lemire; Michael B. Eisen; Cassandra Extavour; William E. Browne; Carsten Wolff; Michalis Averof; Nipam H. Patel; Peter Sarkies; Anastasios Pavlopoulos; A. Aziz Aboobaker
Title: The genome of the crustacean Parhyale hawaiensis: a model for animal development, regeneration, immunity and lignocellulose digestion
Document date: 2016_7_25
ID: 57sp9d9l_1
Document: 5.4kb, similar to intron size in H. sapiens (5.9kb) but dramatically longer than introns in D. pulex (0.3kb), D. melanogaster (0.3kb) and C. elegans (1kb) (Figure 5B). For downstream analyses of Parhyale protein coding content, a final proteome consisting of 28,666 proteins was generated by combining candidate coding sequences identified with TransDecoder [57] from mixed stage transcriptomes. Almost certainly the high number of predicted gene models and proteins is an overestimation due to fragmented genes, very different isoforms or unresolved alleles that will be consolidated as annotation of the Parhyale genome improves. We also included additional high confidence gene predictions that were not found in the transcriptome (Figure 4C). The canonical proteome dataset was annotated with Pfam, KEGG, and BLAST against UniProt. Assembly quality [...] which are fragments that do not contain a large ORF, also mapped to the assembled genome. Together these data suggest that our assembly is close to complete with respect to protein coding genes and transcribed regions that are captured by deep RNA sequencing.
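The intron size comparison above rests on measuring the gaps between consecutive exons of each gene model. As a minimal sketch of how an average intron length could be derived from exon coordinates, the Python below assumes 1-based inclusive coordinates (as in GFF-style annotation). The function names and the toy coordinates are hypothetical and are not taken from the paper or its pipeline.

```python
from statistics import mean


def intron_lengths(exons):
    """Return intron lengths for one transcript.

    `exons` is a list of (start, end) pairs in 1-based inclusive
    coordinates (an assumption; the paper does not specify a format).
    """
    ordered = sorted(exons)
    # An intron spans the bases strictly between two neighbouring exons.
    return [
        next_start - prev_end - 1
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]


def mean_intron_length(transcripts):
    """Average intron length over all transcripts; 0.0 if there are no introns."""
    lengths = [
        length
        for exons in transcripts.values()
        for length in intron_lengths(exons)
    ]
    return mean(lengths) if lengths else 0.0


# Toy coordinates invented purely for illustration, not real Parhyale data.
toy = {
    "tx1": [(100, 200), (1201, 1300), (6301, 6400)],
    "tx2": [(50, 150), (651, 700)],
}

if __name__ == "__main__":
    print(intron_lengths(toy["tx1"]))
    print(mean_intron_length(toy))
```

Single-exon transcripts contribute no introns, so they do not affect the average in this sketch.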