Author: Wang, Yanli; Addess, Kenneth J.; Chen, Jie; Geer, Lewis Y.; He, Jane; He, Siqian; Lu, Shennan; Madej, Thomas; Marchler-Bauer, Aron; Thiessen, Paul A.; Zhang, Naigong; Bryant, Stephen H.
Title: MMDB: annotating protein sequences with Entrez's 3D-structure database
Document date: 2006_11_29
ID: 6qpsxmgi_6
Document: In the Entrez database system, protein sequences are neighbored to each other by comparing each newly entered sequence to all other database entries. These database scans are run with the BLAST (5) engine, which identifies sequence neighbors with significant similarity. The resulting sequence identifiers and taxonomy indices are stored so that Entrez can provide 'Related Sequences' links for all protein records in the collection. The 'Related Structure' service is built on top of this system. Sequence neighbors directly linked to MMDB are identified, and their alignments are recomputed with the 'BlastTwoSequences' tool (9) to restore alignment footprints. The 'Related Structure' web interface provides direct access to this information. Initially, this service was restricted to sequences from microbial genomes (10), but it has now been expanded to cover all proteins in Entrez and is updated daily to provide a comprehensive 3D-structure annotation service. Structure-linked neighbors can also be identified, and sequence-structure alignments visualized, using Entrez and the Cn3D alignment viewer/editor, but 'Related Structures' provides a convenient new summary and 'one click' shortcuts to 3D visualization. These 3D views may be used to identify conserved residues and to map site-specific features derived from the 3D structure. Currently, 48% of non-identical protein sequences in Entrez have been linked to at least one related structure, employing a conservative threshold for alignment length (50 aligned residues or more) and similarity (30% or more identical residues in the aligned footprint); see Figure 1 for details.
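The conservative threshold quoted above (at least 50 aligned residues and at least 30% identical residues in the aligned footprint) can be illustrated with a minimal Python sketch. This is not the MMDB or Entrez implementation; the `Alignment` record, its field names, the function names, and the example identifiers are all hypothetical, and the sketch assumes that percent identity is computed over the aligned footprint only.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Thresholds as stated in the text; the constant names are assumptions.
MIN_ALIGNED_RESIDUES = 50
MIN_PERCENT_IDENTITY = 30.0


@dataclass(frozen=True)
class Alignment:
    """Hypothetical record for one sequence-to-structure alignment."""
    query_id: str            # protein sequence identifier (illustrative)
    structure_id: str        # structure identifier (illustrative)
    aligned_residues: int    # length of the aligned footprint
    identical_residues: int  # identical residues within that footprint

    @property
    def percent_identity(self) -> float:
        # Identity is measured over the aligned footprint (assumption).
        if self.aligned_residues == 0:
            return 0.0
        return 100.0 * self.identical_residues / self.aligned_residues


def passes_threshold(aln: Alignment) -> bool:
    """Return True if the alignment meets both the length and identity cutoffs."""
    return (
        aln.aligned_residues >= MIN_ALIGNED_RESIDUES
        and aln.percent_identity >= MIN_PERCENT_IDENTITY
    )


def related_structures(alignments: Iterable[Alignment]) -> List[Alignment]:
    """Keep only alignments that would count as 'related structure' links."""
    return [aln for aln in alignments if passes_threshold(aln)]


if __name__ == "__main__":
    candidates = [
        Alignment("seq1", "structA", 120, 48),  # long enough, 40% identity
        Alignment("seq1", "structB", 40, 40),   # too short
        Alignment("seq1", "structC", 200, 50),  # only 25% identity
    ]
    for aln in related_structures(candidates):
        print(aln.structure_id, aln.percent_identity)
```

Under these assumptions, a sequence counts toward the 48% figure as soon as at least one of its alignments to a structure-linked neighbor survives the filter.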