- RepeatsDB 2.0: improved annotation, classification, search and visualization of repeat protein structures. [PMID: 27899671]
Lisanna Paladin, Layla Hirsh, Damiano Piovesan, Miguel A Andrade-Navarro, Andrey V Kajava, Silvio C E Tosatto
Nucleic acids research 2017:45(D1)
0 Citations (Google Scholar as of 2017-02-15)
Abstract: RepeatsDB 2.0 (URL: http://repeatsdb.bio.unipd.it/) is an update of the database of annotated tandem repeat protein structures. Repeat proteins are a widespread class of non-globular proteins carrying heterogeneous functions involved in several diseases. Here we provide a new version of RepeatsDB with an improved classification schema including high quality annotations for ∼5400 protein structures. RepeatsDB 2.0 features information on start and end positions for the repeat regions and units for all entries. The extensive growth of repeat unit characterization was possible by applying the novel ReUPred annotation method over the entire Protein Data Bank, with data quality guaranteed by an extensive manual validation for >60% of the entries. The updated web interface includes a new search engine for complex queries and a fully re-designed entry page for a better overview of structural data. It is now possible to compare unit positions, together with secondary structure, fold information and Pfam domains. Moreover, a new classification level has been introduced on top of the existing scheme as an independent layer for sequence similarity relationships at 40%, 60% and 90% identity. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
- RepeatsDB: a database of tandem repeat protein structures. [PMID: 24311564]
Tomás Di Domenico, Emilio Potenza, Ian Walsh, R Gonzalo Parra, Manuel Giollo, Giovanni Minervini, Damiano Piovesan, Awais Ihsan, Carlo Ferrari, Andrey V Kajava, Silvio C E Tosatto
Nucleic acids research 2014:42(Database issue)
33 Citations (Google Scholar as of 2017-02-15)
Abstract: RepeatsDB (http://repeatsdb.bio.unipd.it/) is a database of annotated tandem repeat protein structures. Tandem repeats pose a difficult problem for the analysis of protein structures, as the underlying sequence can be highly degenerate. Several repeat types have been studied over the years, but their annotation was done on a case-by-case basis, thus making large-scale analysis difficult. We developed RepeatsDB to fill this gap. Using state-of-the-art repeat detection methods and manual curation, we systematically annotated the Protein Data Bank, predicting 10,745 repeat structures. In all, 2797 structures were classified according to a recently proposed classification schema, which was expanded to accommodate new findings. In addition, detailed annotations were performed in a subset of 321 proteins. These annotations feature information on start and end positions for the repeat regions and units. RepeatsDB is an ongoing effort to systematically classify and annotate structural protein repeats in a consistent way. It provides users with the possibility to access and download high-quality datasets either interactively or programmatically through web services.