- DBTSS as an integrative platform for transcriptome, epigenome and genome sequence variation data. [PMID: 25378318]
Ayako Suzuki, Hiroyuki Wakaguri, Riu Yamashita, Shin Kawano, Katsuya Tsuchihara, Sumio Sugano, Yutaka Suzuki, Kenta Nakai
Nucleic acids research 2015:43(Database issue)
4 Citations (Google Scholar as of 2015-12-25)
Abstract: DBTSS (http://dbtss.hgc.jp/) was originally constructed as a collection of uniquely determined transcriptional start sites (TSSs) in humans and some other species in 2002. Since then, it has been regularly updated and in recent updates epigenetic information has also been incorporated because such information is useful for characterizing the biological relevance of these TSSs/downstream genes. In the newest release, Release 9, we further integrated public and original single nucleotide variation (SNV) data into our database. For our original data, we generated SNV data from genomic analyses of various cancer types, including 97 lung adenocarcinomas and 57 lung small cell carcinomas from Japanese patients as well as 26 cell lines of lung cancer origin. In addition, we obtained publicly available SNV data from other cancer types and germline variations in a total of 11,322 individuals. With these updates, users can examine the association between sequence variation patterns in clinical lung cancers and the corresponding TSS-seq, RNA-seq, ChIP-seq and BS-seq data. Consequently, DBTSS is no longer a mere storage site for TSS information but has evolved into an integrative platform of a variety of genome activity data.
- DBTSS: DataBase of Transcriptional Start Sites progress report in 2012. [PMID: 22086958]
Riu Yamashita, Sumio Sugano, Yutaka Suzuki, Kenta Nakai
Nucleic acids research 2012:40(Database issue)
33 Citations (Google Scholar as of 2016-03-28)
Abstract: To support transcriptional regulation studies, we have constructed DBTSS (DataBase of Transcriptional Start Sites), which contains exact positions of transcriptional start sites (TSSs), determined with our own technique named TSS-seq, in the genomes of various species. In its latest version, DBTSS covers the data of the majority of human adult and embryonic tissues: it now contains 418 million TSS tag sequences from 28 tissues/cell cultures. Moreover, we integrated a series of our own transcriptomic data, such as the RNA-seq data of subcellular-fractionated RNAs as well as the ChIP-seq data of histone modifications and the binding of RNA polymerase II/several transcription factors in cultured cell lines into our original TSS information. We also included several external epigenomic data, such as the chromatin map of the ENCODE project. We further associated our TSS information with public or original single-nucleotide variation (SNV) data, in order to identify SNVs in the regulatory regions. These data can be browsed in our new viewer, which supports versatile search conditions of users. We believe that our new DBTSS will be an invaluable resource for interpreting the differential uses of TSSs and for identifying human genetic variations that are associated with disordered transcriptional regulation. DBTSS can be accessed at http://dbtss.hgc.jp.
- DBTSS provides a tissue specific dynamic view of Transcription Start Sites. [PMID: 19910371]
Riu Yamashita, Hiroyuki Wakaguri, Sumio Sugano, Yutaka Suzuki, Kenta Nakai
Nucleic acids research 2010:38(Database issue)
39 Citations (Google Scholar as of 2016-03-04)
Abstract: DataBase of Transcription Start Sites (DBTSS) is a database which contains precise positional information for transcription start sites (TSSs) of eukaryotic mRNAs. In this update, we included 330 million new tags generated by massively sequencing the 5'-end of oligo-cap selected cDNAs in humans and mice. The tags were collected from normal fetal or adult human tissues, including brain, thymus, liver, kidney and heart, from 6 human cell lines in 21 diverse growth conditions as well as from mouse NIH3T3 cell line: altogether 31 different cell types or culture conditions are represented. This unprecedented increase in depth of data now allows DBTSS to faithfully represent the dynamically changing landscape of TSSs in different cell types and conditions, during development and in the course of evolution. Differential usage of alternative 5'-ends across cell types and conditions can be viewed in a series of new interfaces. Promoter sequence information is now displayed in a comparative genomics viewer where evolutionary turnover of the TSSs can be evaluated. DBTSS can be accessed at http://dbtss.hgc.jp/.
- DBTSS: database of transcription start sites, progress report 2008. [PMID: 17942421]
Hiroyuki Wakaguri, Riu Yamashita, Yutaka Suzuki, Sumio Sugano, Kenta Nakai
Nucleic acids research 2008:36(Database issue)
148 Citations (Google Scholar as of 2016-03-03)
Abstract: DBTSS is a database of transcriptional start sites, based on our unique collection of precise, experimentally determined 5'-end sequences of full-length cDNAs. Since its first release in 2002, several major updates have been made. In this update, we expanded the human transcriptional start site dataset by 19 million uniquely mapped, and RefSeq-associated, 5'-end sequences, which were generated by a newly introduced Solexa sequencer. Moreover, in order to provide means for interpreting those massive TSS data, we implemented two new analytical tools: one for connecting expression information with predicted transcription factor binding sites; the other for examining evolutionary conservation or species-specificity of promoters and transcripts, which can be browsed by our own comparative genome viewer. With the expanded dataset and the enhanced functionalities, DBTSS provides a unique platform that enables in-depth transcriptome analyses. DBTSS is accessible at http://dbtss.hgc.jp/.
- DBTSS: DataBase of Human Transcription Start Sites, progress report 2006. [PMID: 16381981]
Riu Yamashita, Yutaka Suzuki, Hiroyuki Wakaguri, Katsuki Tsuritani, Kenta Nakai, Sumio Sugano
Nucleic acids research 2006:34(Database issue)
114 Citations (Google Scholar as of 2016-03-03)
Abstract: DBTSS was first constructed in 2002 based on precise, experimentally determined 5' end clones. Several major updates and additions have been made since the last report. First, the number of human clones has drastically increased, going from 190,964 to 1,359,000. Second, information about potential alternative promoters is presented because the number of 5' end clones is now sufficient to determine several promoters for one gene. Namely, we defined putative promoter groups by clustering transcription start sites (TSSs) separated by <500 bases. A total of 8308 human genes and 4276 mouse genes were found to have putative multiple promoters. Third, DBTSS provides detailed sequence comparisons of user-specified TSSs. Finally, we have added TSS information for zebrafish, malaria and schyzon (a red algae model organism). DBTSS is accessible at http://dbtss.hgc.jp.
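The promoter-grouping rule described in the abstract above (TSSs separated by <500 bases fall into the same putative promoter group) can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so this assumes single-linkage clustering of sorted positions for one gene on one strand; the function name `cluster_tss`, the parameter `max_gap`, and the coordinates are hypothetical and are not taken from DBTSS.

```python
from typing import List


def cluster_tss(positions: List[int], max_gap: int = 500) -> List[List[int]]:
    """Group TSS positions so that neighbours closer than max_gap share a group.

    Assumption: single-linkage clustering on sorted coordinates; the
    abstract only states that TSSs separated by <500 bases are clustered.
    """
    groups: List[List[int]] = []
    for pos in sorted(positions):
        # Extend the current group if the gap to its last member is below the threshold
        if groups and pos - groups[-1][-1] < max_gap:
            groups[-1].append(pos)
        else:
            # Otherwise start a new putative promoter group
            groups.append([pos])
    return groups


# Hypothetical TSS coordinates for a single gene (illustrative values only)
example_tss = [1000, 1120, 1490, 5200, 5300, 9000]
promoter_groups = cluster_tss(example_tss)

# A gene with more than one group would count as having putative multiple promoters
has_multiple_promoters = len(promoter_groups) > 1
print(promoter_groups, has_multiple_promoters)
```

Under this sketch, a gene is flagged as having putative multiple promoters when its TSSs fall into more than one group; the real database may additionally apply clone-count or other filters not stated in the abstract.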
- DBTSS, DataBase of Transcriptional Start Sites: progress report 2004. [PMID: 14681363]
Yutaka Suzuki, Riu Yamashita, Sumio Sugano, Kenta Nakai
Nucleic acids research 2004:32(Database issue)
169 Citations (Google Scholar as of 2016-03-03)
Abstract: DBTSS (http://dbtss.hgc.jp) was originally constructed based on a collection of experimentally determined TSSs of human genes. Since its first release in 2002, it has been updated several times. First, the amount of stored data has increased significantly: e.g. the number of clones that match both the RefSeq mRNA set and the genome sequence has increased from 111,382 to 190,964, now covering 11,234 genes. Second, the positions of SNPs in dbSNP were displayed on the upstream regions of contained human genes. Third, DBTSS now covers other species such as mouse and the human malaria parasite. It will become a central database containing data for many more species with oligo-capping and related methods. Lastly, the database now serves for comparative promoter analyses: in the current version, comparative views of potentially orthologous promoters from human and mouse are presented with an additional function of searching potential transcription-factor binding sites, which are either conserved or diverged between species.
- DBTSS: DataBase of human Transcriptional Start Sites and full-length cDNAs. [PMID: 11752328]
Yutaka Suzuki, Riu Yamashita, Kenta Nakai, Sumio Sugano
Nucleic acids research 2002:30(1)
260 Citations (Google Scholar as of 2016-03-03)
Abstract: Although the information of cDNAs is indispensable for analyzing gene function, most of the cDNA sequences stored in current databases are imperfect in the sense that they lack the precise information of 5' end termini. To overcome this difficulty, we have developed the oligo-capping method to obtain full-length cDNAs, the information of which has been partly deposited in public databases. In this study, we further constructed human cDNA libraries enriched in clones containing the cap structure to systematically explore the 5' end structure of expressed genes. Of approximately 217 402 5' end sequences obtained, 111 382 have been matched to cDNA sequences of known genes (7889 genes) and are presented in our new database, DataBase of Transcriptional Start Sites (DBTSS; http://elmo.ims.u-tokyo.ac.jp/dbtss/). Sequence comparison between our entries and those of a reference sequence database, RefSeq, revealed that 4683 (34%) of RefSeq sequences should be extended towards the 5' ends. We also mapped each sequence on the human draft genome sequence to identify its transcriptional start site, which provides us with more detailed information on distribution patterns of transcriptional start sites and adjacent regulatory regions.