SciCrunch Registry is a curated repository of scientific resources, with a focus on biomedical resources, including tools, databases, and core facilities - visit SciCrunch to register your resource.
A curated repository of more than 206000 regulatory associations between transcription factors (TF) and target genes in Saccharomyces cerevisiae, based on more than 1300 bibliographic references. It also includes the description of 326 specific DNA binding sites shared among 113 characterized TFs. Further information about each yeast gene has been extracted from the Saccharomyces Genome Database (SGD). For each gene, the associated Gene Ontology (GO) terms and their hierarchy in GO were obtained from the GO consortium. Currently, YEASTRACT maintains a total of 7130 terms from GO. The nucleotide sequences of the promoter and coding regions for yeast genes were obtained from Regulatory Sequence Analysis Tools (RSAT). All the information in YEASTRACT is updated regularly to match the latest data from SGD, the GO consortium, RSAT and recent literature on yeast regulatory networks. YEASTRACT includes DISCOVERER, a set of tools that can be used to identify complex motifs found to be over-represented in the promoter regions of co-regulated genes. DISCOVERER is based on the MUSA algorithm. The tools take as input a list of genes and identify over-represented motifs, which can then be compared with transcription factor binding sites described in the YEASTRACT database.
Proper citation: Yeast Search for Transcriptional Regulators And Consensus Tracking (RRID:SCR_006076)
http://prism.ccbb.ku.edu.tr/hotregion/index.php
Hot spots are energetically important residues at protein interfaces and they are not randomly distributed across the interface but rather clustered. These clustered hot spots form hot regions. Hot regions are important for the stability of protein complexes, as well as providing specificity to binding sites. HotRegion provides the hot region information of the interfaces by using predicted hot spot residues, and structural properties of these interface residues such as pair potentials of interface residues, accessible surface area (ASA) and relative ASA values of interface residues of both monomer and complex forms of proteins. Also, the 3D visualization of the interface and interactions among hot spot residues are provided. The number of interfaces in the database is 147909 and still growing.
Proper citation: HotRegion - A Database of Cooperative Hotspots (RRID:SCR_006022)
GlycomeDB is a database of all known carbohydrate structures. This was achieved by crosslinking several other databases of carbohydrate structures using the GlycoCT XML language specification. We have analyzed all of the existing public databases and defined a sequence format based on XML (GlycoCT) capable of storing all structural information of carbohydrate sequences. We have implemented a library of parsers for the interpretation of the different encoding schemes for carbohydrates. With this library we have translated the carbohydrate sequences of all freely available databases (CFG, KEGG, GLYCOSCIENCES.de, BCSDB and Carbbank) to GlycoCT, and created a new database (GlycomeDB) containing all structures and annotations. During the process of data integration we found multiple inconsistencies in the existing databases, which were corrected in collaboration with the responsible curators. With the new database, GlycomeDB, it is possible to get an overview of all carbohydrate structures in the different databases and to crosslink common structures across them. Scientists are now able to search for a particular structure in the meta database and get information about the occurrence of this structure in the five carbohydrate structure databases.
Proper citation: glycomedb (RRID:SCR_005717)
http://tropgenedb.cirad.fr/tropgene/JSP/index.jsp
A database that manages genetic and genomic information about tropical crops studied by Cirad. The database is organised into crop-specific modules. Each module includes data on genetic resources (agro-morphological data, parentages, allelic diversity), information on molecular markers, genetic maps, results of QTL analyses, data from physical mapping, sequences, genes, as well as corresponding references. The TropGENE DB interface has been designed to allow quick consultations as well as complex queries. Nine modules are currently online.
Proper citation: TropGENE DB (RRID:SCR_005716)
DOMMINO is a comprehensive structural database on macromolecular interactions. As of June 2011, it contains more than 407,000 binary interactions. The distinctive features of DOMMINO are:
1. Automated updates: DOMMINO is fully automated and is designed to update itself on a weekly basis, one day after a PDB weekly update. Thus, the community will be able to study macromolecular interactions almost immediately after they are released by PDB.
2. Coverage of non-domain mediated interactions: In addition to domain-domain and domain-peptide interactions, the database characterizes the interactions between domains and unstructured protein regions that are not parts of a domain, such as inter-domain linkers and N- and C-termini. The interactions that involve these unstructured parts of proteins have been included in the database for the first time, providing an additional ~186,000 interactions (~45% of the total number of interactions, as of June 2011).
3. Coverage of new structural domains: DOMMINO employs one of the most accurate structural classifications of proteins, SCOP. In addition to the existing SCOP-annotated domains, we employ a state-of-the-art machine learning approach to classify newer protein structures into existing SCOP families. With the progress of structural genomics, we do not expect a significant growth of the number of structurally novel folds or protein families, and therefore our method allows covering almost all new protein structures. In total, using this predictive approach has allowed us to add more than 261,000 new interactions, almost twice as many as existing SCOP-annotated interactions.
4. Web interface: The web interface is designed to give the user a flexible search as well as the capability to study macromolecular interactions in a PDB structure at the interaction network level and at the individual interface level. It includes a comprehensive list of help topics linked to the specific actions. In addition, we have designed a step-by-step tutorial that covers all aspects of working with the data from DOMMINO using the web interface.
Proper citation: DOMMINO - Database Of MacroMolecular INteractiOns (RRID:SCR_005958)
http://www.jcvi.org/charprotdb/index.cgi/home
The Characterized Protein Database, CharProtDB, is designed and being developed as a resource of expertly curated, experimentally characterized proteins described in published literature. For each protein record in CharProtDB, storage of several data types is supported. These include functional annotation (several instances of protein names and gene symbols), taxonomic classification, literature links, specific Gene Ontology (GO) terms and GO evidence codes, EC (Enzyme Commission) and TC (Transport Classification) numbers and protein sequence. Additionally, each protein record is associated with cross links to all public accessions in major protein databases as "synonymous accessions". Each of the above data types can be linked to as many literature references as possible. Every CharProtDB entry requires a minimum set of data types: protein name, GO terms and supporting reference(s) associated with GO evidence codes. Annotating using the GO system is important for several reasons: the GO system captures defined concepts (the GO terms) with unique ids, which can be attached to specific genes, and the three controlled vocabularies of the GO allow for the capture of much more annotation information than is traditionally captured in protein common names, including, for example, not just the function of the protein, but its location as well. GO evidence codes implemented in CharProtDB directly correlate with the GO consortium definitions of experimental codes. CharProtDB tools link characterization data from multiple input streams through synonymous accessions or direct sequence identity. CharProtDB can represent multiple characterizations of the same protein, with proper attribution and links to database sources. Users can use a variety of search terms including protein name, gene symbol, EC number, organism name, accessions or any text to search the database. Following the search, a display page lists all the proteins that match the search term. Click on the protein name to view more detailed annotated information for each protein. Additionally, each protein record can be annotated.
Proper citation: CharProtDB: Characterized Protein Database (RRID:SCR_005872)
http://pbildb1.univ-lyon1.fr/virhostnet/
Public knowledge base specialized in the management and analysis of integrated virus-virus, virus-host and host-host interaction networks coupled to their functional annotations. It contains high quality and up-to-date information gathered and curated from public databases (VirusMINT, IntAct, HIV-1 database). It allows users to search by host gene, host/viral protein, Gene Ontology function, KEGG pathway, InterPro domain, and publication information. It also allows users to browse viral taxonomy.
Proper citation: VirHostNet: Virus-Host Network (RRID:SCR_005978)
THIS RESOURCE IS NO LONGER IN SERVICE, documented August 22, 2016. Database for corrected read counts and genome mapping on NCBI's Short Read Archive. The count correction was done using RECOUNT and the mapping with LAST. We also provide information on the reference genome to which we aligned the short reads. We focus on transcriptomic data, specifically TSS-Seq and RNA-Seq, because this is the type of data for which sequence count correction is most important. Hence we do not include genomic reads. The current version contains 2,265 entries from 45 organisms, with read lengths from 17 to 100 bp. Via a searchable and browseable interface users can obtain corrected data in formats useful for transcriptomic analysis. We provide the data grouped according to genome, type of study and submitter in TAB, PSL and BAM format. The files contain the mapping position and annotation of reads, with observed and corrected counts.
Proper citation: RecountDB (RRID:SCR_006117)
http://prorepeat.bioinformatics.nl/
ProRepeat is an integrated curated repository and analysis platform for in-depth research on the biological characteristics of amino acid tandem repeats. ProRepeat collects repeats from all proteins included in the UniProt knowledgebase, together with 85 completely sequenced eukaryotic proteomes contained within the RefSeq collection. It contains non-redundant perfect tandem repeats, approximate tandem repeats and simple, low-complexity sequences, covering the majority of the amino acid tandem repeat patterns found in proteins. The ProRepeat web interface allows querying the repeat database using repeat characteristics such as repeat unit and length, number of repetitions of the repeat unit and position of the repeat in the protein. Users can also search for repeats by the characteristics of repeat-containing proteins, such as entry ID, protein description, sequence length, gene name and taxon. ProRepeat offers powerful analysis tools for finding biologically interesting properties of repeats, such as the strong position bias of leucine repeats in the N-terminus of eukaryotic protein sequences, the differences in repeat abundance among proteomes, the functional classification of repeat-containing proteins and GC content constraints on the codons corresponding to repeats.
Proper citation: ProRepeat (RRID:SCR_006113)
The database of protein-chemical structural interactions includes all existing 3D structures of complexes of proteins with low molecular weight ligands. When one considers the proteins and chemicals as vertices of a graph, all these interactions form a network. Biological networks are powerful tools for predicting undocumented relationships between molecules. The underlying principle is that existing interactions between molecules can be used to predict new interactions. For pairs of proteins sharing a common ligand, we use protein and chemical superimpositions combined with fast structural compatibility screens to predict whether additional compounds bound by one protein would bind the other. The current version includes data from the Protein Data Bank as of August 2011. The database is updated monthly.
Proper citation: ProtChemSI (RRID:SCR_006115)
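The network idea in the ProtChemSI description can be illustrated with a small Python sketch. It only enumerates candidate protein-ligand pairs for proteins that share a ligand; the superimposition and structural compatibility screens mentioned in the description are not implemented here. The protein and ligand identifiers and the function names are hypothetical and are not taken from the database.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (protein, ligand) complexes; invented for illustration only.
COMPLEXES = [
    ("protA", "lig1"), ("protA", "lig2"),
    ("protB", "lig1"), ("protB", "lig3"),
    ("protC", "lig4"),
]


def build_graph(complexes):
    """Return the bipartite network as a protein -> set-of-ligands mapping."""
    ligands_of = defaultdict(set)
    for protein, ligand in complexes:
        ligands_of[protein].add(ligand)
    return ligands_of


def candidate_interactions(ligands_of):
    """Propose new (protein, ligand) pairs for proteins sharing a ligand.

    For each protein pair with at least one common ligand, every ligand
    bound by only one of the two proteins is proposed for the other.
    A real pipeline would then filter these candidates structurally.
    """
    candidates = set()
    for p, q in combinations(sorted(ligands_of), 2):
        if ligands_of[p] & ligands_of[q]:
            candidates.update((q, lig) for lig in ligands_of[p] - ligands_of[q])
            candidates.update((p, lig) for lig in ligands_of[q] - ligands_of[p])
    return sorted(candidates)


graph = build_graph(COMPLEXES)
print(candidate_interactions(graph))
```

In this toy input, protA and protB share lig1, so each is proposed as a candidate binder of the other's remaining ligand, while protC shares no ligand and yields no candidates.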
High quality ribosomal RNA databases providing comprehensive, quality checked and regularly updated datasets of aligned small (16S/18S, SSU) and large subunit (23S/28S, LSU) ribosomal RNA (rRNA) sequences for all three domains of life (Bacteria, Archaea and Eukarya). Supplementary services include an rRNA gene aligner, online tools for probe and primer evaluation and optimized browsing, searching and downloading on the website. The extensively curated SILVA taxonomy and the new non-redundant SILVA datasets provide an ideal reference for high-throughput classification of data from next-generation sequencing approaches. The alignment tool, SINA, is available for download and for use online.
Proper citation: SILVA (RRID:SCR_006423)
ViralZone is a SIB Swiss Institute of Bioinformatics web resource for all viral genera and families, providing general molecular and epidemiological information, along with virion and genome figures. Each virus or family page gives easy access to UniProtKB/Swiss-Prot viral protein entries. The ViralZone project is handled by the virus program of the Swiss-Prot group. Protein popups were developed in collaboration with Prof. Christian von Mering and Andrea Franceschini, Bioinformatics Group, Institute of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland, funded in part by the SIB Swiss Institute of Bioinformatics. All pictures in ViralZone are copyright of the SIB Swiss Institute of Bioinformatics.
Proper citation: ViralZone (RRID:SCR_006563)
A database of electrophysiological properties text-mined from the biomedical literature as a function of neuron type. Specifically, NeuroElectro seeks to extract information about the electrophysiological properties (e.g. resting membrane potentials and membrane time constants) of diverse neuron types from the existing literature and place it into a centralized database. There are 252 neurons currently available, with the naming convention established in NeuroLex.
Proper citation: neuroelectro (RRID:SCR_006274)
Scansite searches for motifs within proteins that are likely to be phosphorylated by specific protein kinases or bind to domains such as SH2 domains, 14-3-3 domains or PDZ domains. The Motifscanner program utilizes an entropy approach that assesses the probability of a site matching the motif using the selectivity values and sums the logs of the probability values for each amino acid in the candidate sequence. The program then indicates the percentile ranking of the candidate motif with respect to all potential motifs in proteins of a protein database. When available, percentile scores of some confirmed phosphorylation sites for the kinase of interest or confirmed binding sites of the domain of interest are provided for comparison with the scores of the candidate motifs.
Proper citation: Scansite (RRID:SCR_007026)
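As a rough illustration of the scoring idea in the Scansite description (summing the logs of per-residue probabilities, then ranking the candidate against other sites), here is a minimal Python sketch. The motif probabilities, the floor value for unlisted residues, the background sequence and all function names are invented for illustration; they are not Scansite's actual matrices or its exact formula.

```python
import math
from typing import Iterator, Sequence

# Hypothetical position-specific probabilities for a 3-residue motif.
# These numbers are invented and do not come from Scansite.
MOTIF = [
    {"R": 0.6, "K": 0.3},   # position 0
    {"S": 0.7, "T": 0.2},   # position 1
    {"P": 0.8},             # position 2
]
DEFAULT_PROB = 0.01  # assumed floor for residues not listed at a position


def log_score(window: str) -> float:
    """Sum of log probabilities, one term per residue in the window."""
    return sum(
        math.log(position.get(residue, DEFAULT_PROB))
        for position, residue in zip(MOTIF, window)
    )


def windows(sequence: str, width: int = len(MOTIF)) -> Iterator[str]:
    """Yield every contiguous window of the motif width."""
    for start in range(len(sequence) - width + 1):
        yield sequence[start:start + width]


def percentile_rank(score: float, background_scores: Sequence[float]) -> float:
    """Percentage of background scores that are strictly higher.

    A lower value means the candidate ranks nearer the top.
    """
    better = sum(1 for s in background_scores if s > score)
    return 100.0 * better / len(background_scores)


# Score every window of a made-up background sequence, then rank one site.
background = [log_score(w) for w in windows("MKRSPLLKTPAARSAGG")]
candidate = log_score("RSP")
print(round(candidate, 3), percentile_rank(candidate, background))
```

Working in log space turns a product of per-position probabilities into a sum, which avoids numerical underflow for longer motifs. The percentile step here uses a single toy sequence as the background, whereas the description refers to all potential motifs in a protein database.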
http://www.iiserpune.ac.in/~coee/histome/
Database of human histone variants, sites of their post-translational modifications and various histone modifying enzymes. The database covers 5 types of histones, 8 types of their post-translational modifications and 13 classes of modifying enzymes. Many data fields are hyperlinked to other databases (e.g. UniProtKB/Swiss-Prot, HGNC, OMIM, UniGene etc.). The database also provides sequences of promoter regions (-700 TSS +300) for all gene entries. These sequences were extracted from the UCSC genome browser. Sites of post-translational modifications of histones were manually searched from PubMed-listed literature. The current version contains information for ~50 histone proteins and ~150 histone modifying enzymes. HIstome is a combined effort of researchers from two institutions, Advanced Center for Treatment, Research and Education in Cancer (ACTREC), Navi Mumbai and Center of Excellence in Epigenetics (CoEE), Indian Institute of Science Education and Research (IISER), Pune.
Proper citation: HIstome: The Histone Infobase (RRID:SCR_006972)
http://microkit.biocuckoo.org/
The MiCroKit database is the first integrative resource to pinpoint most of the identified components of the midbody, centrosome and kinetochore, together with related scientific information. We have collected all proteins identified as localized to the kinetochore, centrosome, and/or midbody from two fungi (S. cerevisiae and S. pombe) and five animals (C. elegans, D. melanogaster, X. laevis, M. musculus and H. sapiens). From the related literature in PubMed, numerous proteins have been manually curated as localized to at least one of these sub-cellular structures. To ensure data quality, following the rationale of "Seeing is believing" (Bloom K et al., 2005), these proteins have been unambiguously observed under a fluorescence microscope as direct supporting evidence. An integrated and searchable database, MiCroKit (Midbody, Centrosome and Kinetochore), has then been established. Version 1.0 of the MiCroKit database was set up on Nov. 2nd, 2005, containing 1,065 unique proteins. MiCroKit version 2.0 was released on Jun. 5th, 2006, with 1,120 entries. The current version, MiCroKit 3.0, was updated on July 9, 2009, and contains 1,489 unique protein entries. The online service of MiCroKit 3.0 was implemented in PHP + MySQL + JavaScript, and the local packages of MiCroKit 3.0 were developed in JAVA 1.5 (J2SE). The database will be updated routinely as new microkit proteins are reported.
Proper citation: Midbody, Centrosome and Kinetochore (RRID:SCR_007052)
BioCarta Pathways allows users to observe how genes interact in dynamic graphical models. Online maps available within this resource depict molecular relationships from areas of active research. In an open source approach, this community-fed forum constantly integrates emerging proteomic information from the scientific community. It also catalogs and summarizes important resources providing information for over 120,000 genes from multiple species. Find both classical pathways as well as current suggestions for new pathways.
Proper citation: BioCarta Pathways (RRID:SCR_006917)
http://www.imgt.org/IMGTindex/LIGM.html
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025. IMGT/LIGM-DB is a comprehensive database of immunoglobulin (IG) and T cell receptor (TR) nucleotide sequences from human and other vertebrate species (270). IMGT/LIGM-DB includes all germline (non-rearranged) and rearranged IG and TR genomic DNA (gDNA) and complementary DNA (cDNA) sequences published in generalist databases. IMGT/LIGM-DB allows searches from the Web interface according to biological and immunogenetic criteria through five distinct modules depending on the user's interest. Users can search the catalogue by accession number, mnemonic, definition, creation date, length, or annotation level. They also have the option to search through taxonomic classification, keywords, and annotated labels. For a given entry, nine types of display are available, including the IMGT flat file, the translation of the coding regions and the analysis by the IMGT/V-QUEST tool (see parent org. below). IMGT/LIGM-DB distributes expertly annotated sequences. The annotations greatly enhance the quality and accuracy of the distributed information. They include the sequence identification, the gene and allele classification, the constitutive and specific motif description, the codon and amino acid numbering, and information on how the sequence was obtained, according to the main concepts of IMGT-ONTOLOGY. They represent the main source of IG and TR gene and allele knowledge stored in IMGT/GENE-DB and in the IMGT reference directory.
Proper citation: IMGT/LIGM-DB (RRID:SCR_006931)
Database that collects all Arabidopsis transcription factors (1922 loci and 2290 gene models in total) and classifies them into 64 families. It uses not only the locus (gene) but also the gene model (transcript, protein), and detailed information is given per gene model rather than per locus. It adds a multiple alignment of the DNA-binding domain of each family, a Neighbor-Joining phylogenetic tree of each family, GO annotation, and homologs in the Database of Rice Transcription Factors (DRTF). It also keeps older information items such as the unique cloned and sequenced information for about 1200 transcription factors, protein domains, 3D structure information with BLAST hits against PDB, predicted Nuclear Location Signals, UniGene information, as well as links to literature references.
Proper citation: Database of Arabidopsis Transcription Factors (RRID:SCR_007101)
http://www.polygenicpathways.co.uk
Database of disease genes and risk factors and of host/pathogen interactomes. Lists genes, pathways and environmental risk factors positively associated with diseases and conditions such as Alzheimer's disease, schizophrenia, multiple sclerosis, childhood obesity, anorexia nervosa, HIV-1/AIDS, and Helicobacter pylori infection. Details of polymorphisms as well as negative/positive association data can be found via Useful links. Throughout the site are links to Entrez Gene and PubMed.
Proper citation: Polygenic Pathways (RRID:SCR_006962)