SciCrunch Registry is a curated repository of scientific resources, with a focus on biomedical resources, including tools, databases, and core facilities - visit SciCrunch to register your resource.
A large collection of tools for basic and advanced analyses of nucleotide and protein sequences. The tools are wrapped in a common user interface that makes handling, storing, retrieving, and viewing results easy and logical. A SEQtools project can accommodate many thousands of sequences, and the robust search engine included in SEQtools makes unattended batch analyses, such as database searches at NCBI, painless.
Proper citation: SEQtools (RRID:SCR_008579)
https://gemini.readthedocs.io/en/latest/
Framework for exploring genetic variation in the context of the genome annotations available for the human genome. Users can load a VCF file into a database, and each variant is automatically annotated by comparing it to several genome annotations from sources such as ENCODE tracks, UCSC tracks, OMIM, dbSNP, KEGG, and HPRD.
Proper citation: GEMINI (RRID:SCR_014819)
A non profit organization dedicated to providing support for patients and families with Alzheimer's disease, to educating the public about the disease, to funding a wide range of Alzheimer's disease related research and to finding ways to treat and eventually to prevent Alzheimer's disease. Resources include: the Alzheimer's Association Green-Field Library, a research grants program, and the Journal of the Alzheimer's Association.
Proper citation: Alzheimers Association (RRID:SCR_007398)
Curated collection of human metabolite and human metabolism data which contains records for endogenous metabolites, with each metabolite entry containing detailed chemical, physical, biochemical, concentration, and disease information. This is further supplemented with thousands of NMR and MS spectra collected on purified reference metabolites.
Proper citation: HMDB (RRID:SCR_007712)
A database which supports high-throughput NMR and MS approaches to the identification and quantification of metabolites present in biological samples. MMCD serves as a hub for information on small molecules of biological interest gathered from electronic databases and the scientific literature. Each metabolite entry in the MMCD is supported by information in separate data fields, which provide the chemical formula, names and synonyms, structure, physical and chemical properties, NMR and MS data on pure compounds under defined conditions where available, NMR chemical shifts determined by empirical and/or theoretical approaches, calculated isotopomer masses, information on the presence of the metabolite in different biological species, and links to images, references, and other public databases. The MMCD search engine supports versatile data mining and allows users to make individual or bulk queries on the basis of experimental NMR and/or MS data plus other criteria.
Proper citation: Madison Metabolomics Consortium Database (RRID:SCR_007803)
https://bioinf.eva.mpg.de/patman/
Software that searches for short patterns in large DNA databases, allowing for approximate matches. THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.
Proper citation: PatMaN (RRID:SCR_011821)
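To illustrate what searching for short patterns with approximate matches means, here is a minimal sketch in Python. It is a naive sliding-window scan that counts substitutions only, and it is not PatMaN's own algorithm. The function name `find_approximate` and the `max_mismatches` parameter are hypothetical names chosen for this example.

```python
def find_approximate(pattern: str, text: str, max_mismatches: int = 1):
    """Return (start, mismatches) for every window of `text` that matches
    `pattern` with at most `max_mismatches` substitutions.

    Illustrative only: a brute-force Hamming-distance scan, not the
    indexing strategy used by any real tool.
    """
    hits = []
    m = len(pattern)
    for start in range(len(text) - m + 1):
        mismatches = 0
        for a, b in zip(pattern, text[start:start + m]):
            if a != b:
                mismatches += 1
                if mismatches > max_mismatches:
                    break  # budget exceeded, abandon this window
        else:
            # Loop finished without exceeding the budget: record the hit.
            hits.append((start, mismatches))
    return hits


if __name__ == "__main__":
    # One exact occurrence and one occurrence with a single substitution.
    print(find_approximate("ACGT", "TTACGTTTACCTAA", max_mismatches=1))
```

A scan like this takes time proportional to the pattern length times the database length, which is why tools aimed at large DNA databases generally rely on more elaborate data structures.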
Software for searching DNA sequence databases for RNA structure and sequence similarities.
Proper citation: Infernal (RRID:SCR_011809)
http://www.mycancergenome.org/
A freely available online personalized cancer medicine knowledge resource for physicians, patients, caregivers and researchers that gives up-to-date information on what mutations make cancers grow and related therapeutic implications, including available clinical trials. It is a one-stop tool that matches tumor mutations to therapies, making information accessible and convenient for busy clinicians.
Proper citation: My Cancer Genome (RRID:SCR_004140)
An integrated resource to analyze signaling pathway cross-talks, transcription factors, miRNAs and regulatory enzymes. The multi-layered database structure is made up of signaling pathways, their pathway regulators (e.g., scaffold and endocytotic proteins) and modifier enzymes (e.g., phosphatases, ubiquitin ligases), as well as transcriptional and post-transcriptional regulators of all of these components. The website allows the interactive exploration of how each signaling protein is regulated. Features: experimental data not only from humans but also from two invertebrate model organisms, C. elegans and D. melanogaster; a combination of manual curation and large-scale datasets; confidence scores for each interaction; and a customizable download page with multiple file formats (e.g., BioPAX, Cytoscape, SBML).
Proper citation: SignaLink (RRID:SCR_003569)
http://tools.niehs.nih.gov/polg/
Database that lists all known mutations in the coding region of the POLG gene and describes the associated disease. Human DNA polymerase gamma is composed of two subunits: a 140 kDa catalytic subunit encoded by the POLG gene on chromosome 15q25, and a 55 kDa accessory subunit encoded by the POLG2 gene on chromosome 17q23-24. A number of mutations have been mapped to POLG, the gene for the catalytic subunit of DNA polymerase gamma, and found to be associated with mitochondrial diseases. The nucleotide changes are numbered from the initiation methionine codon and are based on the cDNA (accession U60325.1) and gene sequence (accession AF497906.1).
Proper citation: Human DNA Polymerase Gamma Mutation Database (RRID:SCR_004722)
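The numbering scheme described above can be turned into a small piece of arithmetic. The sketch below assumes, as in the common cDNA convention, that position 1 is the first nucleotide of the initiation codon. The database entry does not spell this out, so treat it as an assumption. The function name `codon_for_cdna_position` is hypothetical.

```python
def codon_for_cdna_position(position: int) -> tuple[int, int]:
    """Map a 1-based coding-sequence position to (codon number, base within codon).

    Assumption: position 1 is the first base of the initiation codon,
    so positions 1-3 form codon 1, positions 4-6 form codon 2, and so on.
    """
    if position < 1:
        raise ValueError("position must be 1 or greater")
    codon_number = (position - 1) // 3 + 1
    base_in_codon = (position - 1) % 3 + 1
    return codon_number, base_in_codon


if __name__ == "__main__":
    for pos in (1, 3, 4, 1399):
        print(pos, codon_for_cdna_position(pos))
```

Under this assumption a change at coding position 1399 falls on the first base of codon 467.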
A database of three-dimensional protein models calculated by comparative modeling. ModBase is organized into datasets, which are available either to the public, to the academic community, or to specific users. 20 unique amidohydrolase and 41 unique enolase structures have been determined and included in the database.
Proper citation: ModBase (RRID:SCR_004642)
https://www.stanleygenomics.org/
The Stanley Online Genomics Database uses samples from the Stanley Medical Research Institute (SMRI) Brain Bank. These samples were processed and run on gene expression arrays by a variety of researchers in collaboration with the SMRI. These researchers have performed analyses on their respective studies using a range of analytic approaches. All of the genomic data have been aggregated in this online database, and a consistent set of analyses has been applied to each study. Additionally, a comprehensive set of cross-study analyses has been performed. A thorough collection of gene expression summaries is provided, inclusive of patient demographics, disease subclasses, regulated biological pathways, and functional classifications. Raw data are also available to download. The database is derived from two sets of brain samples, the Stanley Array collection and the Stanley Consortium collection. The Stanley Array collection contains 105 patients, and the Stanley Consortium collection contains 60 patients. Multiple genomic studies have been conducted using these brain samples. From these studies, twelve were selected for inclusion in the database on the basis of number of patients studied, genomic platform used, and data quality. The Consortium collection studies have fewer patients but more diversity in brain regions and array platforms, while the Array collection studies are more homogeneous. There are tradeoffs: the Consortium results will be more variable, but findings may be more broadly representative.
The collections contain brain samples from subjects in four main groups: Bipolar, Schizophrenia, Depression, and Controls. Brain regions used in the studies include: Brodmann Area 6, Brodmann Area 8/9, Brodmann Area 10, Brodmann Area 46, and Cerebellum. The 12 studies encompass a range of microarray platforms: Affymetrix HG-U95Av2, Affymetrix HG-U133A, Affymetrix HG-U133 2.0+, Codelink Human 20K, Agilent Human I, and Custom cDNA. Publications based on any of the clinical or genomic data should credit the Stanley Medical Research Institute, as well as any individual SMRI collaborators whose data are being used. Publications which make use of analytic results/methods in the database should additionally cite Dr. Michael Elashoff. Registration is required to access the data.
Proper citation: Stanley Medical Research Institute Online Genomics Database (RRID:SCR_004859)
A comprehensive collection of human transcription factor binding site models. DNA sequences of TF binding regions obtained by both pregenomic and high-throughput methods were collected from existing databases and other public data. The ChIPMunk software was used to construct positional weight matrices. Four motif discovery strategies were tested based on different motif shape priors, including flat and periodic priors associated with DNA helix pitch. A quality rating was manually assigned to each model based on known binding preferences. An appropriate TFBS model was selected for each TF, with similar models selected for related TFs. In any case, only one model per TF was selected unless there was additional evidence for two distinct binding models or different stable modes of dimerization. All TFBS models and initial binding segments data used for motif discovery were mapped to UniProt IDs.
Proper citation: HOCOMOCO (RRID:SCR_005409)
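For readers unfamiliar with positional weight matrices, the sketch below shows one common way to build a log-odds matrix from already aligned binding sites and to score a candidate sequence against it. It is not ChIPMunk's motif discovery procedure, which also has to find the alignment. The function names, the pseudocount of 1.0, and the uniform background of 0.25 are assumptions made for this example.

```python
import math

ALPHABET = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds positional weight matrix from aligned sites.

    Assumptions for illustration: all sites have equal length, contain only
    A/C/G/T, and the background frequency is uniform.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(ALPHABET)
    pwm = []
    for i in range(length):
        column = [site[i] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in ALPHABET
        })
    return pwm


def score(pwm, sequence):
    """Sum the per-position weights for a sequence of the same length as the matrix."""
    if len(sequence) != len(pwm):
        raise ValueError("sequence length must equal matrix length")
    return sum(column[base] for column, base in zip(pwm, sequence))


if __name__ == "__main__":
    # Toy aligned sites, invented for this example.
    toy_sites = ["ACGT", "ACGT", "ACGA", "ACGT"]
    matrix = build_pwm(toy_sites)
    print(score(matrix, "ACGT"), score(matrix, "TTTT"))
```

A sequence that resembles the aligned sites scores higher than one that does not, which is the property used when scanning for candidate binding sites.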
http://deepbase.sysu.edu.cn/chipbase/
A database for decoding transcription factor binding maps, expression profiles and transcriptional regulation of long non-coding RNAs (lncRNAs, lincRNAs), microRNAs, other ncRNAs (snoRNAs, tRNAs, snRNAs, etc.) and protein-coding genes from ChIP-Seq data. ChIPBase currently includes millions of transcription factor binding sites (TFBSs) among 6 species. ChIPBase provides several web-based tools and browsers to explore TF-lncRNA, TF-miRNA, TF-mRNA, TF-ncRNA and TF-miRNA-mRNA regulatory networks. THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.
Proper citation: ChIPBase (RRID:SCR_005404)
Database of known and predicted protein interactions. The interactions include direct (physical) and indirect (functional) associations and are derived from four sources: Genomic Context, High-throughput experiments, (Conserved) Coexpression, and previous knowledge. STRING quantitatively integrates interaction data from these sources for a large number of organisms, and transfers information between these organisms where applicable. The database currently covers 5,214,234 proteins from 1133 organisms. (2013)
Proper citation: STRING (RRID:SCR_005223)
http://www.membranetransport.org
TransportDB is a relational database describing the predicted cytoplasmic membrane transport protein complement for organisms whose complete genome sequences are available. For each organism, the complete membrane transport complement was identified and classified into protein families according to the TC classification system, and functional predictions are provided. For each organism, a summary page gives an overview of the whole transporter system, including transporter types and individual transporter families. For individual transporter types, a detailed list of transporters with their possible substrates is shown, with links to individual protein pages that contain protein sequence and annotation information. You can also compare the transporter systems of two or more organisms. A search engine supports searches of the transporter database by transporter type, family, individual protein, and substrate. You can also BLAST your protein sequence against the transporter database. With the rapid development of genomic sequencing both in TIGR and in other institutes, more and more genomes are available for the analysis of their transporter systems. We will keep updating this site with newly published genomes. If you have any suggestions, corrections, or comments on our site, please contact us. We are currently working on providing additional functionality for this database.
Proper citation: TransportDB (RRID:SCR_005643)
Collects mammalian cis- and trans-regulatory elements together with experimental evidence. Regulatory elements were mapped onto assembled genomes. Resource for gene regulation and function studies. Users can retrieve primers, search TF target genes, retrieve TF motifs, search Gene Regulatory Networks and orthologs, and make use of sequence analysis tools. Uses databases such as GenBank, EPD and DBTSS, and employs the promoter-finding program FirstEF combined with mRNA/EST information and cross-species comparisons. Manually curated.
Proper citation: Transcriptional Regulatory Element Database (RRID:SCR_005661)
A curated repository of more than 206,000 regulatory associations between transcription factors (TF) and target genes in Saccharomyces cerevisiae, based on more than 1300 bibliographic references. It also includes the description of 326 specific DNA binding sites shared among 113 characterized TFs. Further information about each yeast gene has been extracted from the Saccharomyces Genome Database (SGD). For each gene, the associated Gene Ontology (GO) terms and their hierarchy in GO were obtained from the GO consortium. Currently, YEASTRACT maintains a total of 7130 terms from GO. The nucleotide sequences of the promoter and coding regions for yeast genes were obtained from Regulatory Sequence Analysis Tools (RSAT). All the information in YEASTRACT is updated regularly to match the latest data from SGD, GO consortium, RSA Tools and recent literature on yeast regulatory networks. YEASTRACT includes DISCOVERER, a set of tools that can be used to identify complex motifs found to be over-represented in the promoter regions of co-regulated genes. DISCOVERER is based on the MUSA algorithm. The tools take as input a list of genes and identify over-represented motifs, which can then be compared with transcription factor binding sites described in the YEASTRACT database.
Proper citation: Yeast Search for Transcriptional Regulators And Consensus Tracking (RRID:SCR_006076)
http://operons.ibt.unam.mx/OperonPredictor/
The Prokaryotic Operon DataBase (ProOpDB) is one of the most precise and complete repositories of operon predictions currently available. Using our novel and highly accurate operon algorithm, we have predicted the operon structures of more than 1,200 prokaryotic genomes. ProOpDB offers several ways to retrieve a set of operon predictions, including: i) organism name, ii) metabolic pathways, as defined by the KEGG database, iii) gene orthology, as defined by the COG database, iv) conserved protein motifs, as defined by the Pfam database, v) reference gene, vi) reference operon, among others. In order to limit the operon output to non-redundant organisms, ProOpDB offers an efficient protocol to select the most representative organisms based on a precompiled matrix of phylogenetic distances. In addition, the ProOpDB operon predictions are used directly as the input data of our Gene Context Tool (GeConT) to visualize their genomic context and retrieve the sequence of their corresponding 5' regulatory regions, as well as the nucleotide or amino acid sequences of their genes. The prediction algorithm is a multilayer perceptron (MLP) neural network classifier that takes as input the intergenic distances of contiguous genes and the functional relationship scores of the STRING database between the different groups of orthologous proteins, as defined in the COG database. Nevertheless, the operon prediction of our method is not restricted to only those genes with a COG assignment, since we successfully defined new groups of orthologous genes and obtained, by extrapolation, a set of equivalent STRING-like scores based on conserved gene pairs in different genomes.
Since the STRING functional relationship scores are determined in an unbiased manner and efficiently integrate a large amount of information from different sources and kinds of evidence, the predictions made by our MLP are considerably less influenced by the bias imposed by training on one specific organism.
Proper citation: ProOpDB (RRID:SCR_006111)
An integrated database of human maladies and their annotations, modeled on the architecture and richness of the popular GeneCards database of human genes. The database contains 17,705 diseases, consolidated from 28 sources.
Proper citation: MalaCards (RRID:SCR_005817)