SciCrunch Registry is a curated repository of scientific resources, with a focus on biomedical resources, including tools, databases, and core facilities - visit SciCrunch to register your resource.
Tool used to design PCR primers from DNA sequence, often in high-throughput genomics applications. It supports mispriming libraries, sequence quality data, and the generation of internal oligos.
Proper citation: Primer3 (RRID:SCR_003139)
Database that enables integration of genomic and phenomic data by providing access to primary experimental data, data collection protocols and analysis tools. Data represent behavioral, morphological and physiological disease-related characteristics in naive mice and those exposed to drugs, environmental agents or other treatments. Collaborative standardized collection of measured data on laboratory mouse strains to characterize them in order to facilitate translational discoveries and to assist in selection of strains for experimental studies. Includes baseline phenotype data sets as well as studies of drug, diet, disease and aging effects; protocols, projects and publications; and SNP, variation and gene expression studies. Provides tools for online analysis. Data sets are voluntarily contributed by researchers from a variety of institutions and settings, or retrieved by MPD staff from open public sources. MPD has three major types of strain-centric data sets: phenotype strain surveys, SNP and variation data, and gene expression strain surveys. MPD collects data on classical inbred strains as well as any fixed-genotype strains and derivatives that are openly acquirable by the research community. New panels include Collaborative Cross (CC) lines and Diversity Outbred (DO) populations. Phenotype data include measurements of behavior, hematology, bone mineral density, cholesterol levels, endocrine function, aging processes, addiction, neurosensory functions, and other biomedically relevant areas. Genotype data are primarily in the form of single-nucleotide polymorphisms (SNPs).
MPD curates data into a common framework by standardizing mouse strain nomenclature, standardizing units (SI where feasible), evaluating data (completeness, statistical power, quality), categorizing phenotype data and linking to ontologies, conforming to internal style guides for titles, tags, and descriptions, and creating comprehensive protocol documentation including environmental parameters of the test animals. These elements are critical for experimental reproducibility.
Proper citation: Mouse Phenome Database (MPD) (RRID:SCR_003212)
Database to catalog experimentally determined interactions between proteins, combining information from a variety of sources to create a single, consistent set of protein-protein interactions that can be downloaded in a variety of formats. The data were curated both manually and automatically, using computational approaches that utilize knowledge about the protein-protein interaction networks extracted from the most reliable, core subset of the DIP data. Because the reliability of experimental evidence varies widely, methods of quality assessment have been developed and utilized to identify the most reliable subset of the interactions. This CORE set can be used as a reference when evaluating the reliability of high-throughput protein-protein interaction data sets, for development of prediction methods, and in studies of the properties of protein interaction networks. Tools are available to analyze, visualize and integrate users' own experimental data with the information about protein-protein interactions available in the DIP database. The DIP database lists protein pairs that are known to interact with each other. Here, "interact" means that two amino acid chains were experimentally identified to bind to each other. The database lists such pairs to aid those studying a particular protein-protein interaction, those investigating entire regulatory and signaling pathways, and those studying the organization and complexity of the protein interaction network at the cellular level. Registration is required to gain access to most of the DIP features. Registration is free to members of the academic community. Trial accounts for commercial users are also available.
Proper citation: Database of Interacting Proteins (DIP) (RRID:SCR_003167)
Software R-package for running gene set analysis using various statistical methods, from different gene level statistics and a wide range of gene-set collections. The Piano package contains functions for combining the results of multiple runs of gene set analyses.
Proper citation: Piano (RRID:SCR_003200)
http://cmb.molgen.mpg.de/2ndGenerationSequencing/Solas/
Software package for the statistical language R, devoted to the analysis of next generation short read data of RNA-seq transcripts. It provides predictions of alternative exons in a single condition/cell sample, predictions of differential alternative exons between two conditions/cell samples, and quantification of alternative splice forms in a single condition/cell sample.
Proper citation: Solas (RRID:SCR_003168)
http://www.broadinstitute.org/cancer/software/genepattern
A powerful genomic analysis platform that provides access to hundreds of tools for gene expression analysis, proteomics, SNP analysis, flow cytometry, RNA-seq analysis, and common data processing tasks. A web-based interface provides easy access to these tools and allows the creation of multi-step analysis pipelines that enable reproducible in silico research.
Proper citation: GenePattern (RRID:SCR_003201)
http://bibiserv.techfak.uni-bielefeld.de/dialign/
Tool for multiple sequence alignment using various sources of external information that is particularly useful to detect local homologies in sequences with low overall similarity. While standard alignment methods rely on comparing single residues and imposing gap penalties, DIALIGN constructs pairwise and multiple alignments by comparing entire segments of the sequences. No gap penalty is used. This approach can be used for both global and local alignment, but it is particularly successful in situations where sequences share only local homologies. Several versions of DIALIGN are available online at GOBICS, http://dialign.gobics.de/
Proper citation: DIALIGN (RRID:SCR_003041)
http://pir.georgetown.edu/pirwww/dbinfo/pirsf.shtml
A SuperFamily classification system, with rules for functional site and protein name, to facilitate the sensible propagation and standardization of protein annotation and the systematic detection of annotation errors. The PIRSF concept is being used as a guiding principle to provide comprehensive and non-overlapping clustering of UniProtKB sequences into a hierarchical order to reflect their evolutionary relationships. The PIRSF classification system is based on whole proteins rather than on the component domains; therefore, it allows annotation of generic biochemical and specific biological functions, as well as classification of proteins without well-defined domains. There are different PIRSF classification levels. The primary level is the homeomorphic family, whose members are both homologous (evolved from a common ancestor) and homeomorphic (sharing full-length sequence similarity and a common domain architecture). At a lower level are the subfamilies which are clusters representing functional specialization and/or domain architecture variation within the family. Above the homeomorphic level there may be parent superfamilies that connect distantly related families and orphan proteins based on common domains. Because proteins can belong to more than one domain superfamily, the PIRSF structure is formally a network. The FTP site provides free download for PIRSF.
Proper citation: PIRSF (RRID:SCR_003352)
Centralized, standards-compliant, public data repository for proteomics data, including protein and peptide identifications, post-translational modifications and supporting spectral evidence. Originally it was developed to provide a common data exchange format and repository to support proteomics literature publications. This remit has grown with PRIDE, with the hope that PRIDE will provide a reference set of tissue-based identifications for use by the community. The future development of PRIDE has become closely linked to HUPO PSI. PRIDE encourages and welcomes direct user submissions of protein and peptide identification data to be published in peer-reviewed publications. Users may browse public datasets, use PRIDE BioMart for custom queries, or download the data directly from the FTP site. PRIDE has been developed through a collaboration of the EMBL-EBI, Ghent University in Belgium, and the University of Manchester.
Proper citation: Proteomics Identifications (PRIDE) (RRID:SCR_003411)
http://www.ichip.de/software/SplicingCompass.html
Software for detection of differential splicing between two different conditions using RNA-Seq data.
Proper citation: SplicingCompass (RRID:SCR_003249)
https://bitbucket.org/dranew/defuse
Software package for gene fusion discovery using RNA-Seq data. It uses clusters of discordant paired end alignments to inform a split read alignment analysis for finding fusion boundaries.
Proper citation: deFuse (RRID:SCR_003279)
http://www.ebi.ac.uk/thornton-srv/databases/WSsas/
SAS is a tool for applying structural information to a given protein sequence. It uses FASTA to scan a given protein sequence against all the proteins of known 3D structure in the Protein Data Bank and provides functional residue annotation based on data from the Catalytic Site Atlas and PDBsum. The web service is intended to facilitate use of the SAS tool when there is a large number of queries. Currently, the web service provides annotation for binding sites (to ligand, metal or nucleic acid), catalytic residues and amino acids related to protein-protein interactions.
Proper citation: WSsas - Web Service for the SAS tool (RRID:SCR_007051)
http://autismkb.cbi.pku.edu.cn/
Genetic factors contribute significantly to ASD. AutismKB is an evidence-based knowledgebase of autism spectrum disorder (ASD) genetics. The current version contains 2193 genes (99 syndromic autism related genes and 2135 non-syndromic autism related genes), 4617 copy number variations (CNVs) and 158 linkage regions associated with ASD by one or more of the following six experimental methods: (1) genome-wide association studies (GWAS); (2) genome-wide CNV studies; (3) linkage analysis; (4) low-scale genetic association studies; (5) expression profiling; (6) other low-scale gene studies. Based on a scoring and ranking system, 99 syndromic autism related genes and 383 non-syndromic autism related genes (434 genes in total) were designated as having high confidence. Autism spectrum disorder (ASD) is a heterogeneous neurodevelopmental disorder with a prevalence of 1.0-2.6%. The three core symptoms of ASD are: (1) impairments in reciprocal social interaction; (2) communication impairments; (3) presence of restricted, repetitive and stereotyped patterns of behavior, interests and activities.
Proper citation: AutismKB (RRID:SCR_006937)
http://weizhong-lab.ucsd.edu/cd-hit/
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on February 28, 2023. Software program for clustering biological sequences, with applications such as making non-redundant databases, finding duplicates, identifying protein families, filtering sequence errors and improving sequence assembly. It is very fast and can handle extremely large databases. CD-HIT helps to significantly reduce the computational and manual effort in many sequence analysis tasks and aids in understanding the data structure and correcting the bias within a dataset. The CD-HIT package has CD-HIT, CD-HIT-2D, CD-HIT-EST, CD-HIT-EST-2D, CD-HIT-454, CD-HIT-PARA, PSI-CD-HIT, CD-HIT-OTU and over a dozen scripts. Within the package: CD-HIT (CD-HIT-EST) clusters similar proteins (DNAs) into clusters that meet a user-defined similarity threshold; CD-HIT-2D (CD-HIT-EST-2D) compares 2 datasets and identifies the sequences in db2 that are similar to db1 above a threshold; CD-HIT-454 identifies natural and artificial duplicates from pyrosequencing reads; CD-HIT-OTU clusters rRNA tags into OTUs. The usage of other programs and scripts can be found in the CD-HIT user's guide. CD-HIT was originally developed by Dr. Weizhong Li at Dr. Adam Godzik's Lab at the Burnham Institute (now Sanford-Burnham Medical Research Institute). THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.
Proper citation: CD-HIT (RRID:SCR_007105)
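To illustrate what clustering to a user-defined similarity threshold means in the CD-HIT entry above, here is a minimal Python sketch of greedy incremental clustering: sequences are processed from longest to shortest, and each one either joins the first cluster whose representative it matches at or above the threshold or starts a new cluster. This is an illustration only, not CD-HIT's implementation. The function names `similarity` and `greedy_cluster` are hypothetical, and the score from Python's `difflib` is a stand-in assumption for CD-HIT's own sequence identity measure.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in similarity score in [0, 1]; NOT CD-HIT's identity measure."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(seqs: list[str], threshold: float) -> list[list[str]]:
    """Greedy incremental clustering (hypothetical helper).

    The first member of each returned cluster is its representative.
    """
    clusters: list[list[str]] = []
    # Process longest sequences first so they become representatives.
    for seq in sorted(seqs, key=len, reverse=True):
        for cluster in clusters:
            # Compare only against the representative of each cluster.
            if similarity(cluster[0], seq) >= threshold:
                cluster.append(seq)
                break
        else:
            # No representative was similar enough: start a new cluster.
            clusters.append([seq])
    return clusters


if __name__ == "__main__":
    example = ["MKTAYIAKQ", "GGGGGGGG", "MKTAYIAKQR"]
    for members in greedy_cluster(example, threshold=0.9):
        print(members[0], "represents", len(members), "sequence(s)")
```

Because each sequence is compared only against cluster representatives, the number of comparisons stays well below all-against-all, at the cost of the result depending on processing order.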
This database provides a platform to query and compare gene expression data during the development of the major model animals (zebrafish, Drosophila, medaka, mouse). The name 4DXpress stands for expression database in 4D. The 4D (four dimensions) of 4DXpress can be interpreted either as 3 spatial dimensions plus time, or as (1) species, (2) gene, (3) developmental stage, (4) anatomical structure. The major focus of this database lies in cross-species comparison. The high-resolution expression data was acquired through whole mount in situ hybridisation, antibody or transgenic experiments. Data was integrated from several species-specific expression pattern databases, such as ZFIN, BDGP, GXD and MEPD, as well as directly submitted by researchers of the participating groups at EMBL. The 4DXpress database is a project within the Centre for Computational Biology at EMBL. It is developed by Yannick Haudry, Thorsten Henrich and Ivica Letunic and coordinated by Thorsten Henrich. Hugo Berube is developing the 4D ArrayExpress Data Warehouse at EBI for integrating in situ data with microarray data.
Proper citation: Expression Database in 4D (RRID:SCR_007066)
Database containing the DNA sequence and annotation of the entire human chromosome 7, encompassing nearly 158 million nucleotides of DNA and 1917 gene structures. It presents the most up-to-date collation of sequence, gene, and other annotations from all databases (e.g. Celera published, NCBI, Ensembl, RIKEN, UCSC) as well as unpublished data. To generate a higher order description, additional structural features such as imprinted genes, fragile sites, and segmental duplications were integrated at the level of the DNA sequence with medical genetic data, including 440 chromosome rearrangement breakpoints associated with disease. The objective of this project is to generate a comprehensive description of human chromosome 7 to facilitate biological discovery, disease gene research and medical genetic applications. There are over 360 disease-associated genes or loci on chromosome 7. A major challenge ahead will be to represent chromosome alterations, variants, and polymorphisms and their related phenotypes (or lack thereof) in an accessible way. In addition to being a primary data source, this site serves as a weighing station for testing community ideas and information to produce highly curated data to be submitted to other databases such as NCBI, Ensembl, and UCSC. Therefore, any useful data submitted will be curated and shown in this database. All Chromosome 7 genomic clones (cosmids, BACs, YACs) listed in GBrowser and in other data tables are freely distributed.
Proper citation: Chromosome 7 Annotation Project (RRID:SCR_007134)
https://github.com/jstjohn/SimSeq
An Illumina paired-end and mate-pair short read simulator. This project attempts to model as many of the quirks that exist in Illumina data as possible. Some of these quirks include the potential for chimeric reads and non-biotinylated fragment pull down in mate-pair libraries.
Proper citation: SimSeq (RRID:SCR_006947)
Central public database of experimentally validated human and mouse noncoding fragments with gene enhancer activity as assessed in transgenic mice. Most of these noncoding elements were selected for testing based on their extreme conservation in other vertebrates or epigenomic evidence (ChIP-Seq) of putative enhancer marks. Users can retrieve elements near single genes of interest, search for enhancers that target reporter gene expression to a particular tissue, or download entire collections of enhancers with defined tissue specificity or conservation depth.
Proper citation: VISTA Enhancer Browser (RRID:SCR_007973)
https://www.ncbi.nlm.nih.gov/genbank/dbest/
Division of GenBank that contains sequence data and other information on single-pass cDNA sequences, or Expressed Sequence Tags, from a number of organisms.
Proper citation: dbEST (RRID:SCR_008132)
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on August 26, 2019. In October 2016, T1DBase merged with its sister site ImmunoBase (https://immunobase.org). As documented in March 2020, ImmunoBase ownership has been transferred to Open Targets (https://www.opentargets.org). Results for all studies can be explored using Open Targets Genetics (https://genetics.opentargets.org). Database focused on genetics and genomics of type 1 diabetes susceptibility, providing a curated and integrated set of datasets and tools, across multiple species, to support and promote research in this area. The current data scope includes annotated genomic sequences for suspected T1D susceptibility regions; genetic data; microarray data; and global datasets, generally from the literature, that are useful for genetics and systems biology studies. The site also includes software tools for analyzing the data.
Proper citation: T1DBase (RRID:SCR_007959)