SciCrunch Registry is a curated repository of scientific resources, with a focus on biomedical resources, including tools, databases, and core facilities - visit SciCrunch to register your resource.
Database enables integration of genomic and phenomic data by providing access to primary experimental data, data collection protocols and analysis tools. Data represent behavioral, morphological and physiological disease-related characteristics in naive mice and those exposed to drugs, environmental agents or other treatments. Collaborative standardized collection of measured data on laboratory mouse strains to characterize them in order to facilitate translational discoveries and to assist in selection of strains for experimental studies. Includes baseline phenotype data sets; studies of drug, diet, disease and aging effects; protocols, projects and publications; and SNP, variation and gene expression studies. Provides tools for online analysis. Data sets are voluntarily contributed by researchers from a variety of institutions and settings, or retrieved by MPD staff from open public sources. MPD has three major types of strain-centric data sets: phenotype strain surveys, SNP and variation data, and gene expression strain surveys. MPD collects data on classical inbred strains as well as any fixed-genotype strains and derivatives that are openly acquirable by the research community. New panels include Collaborative Cross (CC) lines and Diversity Outbred (DO) populations. Phenotype data include measurements of behavior, hematology, bone mineral density, cholesterol levels, endocrine function, aging processes, addiction, neurosensory functions, and other biomedically relevant areas. Genotype data are primarily in the form of single-nucleotide polymorphisms (SNPs).
MPD curates data into a common framework by standardizing mouse strain nomenclature, standardizing units (SI where feasible), evaluating data (completeness, statistical power, quality), categorizing phenotype data and linking to ontologies, conforming to internal style guides for titles, tags, and descriptions, and creating comprehensive protocol documentation including environmental parameters of the test animals. These elements are critical for experimental reproducibility.
Proper citation: Mouse Phenome Database (MPD) (RRID:SCR_003212)
Database to catalog experimentally determined interactions between proteins, combining information from a variety of sources to create a single, consistent set of protein-protein interactions that can be downloaded in a variety of formats. The data were curated both manually and automatically, using computational approaches that draw on knowledge about protein-protein interaction networks extracted from the most reliable, core subset of the DIP data. Because the reliability of experimental evidence varies widely, methods of quality assessment have been developed and used to identify the most reliable subset of the interactions. This CORE set can be used as a reference when evaluating the reliability of high-throughput protein-protein interaction data sets, for development of prediction methods, and in studies of the properties of protein interaction networks. Tools are available to analyze, visualize and integrate a user's own experimental data with the information about protein-protein interactions available in the DIP database. The DIP database lists protein pairs that are known to interact with each other. Here, "interact" means that two amino acid chains were experimentally identified to bind to each other. The database lists such pairs to aid those studying a particular protein-protein interaction, those investigating entire regulatory and signaling pathways, and those studying the organization and complexity of the protein interaction network at the cellular level. Registration is required to gain access to most of the DIP features. Registration is free to members of the academic community. Trial accounts for commercial users are also available.
Proper citation: Database of Interacting Proteins (DIP) (RRID:SCR_003167)
R software package for running gene set analysis using various statistical methods, starting from different gene-level statistics and a wide range of gene-set collections. The Piano package also contains functions for combining the results of multiple runs of gene set analyses.
Proper citation: Piano (RRID:SCR_003200)
http://cmb.molgen.mpg.de/2ndGenerationSequencing/Solas/
Software package for the statistical language R, devoted to the analysis of next generation short read data of RNA-seq transcripts. It provides predictions of alternative exons in a single condition/cell sample, predictions of differential alternative exons between two conditions/cell samples, and quantification of alternative splice forms in a single condition/cell sample.
Proper citation: Solas (RRID:SCR_003168)
http://www.broadinstitute.org/cancer/software/genepattern
A powerful genomic analysis platform that provides access to hundreds of tools for gene expression analysis, proteomics, SNP analysis, flow cytometry, RNA-seq analysis, and common data processing tasks. A web-based interface provides easy access to these tools and allows the creation of multi-step analysis pipelines that enable reproducible in silico research.
Proper citation: GenePattern (RRID:SCR_003201)
http://bibiserv.techfak.uni-bielefeld.de/dialign/
Tool for multiple sequence alignment using various sources of external information that is particularly useful to detect local homologies in sequences with low overall similarity. While standard alignment methods rely on comparing single residues and imposing gap penalties, DIALIGN constructs pairwise and multiple alignments by comparing entire segments of the sequences. No gap penalty is used. This approach can be used for both global and local alignment, but it is particularly successful in situations where sequences share only local homologies. Several versions of DIALIGN are available online at GOBICS, http://dialign.gobics.de/
Proper citation: DIALIGN (RRID:SCR_003041)
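The DIALIGN entry describes building alignments from entire gap-free segments rather than from single residues with gap penalties. As a minimal sketch of that idea only, the code below finds the single best-scoring gap-free segment pair between two sequences. The function name `best_segment_pair` and the simple match/mismatch scoring are assumptions made for illustration; DIALIGN's actual segment weighting and its assembly of many segments into a full alignment are not reproduced here.

```python
def best_segment_pair(a: str, b: str, match: int = 1, mismatch: int = -1):
    """Return (score, start_a, start_b, length) of the best gap-free segment pair.

    Hypothetical illustration: scores are plain match/mismatch sums,
    not DIALIGN's probabilistic segment weights.
    """
    best = (0, 0, 0, 0)
    # Each diagonal pairs position i in `a` with position i - offset in `b`,
    # so walking along a diagonal never introduces a gap.
    for offset in range(-(len(b) - 1), len(a)):
        i = max(offset, 0)
        j = i - offset
        running = 0
        start_i, start_j = i, j
        while i < len(a) and j < len(b):
            running += match if a[i] == b[j] else mismatch
            if running <= 0:
                # A non-positive prefix cannot help; restart after it.
                running = 0
                start_i, start_j = i + 1, j + 1
            elif running > best[0]:
                best = (running, start_i, start_j, i - start_i + 1)
            i += 1
            j += 1
    return best


score, start_a, start_b, length = best_segment_pair(
    "TTTTACGTACGTCCCC", "GGACGTACGTAA"
)
print(score, start_a, start_b, length)
```

Because no gap penalty is involved, two sequences that share only one local region still yield a clean segment, which is the situation the entry singles out.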
http://www.ebi.ac.uk/Tools/pfa/iprscan/
Software package for functional analysis of sequences by classifying them into families and predicting presence of domains and sites. Scans sequences against InterPro's signatures. Characterizes nucleotide or protein function by matching it with models from several different databases. Used in large-scale analysis of whole proteomes, genomes and metagenomes. Available as a web-based version, a standalone Perl version, and a SOAP web service.
Proper citation: InterProScan (RRID:SCR_005829)
ToppGene Suite is a one-stop portal for gene list enrichment analysis and candidate gene prioritization based on functional annotations and protein interaction networks. It supports (i) gene list functional enrichment, (ii) candidate gene prioritization using either functional annotations or network analysis and (iii) identification and prioritization of novel disease candidate genes in the interactome. Functional annotation-based disease candidate gene prioritization uses a fuzzy-based similarity measure to compute the similarity between any two genes based on semantic annotations. The similarity scores from individual features are combined into an overall score using statistical meta-analysis.
Proper citation: ToppGene Suite (RRID:SCR_005726)
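The ToppGene entry says only that per-feature scores are combined by statistical meta-analysis; it does not name the method. Purely as an assumption, the sketch below uses Fisher's method to combine independent per-feature p-values into one overall p-value. The function name `fisher_combined_p` and the example values are hypothetical and are not part of ToppGene.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    Assumption for illustration only: the source does not state which
    meta-analysis method ToppGene uses.
    """
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must be non-empty and lie in (0, 1]")
    k = len(p_values)
    # Fisher's statistic X = -2 * sum(ln p) follows a chi-square
    # distribution with 2k degrees of freedom under the null hypothesis.
    half = -sum(math.log(p) for p in p_values)  # this is X / 2
    # For even degrees of freedom the chi-square survival function has a
    # closed form: exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Hypothetical per-feature p-values for one candidate gene.
print(fisher_combined_p([0.05, 0.05]))
```

With a single input the function returns that p-value unchanged, which is a quick sanity check on the closed form.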
http://llama.mshri.on.ca/gofish/GoFishWelcome.html
Software program, available as a Java applet online or to download, that allows the user to select a subset of Gene Ontology (GO) attributes and ranks genes according to the probability of having all those attributes. THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.
Proper citation: GoFish (RRID:SCR_005682)
http://maq.sourceforge.net/maqview.shtml
A graphical read alignment viewer specifically designed for the Maq alignment file that lets you see the mismatches, base qualities and mapping qualities. It is highly efficient in speed, memory and disk usage. Maqview is based on OpenGL and is known to work on both Mac OS X and Linux. Porting to Windows is in principle easy.
Proper citation: Maqview (RRID:SCR_005632)
http://crdd.osdd.net/raghava/ccpdb/
ccPDB (Compilation and Creation of datasets from PDB) is designed to serve the scientific community working on function or structure annotation of proteins. This database of datasets is based on the Protein Data Bank (PDB), from which all datasets were derived. ccPDB has four modules: i) compilation of datasets, ii) creation of datasets, iii) web services and iv) important links.
* Compilation of datasets: Datasets at ccPDB fall into two categories: i) datasets collected from the literature and ii) datasets compiled from PDB. We are in the process of collecting PDB datasets from the literature and maintaining them at ccPDB, and we ask the community to suggest datasets. In addition, we generate datasets from PDB using commonly used standard protocols, such as non-redundant chains and structures solved at high resolution.
* Creation of datasets: This module lets a user create a customized dataset from PDB according to his or her own conditions. It is useful for users who wish to create a new dataset to fit their own requirements. The module has six steps, which are described on the help page.
* Web services: The following web services are integrated in ccPDB: i) Analyze PDB ID lets users submit their PDB ID to around 40 servers from a single point, ii) BLAST search lets users run a BLAST search of their protein against PDB, iii) Structural information annotates a protein structure from its PDB ID, iv) Search in PDB helps users search for structures in PDB, v) Generate patterns produces the different types of patterns required for machine learning techniques and vi) Download useful information lets users download various types of information for a given set of proteins (PDB IDs).
* Important links: One of the major objectives of this web site is to provide links to web servers related to functional annotation of proteins. In the first phase we have collected and compiled these links in different categories. In future, an attempt will be made to collect as many links as possible.
Proper citation: ccPDB - Compilation and Creation of datasets from PDB (RRID:SCR_005870)
UTRdb/UTRsite is a portal to other databases, including Nucleotide Sequence Databases, Protein Sequence Databases, other Sequence databanks, Untranslated Nucleotide Sequence Databases, Mitochondrial Databases, Mutation Databases, and others. The site also allows users to start long-term permanent projects or just to do quick searches, depending on the user's needs.
Proper citation: UTRdb/UTRsite (RRID:SCR_005868)
http://stormo.wustl.edu/ScerTF
Catalog of over 1,200 position weight matrices (PWMs) for 196 different yeast transcription factors (TFs). They've curated 11 literature sources, benchmarked the published position-specific scoring matrices against in-vivo TF occupancy data and TF deletion experiments, and combined the most accurate models to produce a single collection of the best performing weight matrices for Saccharomyces cerevisiae. ScerTF is useful for a wide range of problems, such as linking regulatory sites with transcription factors, identifying a transcription factor based on a user-input matrix, finding the genes bound/regulated by a particular TF, and finding regulatory interactions between transcription factors. Enter a TF name to find the recommended matrix for a particular TF, or enter a nucleotide sequence to identify all TFs that could bind a particular region.
Proper citation: ScerTF (RRID:SCR_006121)
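The ScerTF entry mentions entering a nucleotide sequence to identify TFs that could bind a region, which in general terms means scoring each window of the sequence against a position weight matrix. The sketch below shows that generic scoring step under stated assumptions: the count matrix is a made-up three-column motif, the pseudocount and uniform background of 0.25 are illustrative choices, and the function names are hypothetical. It does not reproduce ScerTF's matrices, cutoffs, or interface.

```python
import math

BASES = "ACGT"


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds scores.

    `counts` is a list of dicts, one per motif position, mapping
    A/C/G/T to observed counts. Pseudocount and background are
    illustrative assumptions.
    """
    matrix = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES) + 4 * pseudocount
        matrix.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix


def scan(sequence, matrix):
    """Score every window of `sequence`; return a list of (start, score).

    Only upper-case A/C/G/T are handled; other characters raise KeyError.
    """
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        hits.append((start, score))
    return hits


# Made-up counts for a toy motif that prefers "TGA".
toy_counts = [{"T": 10}, {"G": 10}, {"A": 10}]
pwm = log_odds_matrix(toy_counts)
best_start, best_score = max(scan("CCTGACC", pwm), key=lambda hit: hit[1])
print(best_start, round(best_score, 3))
```

Repeating the scan with one matrix per TF and keeping the windows above a chosen threshold would give a list of candidate binders for the region.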
http://www-bionet.sscc.ru/sitex/
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on August 19, 2019. Years of analysis of how protein structure projects onto the exon-intron structure of the corresponding gene have led to several fundamental conclusions about the structural and functional organization of proteins. Based on these results, we decided to map protein functional sites. We therefore created the SitEx database, which stores this mapping and includes BLAST search and 3D similar-structure search (using PDB3DScan) for the polypeptide encoded by a single exon that participates in organizing the functional site. This will help:
# to study the positions of functional sites in the exon structure;
# to carry out complex analysis of protein function;
# to reveal exons that took part in exon shuffling and came from bacterial genomes;
# to study the peculiarities of how polypeptide structures are encoded.
Currently, SitEx contains information about 9994 functional sites in 2021 proteins described in the proteomes of 17 organisms.
Proper citation: SitEx (RRID:SCR_006122)
http://bio-bigdata.hrbmu.edu.cn/diseasemeth/
Human disease methylation database. DiseaseMeth version 2.0 focuses on aberrant methylomes of human diseases. Used for understanding DNA methylation-driven human diseases.
Proper citation: DiseaseMeth (RRID:SCR_005942)
The DistiLD database aims to increase the usage of existing genome-wide association studies (GWAS) results by making it easy to query and visualize disease-associated SNPs and genes in their chromosomal context. The database performs three important tasks:
# published GWAS are collected from several sources and linked to standardized, international disease codes (ICD10 codes)
# data from the International HapMap Project are analyzed to define linkage disequilibrium (LD) blocks onto which SNPs and genes are mapped
# the web interface makes it easy to query and visualize disease-associated SNPs and genes within LD blocks.
Users can query the database by diseases, SNPs or genes. Whichever of the three query modes is used, an intermediate page lists all the studies that matched the search, each with a link to the corresponding publication. The user can select either all studies related to a certain disease or one specific study for which to view the related LD blocks. The DistiLD resource integrates information on:
* Associations between Single Nucleotide Polymorphisms (SNPs) and diseases from genome-wide association studies (GWAS)
* Links between SNPs and genes based on linkage disequilibrium (LD) data from HapMap
For convenience, we provide the complete datasets as two (zipped) tab-delimited files. The first file contains GWAS results mapped to LD blocks. The second file contains all SNPs and genes assigned to each LD block.
Proper citation: DistiLD - Diseases and Traits in LD (RRID:SCR_005943)
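The DistiLD entry describes mapping SNPs and genes onto LD blocks. As a minimal sketch of what such a mapping can look like, the code below assigns a chromosomal position to the block that contains it using a sorted interval lookup. Everything specific here is an assumption: the tuple layout `(chrom, start, end, block_id)`, the block identifiers, the coordinates, and the function names are hypothetical, and the column layout of the real DistiLD download files is not described in the entry.

```python
from bisect import bisect_right


def build_index(blocks):
    """Group non-overlapping blocks by chromosome, sorted by start.

    `blocks` is an iterable of (chrom, start, end, block_id) tuples with
    inclusive coordinates (a hypothetical layout).
    """
    by_chrom = {}
    for chrom, start, end, block_id in blocks:
        by_chrom.setdefault(chrom, []).append((start, end, block_id))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        # Keep the start positions separately so bisect can search them.
        index[chrom] = ([start for start, _, _ in intervals], intervals)
    return index


def find_block(index, chrom, position):
    """Return the id of the block containing `position`, or None."""
    starts, intervals = index.get(chrom, ([], []))
    i = bisect_right(starts, position) - 1
    if i >= 0:
        start, end, block_id = intervals[i]
        if start <= position <= end:
            return block_id
    return None


# Made-up blocks and SNP positions for illustration.
toy_blocks = [("chr1", 100, 200, "LD1"), ("chr1", 300, 400, "LD2")]
toy_index = build_index(toy_blocks)
for snp_position in (150, 250, 300):
    print(snp_position, find_block(toy_index, "chr1", snp_position))
```

Once every SNP carries a block id, SNPs and genes that share a block can be grouped, which is the kind of per-block view the entry describes.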
https://compbio.dfci.harvard.edu/predictivenetworks//
A flexible, open-source, web-based application and data services framework that enables the integration, navigation, visualization and analysis of gene interaction networks. The primary goal of PN is to allow biomedical researchers to evaluate experimentally derived gene lists in the context of large-scale gene interaction networks. The PN analytical pipeline involves two key steps. The first is the collection of a comprehensive set of known gene interactions derived from a variety of publicly available sources. The second is to use these 'known' interactions together with gene expression data to infer robust gene networks. The regression-based network inference algorithm creates a graph of gene interactions in which cycles may be present (but no self-loops). Based on information-theoretic techniques, a causal gene interaction network is inferred from both prior knowledge (interactions extracted from biomedical literature and structured biological databases) and gene expression data. A prediction model is fitted for each gene, given its parents, enabling assessment of the predictive ability of the network model.
Proper citation: Predictive Networks (RRID:SCR_006110)
http://newt-omics.mpi-bn.mpg.de/index.php
Newt-omics is a database that enables researchers to locate, retrieve and store data sets dedicated to the molecular characterization of newts. Newt-omics is a transcript-centered database, based on an Expressed Sequence Tag (EST) data set from the newt, covering ~50,000 Sanger-sequenced transcripts and a set of high-density microarray data generated from regenerating hearts. Newt-omics also contains a large set of peptides identified by mass spectrometry, which was used to validate 13,810 ESTs as true protein coding. Newt-omics can incorporate additional high-throughput data sets without changing the database structure. Via a user-friendly interface, Newt-omics allows access to a huge set of molecular data without the need for prior bioinformatics expertise. The newt Notophthalmus viridescens is the master of regeneration. This organism has been known for more than 200 years for its exceptional regenerative capabilities. Newts can completely replace lost appendages like limb and tail, lens and retina, and parts of the central nervous system. Moreover, after cardiac injury newts can rebuild the functional myocardium with no scar formation. To date only very limited information is available from public databases. Newt-omics aims to provide a comprehensive platform of genes expressed during tissue regeneration, including extensive annotations, expression data and experimentally verified peptide sequences that as yet have no homology to other publicly available gene sequences. The goal is to obtain a detailed understanding of the molecular processes underlying tissue regeneration in the newt, which may lead to the development of approaches that efficiently stimulate regenerative pathways in mammals.
* Number of contigs: 26594
* Number of ESTs in contigs: 48537
* Number of transcripts with verified peptide: 5291
* Number of peptides: 15169
Proper citation: Newtomics (RRID:SCR_006073)
http://www.nematodes.org/nembase4/
NEMBASE is a comprehensive Nematode Transcriptome Database including 63 nematode species, over 600,000 ESTs and over 250,000 proteins. Nematode parasites are of major importance in human health and agriculture, and free-living species deliver essential ecosystem services. The genomics revolution has resulted in the production of many datasets of expressed sequence tags (ESTs) from a phylogenetically wide range of nematode species, but these are not easily compared. NEMBASE4 presents a single portal into extensively functionally annotated, EST-derived transcriptomes from over 60 species of nematodes, including plant and animal parasites and free-living taxa. Using the PartiGene suite of tools, we have assembled the publicly available ESTs for each species into a high-quality set of putative transcripts. These transcripts have been translated to produce a protein sequence resource and each is annotated with functional information derived from comparison with well-studied nematode species such as Caenorhabditis elegans and other non-nematode resources. By cross-comparing the sequences within NEMBASE4, we have also generated a protein family assignment for each translation. The data are presented in an openly accessible, interactive database. An example of the utility of NEMBASE4 is that it can examine the uniqueness of the transcriptomes of major clades of parasitic nematodes, identifying lineage-restricted genes that may underpin particular parasitic phenotypes, possible viral pathogens of nematodes, and nematode-unique protein families that may be developed as drug targets.
Proper citation: NEMBASE (RRID:SCR_006070)
http://hfv.lanl.gov/content/index
The Hemorrhagic Fever Viruses (HFV) sequence database collects and stores sequence data and provides a user-friendly search interface and a large number of sequence analysis tools, following the model of the highly regarded and widely used Los Alamos HIV database. The database uses an algorithm that aligns each sequence to a species-wide reference sequence. The NCBI RefSeq database is used for this; if a reference sequence is not available, a Blast search finds the best candidate. Using this method, sequences in each genus can be retrieved pre-aligned. Hemorrhagic fever viruses (HFVs) are a diverse set of over 80 viral species, found in 10 different genera comprising five different families: arena-, bunya-, flavi-, filo- and togaviridae. All these viruses are highly variable and evolve rapidly, making them elusive targets for the immune system and for vaccine and drug design. About 55,000 HFV sequences exist in the public domain today. A central website that provides annotated sequences and analysis tools will be helpful to HFV researchers worldwide.
Proper citation: HFV Database (RRID:SCR_006017)