SciCrunch Registry is a curated repository of scientific resources, with a focus on biomedical resources, including tools, databases, and core facilities - visit SciCrunch to register your resource.
http://genolist.pasteur.fr/Colibri/
Database dedicated to the analysis of the genome of Escherichia coli. Its purpose is to collate and integrate various aspects of the genomic information from E. coli, the paradigm of Gram-negative bacteria. Colibri provides a complete dataset of DNA and protein sequences derived from the paradigm strain E. coli K-12, linked to the relevant annotations and functional assignments. It allows users to browse these data and retrieve information using various criteria (gene names, location, keywords, etc.). The data in Colibri originate from two major sources: the reference genomic DNA sequence from the E. coli Genome Project and the feature annotations from the EcoGene data collection. THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.
Proper citation: Colibri (RRID:SCR_007606)
http://mendel.gene.cwru.edu/adamslab/cgi-bin/paml/pbrowser.py
THIS RESOURCE IS NO LONGER IN SERVICE, documented on July 16, 2013. It provides access to the results of tests for positive selection in 14,000 human genes. Multiple alignments of protein-coding regions of genes from human and other mammals were extracted from whole-genome alignments available from UC-Santa Cruz. Each gene was analyzed using the maximum likelihood tests of selection using PAML. Branch, site, and branch+site tests were performed, each with at least one matching null model.
Proper citation: Human PAML Browser (RRID:SCR_007715)
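The comparison of each selection model against its matching null model, as described in the entry above, is typically done with a likelihood ratio test. The following is a minimal sketch of that test, not the browser's own code: the function name `lrt_pvalue` and the example log-likelihoods are hypothetical, and only 1 or 2 degrees of freedom are handled because those cases have closed-form chi-square tail probabilities.

```python
import math


def lrt_pvalue(lnl_null, lnl_alt, df):
    """Likelihood ratio test between two nested models.

    lnl_null and lnl_alt are the maximized log-likelihoods of the null
    and alternative models; df is the difference in free parameters.
    Returns (test statistic, chi-square p-value).
    """
    # The statistic 2 * (lnL_alt - lnL_null) cannot be negative for nested
    # models, so small negative values from optimizer noise are clamped to 0.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    if df == 1:
        # Chi-square survival function with 1 degree of freedom.
        p = math.erfc(math.sqrt(stat / 2.0))
    elif df == 2:
        # Chi-square survival function with 2 degrees of freedom.
        p = math.exp(-stat / 2.0)
    else:
        raise ValueError("this sketch only supports df = 1 or df = 2")
    return stat, p


# Hypothetical log-likelihoods, for illustration only.
stat, p = lrt_pvalue(lnl_null=-1000.0, lnl_alt=-998.0, df=2)
print(f"2*delta(lnL) = {stat:.2f}, p = {p:.4f}")
```

Which degrees of freedom apply depends on the specific pair of models being compared, so that value should be taken from the PAML documentation for the test in question.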
dbPTM is a database that compiles information on protein post-translational modifications (PTMs), such as the modified sites, solvent accessibility of surrounding amino acids, protein secondary and tertiary structures, protein domains, and protein variations. Version 2.0 of dbPTM integrates experimentally validated PTM sites, with supporting literature references, from Swiss-Prot, Phospho.ELM, O-GLYCBASE, and UbiProt. From the collected PTM information, about 25 types of PTM with enough experimentally validated sites are used to train profile hidden Markov models (HMMs) that detect potential PTM sites with 100% specificity against Swiss-Prot proteins. To help users investigate each type of PTM in more detail, the substrate peptide specificity, such as positional amino acid frequency, solvent accessibility, and secondary structure surrounding the modified sites, is also provided. Moreover, information on orthologous protein clusters is provided so that users can analyze whether PTM sites are located in evolutionarily conserved regions. THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.
Proper citation: dbPTM: An informational repository of proteins and post-translational modifications (RRID:SCR_007619)
http://firedb.bioinfo.cnio.es/
A database of Protein Data Bank structures, ligands and annotated functional site residues. The database can be accessed by PDB codes or UniProt accession numbers as well as keywords. FireDB contains information on every chemical compound in the PDB, including their descriptions, the PDB structures in which the compounds are found and the amino acids that are in contact with the ligand.
Proper citation: FireDB (RRID:SCR_007655)
FCP is a publicly accessible web tool dedicated to analyzing the current state and trends of available proteome structures along the classification schemes of enzymes and nuclear receptors. It offers both graphical and quantitative data on the degree of functional coverage in that portion of the proteome by existing structures and on the bias observed in the distribution of those structures among proteins. Users can choose to search the website based on structures or ligands, and can also sort by enzyme or receptor. Users can also view data based on structural and population (species) filters.
Proper citation: Functional Coverage of the Proteome (RRID:SCR_007654)
http://caps.ncbs.res.in/imotdb/
Comprehensive collection of spatially interacting motifs in proteins. The Interacting Motif Database lists interacting motifs identified for all structural entries in the PDB. Conserved patterns, or fingerprints, are identified for individual structural entries and are also grouped together to report common motifs shared among all superfamily members.
Proper citation: Database of Spatially Interacting Motifs in Proteins (RRID:SCR_007735)
A database of human disease-related mutated proteins identified by mass spectrometry (MS). To build it, we collected the human mutated sequences currently known to be related to diseases. After surveying sources of mutated sequences (PMD, OMIM, SwissProt polymorphism, HGMD, etc.), we found that HGMD currently contains the most human gene mutation information. However, because HGMD does not offer academic users a download of the whole dataset, we decided to systematically extract and curate mutation information from the PMD, OMIM, SwissProt, and MSIPI databases to form SysPIMP and provide it free to academic users.
Proper citation: Systematic Platform for Identifying Mutated Proteins (SysPIMP) (RRID:SCR_007954)
SYSTERS is a database of protein sequences grouped into homologous families and superfamilies. The SYSTERS project aims to provide a meaningful partitioning of the whole protein sequence space by a fully automatic procedure. A refined two-step algorithm assigns each protein to a family and a superfamily. The sequence data underlying SYSTERS release 4 now comprise several protein sequence databases derived from completely sequenced genomes (ENSEMBL, TAIR, SGD and GeneDB), in addition to the comprehensive Swiss-Prot/TrEMBL databases. To augment the automatically derived results, information from external databases such as Pfam and Gene Ontology is added to the web server. Furthermore, users can retrieve pre-processed analyses of families, such as multiple alignments and phylogenetic trees. New query options include a batch retrieval tool for functional inference about families based on automatic keyword extraction from sequence annotations. A new access point, PhyloMatrix, allows the retrieval of phylogenetic profiles of SYSTERS families across organisms with completely sequenced genomes. Gene, Human, Vertebrate, Genome, Human ORFs
Proper citation: SYSTERS (RRID:SCR_007955)
http://supfam.org/SUPERFAMILY/
SUPERFAMILY is a database of structural and functional protein annotations for all completely sequenced organisms. The SUPERFAMILY annotation is based on a collection of hidden Markov models, which represent structural protein domains at the SCOP superfamily level. A superfamily groups together domains which have an evolutionary relationship. The annotation is produced by scanning protein sequences from over 1,700 completely sequenced genomes against the hidden Markov models.
Proper citation: SUPERFAMILY (RRID:SCR_007952)
Database to explore known and predicted interactions of chemicals and proteins. It integrates information about interactions from metabolic pathways, crystal structures, binding experiments and drug-target relationships. Inferred information from phenotypic effects, text mining and chemical structure similarity is used to predict relations between chemicals. STITCH further allows exploring the network of chemical relations, also in the context of associated binding proteins. Each proposed interaction can be traced back to the original data sources. The database contains interaction information for over 68,000 different chemicals, including 2200 drugs, and connects them to 1.5 million genes across 373 genomes and their interactions contained in the STRING database.
Proper citation: Search Tool for Interactions of Chemicals (RRID:SCR_007947)
Collection of transmembrane protein datasets containing experimentally derived topology information from the literature and from public databases. The TOPDB web interface includes tools for searching, relational querying and data browsing, as well as visualisation tools for topology data.
Proper citation: Topology Data Bank of Transmembrane Proteins (RRID:SCR_007964)
It provides a database based on a pre-computed similarity matrix covering the similarity space formed by >4 million amino acid sequences from public databases and completely sequenced genomes. The database is capable of handling very large datasets and is updated incrementally. For sequence similarity searches and pairwise alignments, we implemented a grid-enabled software system based on FASTA heuristics and the Smith-Waterman algorithm. SimpleSIMAP and AdvancedSIMAP retrieve homologs for given protein sequences, which must already be contained in the SIMAP database. While SimpleSIMAP provides only selected parameters and preconfigured search spaces, AdvancedSIMAP allows the user to specify search space, filtering and sorting parameters in a flexible manner. Both types of queries return lists of homologs that are linked in turn to their own homologs, so the web interfaces allow users to explore the protein world by homology quickly and interactively. Sponsors: SIMAP is supported by the Department of Genome Oriented Bioinformatics of the Technische Universität München and the Institute for Bioinformatics of the GSF-National Research Center for Environment and Health.
Proper citation: SIMAP (RRID:SCR_007927)
http://www.ncbi.nlm.nih.gov/guide/sitemap/
The National Center for Biotechnology Information's listing of resources. Sort by alphabetical character, Databases, Downloads, Submissions, Tools and How-To; or by Topic: Chemicals & Bioassays; Data & Software; DNA & RNA; Domains & Structures; Genes & Expression; Genetics & Medicine; Genomes & Maps; Homology; Literature; Proteins; Sequence Analysis; Taxonomy; Training & Tutorials; Variation.
Proper citation: NCBI Resource List (RRID:SCR_005628)
Database of hundreds of thousands of products submitted by reagent provider partners, and millions of webpages selected from reagent suppliers. All are organized according to genes, species, and reagent types (antibodies, recombinant proteins, ELISA, siRNA, cDNA clones, biochemicals, and others).
Proper citation: Labome (RRID:SCR_007384)
http://noble.gs.washington.edu/proj/sdp-svm/
A statistical framework for genomic data fusion is a computational framework for integrating and drawing inferences from a collection of genome-wide measurements. Each dataset is represented via a kernel function, which defines generalized similarity relationships between pairs of entities, such as genes or proteins. The kernel representation is both flexible and efficient, and can be applied to many different types of data. Furthermore, kernel functions derived from different types of data can be combined in a straightforward fashion. Recent advances in the theory of kernel methods have provided efficient algorithms to perform such combinations in a way that minimizes a statistical loss function. These methods exploit semidefinite programming techniques to reduce the problem of finding optimizing kernel combinations to a convex optimization problem. Computational experiments performed using yeast genome-wide datasets, including amino acid sequences, hydropathy profiles, gene expression data and known protein-protein interactions, demonstrate the utility of this approach. A statistical learning algorithm trained from all of these data to recognize particular classes of proteins--membrane proteins and ribosomal proteins--performs significantly better than the same algorithm trained on any single type of data. Matlab code to center a kernel matrix and Matlab code for normalization are available.
Proper citation: A statistical framework for genomic data fusion (RRID:SCR_007219)
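The kernel centering, normalization, and combination steps mentioned in the entry above can be sketched in a few lines. This is an illustrative pure-Python version, not the project's Matlab code: the function names are hypothetical, the matrices are toy values, and the combination uses fixed weights chosen by hand, whereas the framework described above learns the weights through semidefinite programming.

```python
import math


def center_kernel(K):
    """Center a kernel matrix, i.e. center the data in feature space."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]
    col_mean = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - col_mean[j] + total_mean
             for j in range(n)] for i in range(n)]


def normalize_kernel(K):
    """Scale a kernel so every diagonal entry is 1 (cosine normalization)."""
    n = len(K)
    diag = [math.sqrt(K[i][i]) for i in range(n)]
    return [[K[i][j] / (diag[i] * diag[j]) for j in range(n)]
            for i in range(n)]


def combine_kernels(kernels, weights):
    """Weighted sum of kernel matrices of identical shape."""
    n = len(kernels[0])
    return [[sum(w * K[i][j] for K, w in zip(kernels, weights))
             for j in range(n)] for i in range(n)]


# Toy 2x2 kernels standing in for two data types (hypothetical values).
K_seq = [[2.0, 1.0], [1.0, 2.0]]
K_expr = [[1.0, 0.0], [0.0, 1.0]]
K_combined = combine_kernels(
    [normalize_kernel(K_seq), normalize_kernel(K_expr)],
    weights=[0.5, 0.5],  # fixed by hand here, not learned
)
print(K_combined)
```

Normalizing each kernel before combining keeps one data type from dominating the sum simply because its similarity values are on a larger scale.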
http://wiki.c2b2.columbia.edu/califanolab/index.php/BCellInteractome.htm
A network of protein-protein, protein-DNA and modulatory interactions in human B cells. The network contains known interactions (reported in public databases) and interactions predicted by a Bayesian evidence integration framework, which integrates a variety of generic and context-specific experimental clues about protein-protein and protein-DNA interactions with inferences from different reverse engineering algorithms, such as GeneWays and ARACNE. Modulatory interactions are predicted by MINDY, an algorithm for the prediction of modulators of transcriptional interactions (please refer to the publication section for more information). The BCI can be downloaded as one tab-delimited file containing the complete network (BCI.txt), with each type of interaction explicitly defined.
Proper citation: B Cell Interactome (RRID:SCR_008655)
http://www.ch.embnet.org/software/COILS_form.html
COILS is a program that compares a sequence to a database of known parallel two-stranded coiled-coils and derives a similarity score. By comparing this score to the distribution of scores in globular and coiled-coil proteins, the program then calculates the probability that the sequence will adopt a coiled-coil conformation.
Proper citation: COILS: Prediction of Coiled Coil Regions in Proteins (RRID:SCR_008440)
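The idea of turning a similarity score into a probability by comparing it with score distributions for two protein classes, as described in the entry above, can be illustrated as follows. This is only a sketch under stated assumptions: it models both distributions as Gaussians, and every mean, standard deviation, and the class ratio shown are hypothetical placeholders rather than the parameters COILS actually uses.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def coiled_coil_probability(score, cc_mean, cc_sd, glob_mean, glob_sd,
                            glob_to_cc_ratio=1.0):
    """Probability that a score comes from the coiled-coil distribution.

    glob_to_cc_ratio is the assumed prior ratio of globular to
    coiled-coil residues; all parameters are placeholders.
    """
    g_cc = gaussian_pdf(score, cc_mean, cc_sd)
    g_glob = gaussian_pdf(score, glob_mean, glob_sd)
    return g_cc / (glob_to_cc_ratio * g_glob + g_cc)


# Hypothetical distributions: coiled-coil scores centred on 2.0,
# globular scores centred on 1.0, both with standard deviation 0.25.
for s in (1.0, 1.5, 2.0):
    p = coiled_coil_probability(s, 2.0, 0.25, 1.0, 0.25)
    print(f"score {s:.1f} -> probability {p:.3f}")
```

Raising `glob_to_cc_ratio` encodes a prior belief that globular residues are more common, which lowers the probability assigned to any given score.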
http://lincsportal.ccs.miami.edu/dcic-portal/
Portal which provides a unified interface for searching LINCS dataset packages and reagents. Users can use the portal to access datasets, small molecules, cells, genes, proteins and peptides, and antibodies.
Proper citation: LINCS Data Portal (RRID:SCR_014939)
A database of genomic and protein data for Drosophila site-specific transcription factors.
Proper citation: FlyTF.org (RRID:SCR_004123)
http://life.ccs.miami.edu/life/
LIFE search engine contains data generated from LINCS Pilot Phase, to integrate LINCS content leveraging semantic knowledge model and common LINCS metadata standards. LIFE makes LINCS content discoverable and includes aggregate results linked to Harvard Medical School and Broad Institute and other LINCS centers, who provide more information including experimental conditions and raw data. Please visit LINCS Data Portal.
Proper citation: LINCS Information Framework (RRID:SCR_003937)