SciCrunch Registry is a curated repository of scientific resources, with a focus on biomedical resources, including tools, databases, and core facilities - visit SciCrunch to register your resource.
http://yetfasco.ccbr.utoronto.ca/
Collection of all available transcription factor (TF) specificities for the yeast Saccharomyces cerevisiae in Position Frequency Matrix (PFM) or Position Weight Matrix (PWM) formats. The specificities are evaluated for quality using several metrics. With this website, you can scan sequences with the motifs to find where potential binding sites lie, inspect precomputed genome-wide binding sites, find which TFs have similar motifs to one you have found, and download the collection of motifs. Submissions are welcome.
Proper citation: YeTFaSCo (RRID:SCR_006893)
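The motif scanning described for YeTFaSCo can be illustrated with a minimal sketch. It is not the site's own implementation: the uniform background of 0.25, the pseudocount, the toy matrix, and the score threshold are all assumptions chosen for illustration, and the function names `pfm_to_pwm` and `scan` are hypothetical. The sketch converts a Position Frequency Matrix into log2-odds weights and slides it along the forward strand of an uppercase ACGT sequence.

```python
import math

BASES = "ACGT"

def pfm_to_pwm(pfm, background=0.25, pseudocount=0.01):
    """Convert a PFM (list of {base: frequency} columns) into log2-odds weights.

    The uniform background and the pseudocount are illustrative assumptions.
    """
    pwm = []
    for column in pfm:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def scan(sequence, pwm, threshold):
    """Return (start, score) for each forward-strand window scoring >= threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits

# Toy three-column motif that strongly prefers "TGA" (made-up data).
toy_pfm = [
    {"A": 0.0, "C": 0.0, "G": 0.0, "T": 1.0},
    {"A": 0.0, "C": 0.0, "G": 1.0, "T": 0.0},
    {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0},
]
toy_pwm = pfm_to_pwm(toy_pfm)
hits = scan("CCTGACCTGA", toy_pwm, threshold=4.0)
hit_positions = [start for start, _ in hits]
```

A fuller scan would also score the reverse complement and handle ambiguous bases such as N, which this sketch does not attempt.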
Encyclopedia of DNA elements consisting of list of functional elements in human genome, including elements that act at protein and RNA levels, and regulatory elements that control cells and circumstances in which gene is active. Enables scientific and medical communities to interpret role of human genome in biology and disease. Provides identification of common cell types to facilitate integrative analysis and new experimental technologies based on high-throughput sequencing. Genome Browser containing ENCODE and Epigenomics Roadmap data. Data are available for entire human genome.
Proper citation: ENCODE (RRID:SCR_006793)
http://bioinformatics.albany.edu/~dmaps
THIS RESOURCE IS NO LONGER IN SERVICE, documented September 6, 2016. DMAPS database contains pre-computed multiple structure alignments for protein chains in the Protein Data Bank (PDB). Automated structure alignments have been generated for classified protein families using the CE-MC algorithm. Alignments have been built only for those families with at least three members. Currently, multiple structure alignments are available for 3050 SCOP-, 3087 CATH-, 664 ENZYME- and 1707 CE-based families. Users will be able to retrieve multiple alignments for a given PDB chain classified by one of these criteria.
Proper citation: DMAPS - A Database of Multiple Alignments for Protein Structures (RRID:SCR_007140)
Re-annotated gene expression / proteomics data from GEO, produced by relating all probe IDs to Entrez Gene IDs once every three months, enabling you to find data from GEO and compare them across different platforms and species. Platform Annotations adds the latest annotations to any uploaded probe / gene ID list file. Platform Comparison compares any two platforms to find corresponding probes mapping to the same gene. Cross-species mapping maps platform annotations to other species. Gene Search finds deposited platforms and samples in GEO that contain a list of genes. GPL ID Search finds the GPL ID (GEO platform ID) for your array. You can also download the latest annotation files for all arrays and their comprehensive universal gene identifier table, which relates all types of gene / protein / clone identifiers to Entrez Gene IDs for all species. Note: The database was last updated on 4/30/2011. They have successfully mapped 54932732 individual probes from 385099 GEO samples measuring 3519 GEO platforms across 217 species.
Proper citation: Array Information Library Universal Navigator (RRID:SCR_006967)
http://goblet.molgen.mpg.de/cgi-bin/goblet2008/goblet.cgi
Tool that performs annotation based on GO and pathway terms for anonymous cDNA or protein sequences. It uses the species-independent GO structure and vocabulary, together with a series of protein databases collected from various sites, to perform a detailed GO annotation by sequence similarity searches. The sensitivity and the reference protein sets can be selected by the user. GOblet runs automatically and is available as a public service on our web server. GOblet expects query sequences to be in FASTA format (with header lines). Protein and nucleotide sequences are accepted. The total size of all sequences submitted per request should currently not exceed 50kb. For security reasons, larger posts will be rejected. Due to limited capacity, queries may be processed in batches depending on the server load. The output of the BLAST job is filtered automatically and the relevant hits are displayed. In addition, the respective GO terms are shown together with the complete GO hierarchy of parent terms. THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.
Proper citation: GOblet (RRID:SCR_006998)
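The input constraints described for GOblet (FASTA records with header lines, about 50kb per request) can be checked before submission with a small sketch like the one below. It is only an illustration: reading "50kb" as 50,000 bytes of request text is an assumption, and the helper names `parse_fasta` and `check_request` are hypothetical rather than part of GOblet.

```python
MAX_REQUEST_BYTES = 50 * 1000  # assumption: "50kb" interpreted as 50,000 bytes

def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is None:
            raise ValueError("sequence data found before any '>' header line")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

def check_request(text):
    """Return the parsed records, or raise ValueError if the request looks invalid."""
    records = parse_fasta(text)
    if not records:
        raise ValueError("no FASTA records found")
    size = len(text.encode("utf-8"))
    if size > MAX_REQUEST_BYTES:
        raise ValueError(f"request is {size} bytes; limit is {MAX_REQUEST_BYTES}")
    return records

example = ">query1 test protein\nMKT\nAAL\n>query2\nACGT\n"
records = check_request(example)
```

Splitting a large sequence set into several requests that each pass this check would be one way to stay under the stated limit.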
http://sites.huji.ac.il/malaria/
Data set of metabolic pathways for the malaria parasite based on the present knowledge of parasite biochemistry and on pathways known to occur in other unicellular eukaryotes. This site extracted the pertinent information from the universal sites and presented it in an educative and informative format. The site also includes cell-cell interactions (cytoadherence and rosetting), invasion of the erythrocyte by the parasite, and transport functions. It also contains an artistic impression of the ultrastructural morphology of the intraerythrocytic cycle stages and some details about the morphology of mitochondria and the apicoplast. Most pathways are relevant to the erythrocytic phase of the parasite cycle. All maps were checked for the presence of enzyme-coding genes as they are officially annotated in the Plasmodium genome (http://plasmodb.org/). The site is constructed in a hierarchical pattern that permits logical deepening:
* Grouped pathways of major chemical components or biological process
** Specific pathways or specific process
*** Chemical structures of substrates and products or process
**** Names of enzymes and their genes or components of process
Each map is linked to other maps, making it possible to verify the origin of a substrate or the fate of a product. Clicking on the EC number that appears next to each enzyme connects the site to BRENDA, SWISSPROT ExPASy ENZYME, PlasmoDB and the IUBMB reaction scheme. Clicking on the name of a metabolite connects the site to KEGG, which provides its chemical structure and formula. Next to each enzyme there is a pie that depicts the stage-dependent transcription of the enzyme's coding gene. The pie is constructed as a clock of the 48 hours of the parasite cycle, where red signifies over-transcription and green under-transcription. Clicking on the pie links to the DeRisi/UCSF transcriptome database.
Proper citation: Malaria Parasite Metabolic Pathways (RRID:SCR_007072)
This project encompasses development of novel biological network analysis methods and infrastructure for querying biological data in a semantically-enabled format, and aims to create a semantic interactome model. Research within the BioMANTA project will focus on computational modelling and analysis, primarily using Semantic Web technologies and Machine Learning methods, of large-scale protein-protein interaction and compound activity networks across a wide variety of species. A range of information such as kinetic activity, tissue expression, and subcellular localization and disease state attributes will be included in the resulting data model. Protein interactions are a fundamental component of biological processes. Many proteins are functional only in multimeric complexes, or require interaction partners to achieve their correct localisation or function. For this reason, the study of protein-protein interaction (PPI) networks has become an area of growing interest in computational biology. Through the use of Semantic Web technologies such as Resource Description Framework (RDF) and Web Ontology Language (OWL), interaction data is modelled to create a knowledge representation in which meaning is vested in the ontology rather than instances of data. Stochastic and computational intelligence methods are applied to this data to infer high coverage networks. Semantic inferencing is used to infer previously unknown and meaningful pathways. Major project components: - The BioMANTA Ontology:- An OWL DL ontology incorporating the PSI-MI Ontology, the NCBI Taxonomy, and elements of BioPax ontology and Gene Ontology (describing subcellular localisation). This allows us to re-use existing ontologies, thereby reducing overheads associated with knowledge acquisition in the ontology development process. We are able to integrate existing public data that contain annotation in these formats. 
- Data conversion & semantic protein integration:- A set of software components that convert protein-protein databases (DIP, MPact, IntAct, etc.) from PSI-MI XML to RDF compliant with the BioMANTA ontology. These components make the protein-protein interaction datasets (and, more generally, any PSI-MI XML data) semantically available for querying and inference within BioMANTA. - An RDF triple store based on RDF Molecules and the MapReduce architecture:- A proof-of-concept RDF triple store using RDF molecules and Hadoop scale-out architectures. Regular RDF graphs are deconstructed into RDF molecules, which are spread across distributed compute nodes in the MapReduce architecture and are subsequently combined to form equivalent RDF graphs. Such an approach makes distributed SPARQL querying and reasoning on RDF triple stores possible. - A quantitative framework to integrate networks extracted from independent data sources (gene expression, subcellular localization, and ortholog mapping):- The model is multi-layer, with a first layer based on Decision Trees, where one Decision Tree is built on each dataset independently. The tree nodes are cut using Shannon's entropy (mutual information); the decisions of these independent trees are integrated using logistic regression, and the parameters are optimised using maximum likelihood. Sponsors: This resource is supported by Pfizer Global Research and Development, the Institute for Molecular Bioscience (IMB), and the University of Queensland, Australia.
Proper citation: BioMANTA (RRID:SCR_007177)
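The BioMANTA description mentions cutting decision-tree nodes with Shannon's entropy (mutual information). As a rough sketch of that criterion only, and not of the project's actual code, the snippet below computes the entropy of a set of class labels and the information gain of a binary split, which equals the mutual information between the split outcome and the label. The function names and the toy labels are illustrative assumptions.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, left, right):
    """Entropy reduction achieved by splitting `labels` into `left` and `right`."""
    n = len(labels)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted

# Toy example: 1 = interacting pair, 0 = non-interacting pair (made-up labels).
parent = [1, 1, 0, 0]
perfect_gain = information_gain(parent, [1, 1], [0, 0])
useless_gain = information_gain(parent, [1, 0], [1, 0])
```

A tree builder would evaluate this gain for each candidate split and keep the one with the largest value; the logistic-regression layer that combines the per-dataset trees is not shown here.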
Center that acquires, maintains, and distributes genetic stocks and information about stocks of the small free-living nematode Caenorhabditis elegans for use by investigators initiating or continuing research on this genetic model organism. A searchable strain database, general information about C. elegans, and links to key Web sites of use to scientists, including WormBase, WormAtlas, and WormBook are available.
Proper citation: Caenorhabditis Genetics Center (RRID:SCR_007341)
This service offers a gateway to well-benchmarked protein structure and function prediction methods. Structural models collected from the prediction servers are assessed using the 3D-jury consensus approach. The Structure Prediction Meta Server provides access to various fold recognition, function prediction and local structure prediction methods. The Server takes the amino acid sequence of the query protein, the reference name for the prediction job, and the E-mail address as input. The E-mail address is used only for notification about errors during the execution of the job. The query sequence and the reference name are placed in the process queue. The Meta Server accepts only sequences that have not been submitted before. In the case of duplicate sequences, the second user will be notified with a link to the previous submission. Sequences longer than 800 amino acids are not accepted by some services. The internal SQL database makes it possible to find any previous jobs processed by the Meta Server using regular expressions addressing fields such as E-mail, Job Name and the host name from which the job was initiated. Each server has its own process queuing system managed by the Meta Server. All results of fold recognition servers are translated into uniform formats. The information extracted from the raw output of the servers includes the PDB codes of the hits, the alignments and the similarity (reliability) scores specific to each server. Mapping of the hits to the SCOP and FSSP classifications is made either using known PDB representatives or by alignment of the template sequence with the databases of proteins in both classifications. The secondary structure assignments for all hits are taken from the mapped FSSP (red for helices and blue for strands). Underscored amino acids indicate the first residue after an insertion in the template sequence. The Meta Server provides translation of the alignments into standard formats like FASTA, PDB or CASP.
The Meta Server is coupled to consensus servers. They provide jury predictions based on the results collected from other services. Not all fold recognition servers are used by the jury system. The data stored on the meta server is available through http://meta.bioinfo.pl/data/JOBID/. Jobs older than 2 months are not shown. The Meta Server is only a set of programs aimed to process and manage biological data, while the predictive power of the service comes from (mostly) remote prediction providers. Sponsors: This resource is supported by The BioInfoBank Institute.
Proper citation: BioInfoBank Meta Server (RRID:SCR_007181)
Portal for Macromolecular X-Ray Crystallography to produce and support an integrated suite of programs that allows researchers to determine macromolecular structures by X-ray crystallography, and other biophysical techniques. Used in the education and training of scientists in experimental structural biology for determination and analysis of protein structure.
Proper citation: CCP4 (RRID:SCR_007255)
Software package created to perform molecular dynamics, mainly designed for simulations of proteins, lipids, and nucleic acids. Can also be used for research on non-biological systems, such as polymers.
Proper citation: GROMACS (RRID:SCR_014565)
http://sing.ei.uvigo.es/ALTER/
Web application to perform program-oriented conversion of DNA and protein alignments and transform between multiple sequence alignment formats. ALTER focuses on the specifications of mainstream alignment and analysis programs rather than on the conversion among more or less specific formats.
Proper citation: ALTER (RRID:SCR_015968)
http://bioinfo.eie.polyu.edu.hk/mGoaSvmServer/mGOASVM.html
Data analysis service for the prediction of multi-label protein subcellular localization based on gene ontology and support vector machines. Web services are also available.
Proper citation: mGOASVM (RRID:SCR_013098)
http://genetics.bwh.harvard.edu/pph2/
Software tool which predicts possible impact of amino acid substitution on structure and function of human protein using straightforward physical and comparative considerations. PolyPhen-2 is new development of PolyPhen tool for annotating coding nonsynonymous SNPs.
Proper citation: PolyPhen: Polymorphism Phenotyping (RRID:SCR_013189)
http://sift.bii.a-star.edu.sg/
Data analysis service to predict whether an amino acid substitution affects protein function based on sequence homology and the physical properties of amino acids. SIFT can be applied to naturally occurring nonsynonymous polymorphisms and laboratory-induced missense mutations. (entry from Genetic Analysis Software) Web service is also available.
Proper citation: SIFT (RRID:SCR_012813)
http://xin.cz3.nus.edu.sg/group/admeap/admeap.asp
A database for facilitating the search for drug Absorption, Distribution, Metabolism, Excretion (ADME) associated proteins. It contains information about known drug ADME associated proteins, functions, similarities, substrates / ligands, tissue distributions, and other properties of the targets. Associated references are also included. Drug absorption, distribution, metabolism and excretion (ADME) often involve interaction of a drug with specific proteins. Knowledge about these ADME-associated proteins is important in facilitating the study of the molecular mechanism of disposition and individual response as well as therapeutic action of drugs. It is also useful in the development and testing of pharmacokinetics prediction tools. Several databases describing specific classes of ADME-associated proteins have appeared. A new database, ADME-associated proteins (ADME-AP), is introduced to provide comprehensive information about all classes of ADME-associated proteins described in the literature, including physiological function of each protein, pharmacokinetic effect, ADME classification, direction and driving force of disposition, location and tissue distribution, substrates, synonyms, gene name and protein availability in other species. Cross-links to other databases are also provided to facilitate access to information about the sequence, 3D structure, function, polymorphisms, genetic disorders, nomenclature, ligand binding properties and related literature of each protein. ADME-AP currently contains entries for 321 proteins and 964 substrates.
ADME Class: Based on their respective roles in pharmacokinetics, ADME-associated proteins can be classified into four categories:
A: This category includes proteins involved in the absorption or re-absorption of drugs into the systemic system.
D: This category includes proteins responsible for facilitating the distribution of drugs from the systemic system to the target sites or away from the target sites back to the systemic system. Certain plasma proteins and intracellular binding proteins may alter free drug concentration by acting as drug storage depots. These proteins thus play a regulatory role in drug distribution and are therefore included in Category D. Based on their role in drug distribution, proteins in this category can be further divided into three groups: D1, D2, and D3. The first group, D1, includes transporters capable of transporting chemicals across membranes of various tissue barriers from the systemic system into the target sites. The blood-brain barrier and the placental barrier are examples of tissue barriers. Proteins in the second group, D2, are responsible for transporting drugs back into the systemic system. Proteins in the third group, D3, mainly function as drug storage depots. These include ligand binding proteins in plasma and intracellular proteins.
M: Proteins in category M are drug-metabolizing enzymes. These enzymes can be further divided into two separate groups, M1 and M2, according to whether the corresponding enzymatic reaction is phase I or phase II.
E: Category E includes proteins that enable the excretion or presystemic elimination of drugs.
Some proteins belong to more than one category: e.g. P-glycoprotein both limits intestinal absorption and excludes drugs from the brain back to the blood. It thus belongs to both Category E and D. For those proteins capable of transporting natural substrates without a literature report of interaction with a drug, the postfix "potential" is attached to their respective classification to indicate that their specific role in ADME is yet to be confirmed. Use of ADME-AP for commercial purposes is not allowed.
Proper citation: Drug ADME Associated Protein Database (RRID:SCR_013501)
A protein family specific platform that works closely with the GPCR community to determine the high resolution structure and function of GPCRs. Structures are available in the glutamate, secretin, frizzled/TAS2, adhesion, and rhodopsin branches of the protein phylogenetic tree. Users can access a list of protein structure targets and completed protein structures.
Proper citation: GPCR Network (RRID:SCR_014286)
http://www.matrixscience.com/server.html
A software package and server used to identify and characterize proteins from primary sequence databases using mass spectrometry data. Mascot integrates peptide mass fingerprinting, sequence querying, and MS/MS ion searching in order to search for proteins in databases like SwissProt, NCBInr, EMBL EST divisions, contaminants, and cRAP. If a license is purchased, users may: search data sets that exceed the 1200 spectrum limit of the free version; set up automated, high throughput work; add and edit proteins and quantification methods; and search a preferred collection of sequence databases. The software package works with instruments from AB Sciex, Agilent, Bruker, Jeol, Shimadzu, Thermo Scientific, and Waters.
Proper citation: Mascot (RRID:SCR_014322)
http://edwardslab.bmcb.georgetown.edu/ws/peptideMapper/
The PeptideMapper Web-Service provides alignments of peptide sequences to proteins, mRNA, EST, and HTC sequences from Genbank, RefSeq, UniProt, IPI, VEGA, EMBL, and HInvDb. This mapping infrastructure is supported, in part, by the compressed peptide sequence database infrastructure (Edwards, 2007), which enables a fast, suffix-tree based mapping of peptide sequences to gene identifiers and a gene-focused detailed mapping of peptide sequences to source sequence evidence. The PeptideMapper Web-Service can be used interactively or as a web-service using either HTTP or SOAP requests. Results of HTTP requests can be returned in a variety of formats, including XML, JSON, CSV, TSV, or XLS, and in some cases, GFF or BED; results of SOAP requests are returned as SOAP responses. The PeptideMapper Web-Service maps at most 20 peptides with length between 5 and 30 amino acids in each request. The number of alignments returned, per peptide, gene, and sequence type, is set to 10 by default. The default can be changed on the interactive alignments search form or by using the max web-service parameter.
Proper citation: PeptideMapper (RRID:SCR_005763)
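Given the request limits stated for PeptideMapper (at most 20 peptides per request, each 5 to 30 amino acids long), a client would need to filter and batch its peptide list before sending anything. The sketch below shows one way to do that; it makes no calls to the service, and the function name `batch_peptides` and its parameter names are hypothetical.

```python
def batch_peptides(peptides, max_per_request=20, min_len=5, max_len=30):
    """Split peptides into request-sized batches.

    Returns (batches, rejected), where `rejected` holds peptides whose length
    falls outside the accepted range and would not be mapped.
    """
    valid, rejected = [], []
    for peptide in peptides:
        if min_len <= len(peptide) <= max_len:
            valid.append(peptide)
        else:
            rejected.append(peptide)
    batches = [
        valid[i:i + max_per_request]
        for i in range(0, len(valid), max_per_request)
    ]
    return batches, rejected

# Toy input: 45 acceptable peptides plus one that is too short (made-up sequences).
toy_peptides = ["PEPTIDEK"] * 45 + ["AAA"]
batches, rejected = batch_peptides(toy_peptides)
batch_sizes = [len(batch) for batch in batches]
```

Each batch could then be submitted as a separate HTTP or SOAP request, with the results merged afterwards.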
Collect, share, and distribute information about protein three-dimensional structures. It serves as a portal for the scientific community to learn about protein structures solved by SG centers, and also to contribute their expertise in annotating protein function. The premise of the TOPSAN project is that, no matter how much any individual knows about a particular protein, there are other members of the scientific community who know more about certain aspects of the same protein, and that the collective analyses from experts will be far more informative than any local group, let alone individual, could contribute. They believe that, if the members of the biological community are given the opportunity, authorship incentives, and an easy way to contribute their knowledge to the structure annotation, they would do so. Therefore, borrowing elements from successful, distributed, collaborative projects, such as Wikipedia (the free encyclopedia anyone can edit) and from other open source software development projects, TOPSAN will be a broad, collaborative effort to annotate protein structures, initially, those determined at the JCSG. They believe that the annotation of proteins solved by structural genomics consortia offers a unique opportunity to challenge the extant paradigm of how biological data is collected and distributed, and to connect structural genomics and structural biology to the entire biological research community. TOPSAN is designed to be scalable, modular and extensible. Furthermore, it is intended to be immediately useful in a simplistic way and will accommodate incremental improvements to functionality as usage becomes more sophisticated. Their annotation pages will offer the end user a combination of automatically generated as well as expert-curated annotations of protein structures. 
They will use available technology to increase the speed and granularity of the exchange of scientific ideas, and use incentive mechanisms that will encourage collaborative participation.
Proper citation: TOPSAN (RRID:SCR_005758)