SciCrunch Registry is a curated repository of scientific resources, with a focus on biomedical resources, including tools, databases, and core facilities - visit SciCrunch to register your resource.
Nonprofit bioscience research organization in Seattle, Washington, dedicated to accelerating research globally and sharing its data with the scientific community. The Allen Institute for Brain Science, Allen Institute for Cell Science, Allen Institute for Immunology, and The Paul G. Allen Frontiers Group are four divisions of the Institute, each committed to an open science model.
Proper citation: Allen Institute (RRID:SCR_005435)
THIS RESOURCE IS NO LONGER IN SERVICE, documented August 29, 2016. An algorithm that finds the articles most relevant to a genetic sequence. In the genomic era, researchers often want to learn more about a biological sequence by retrieving its related articles; however, no convenient tool had been available for this purpose. MedBlast is a literature-mining tool that uses natural language processing techniques to retrieve the articles related to a given sequence, and an online server for the program is also provided. Genome sequencing projects generate such a large amount of data every day that molecular biologists often encounter sequences they know nothing about, and the literature is usually the principal source of such information. It is relatively easy to mine the articles cited by the sequence annotation; it is much harder to retrieve relevant articles that have no direct citation relationship. The related articles are those describing the given sequence (gene/protein), its redundant sequences, or its close homologs in various species. They can be divided into two classes: direct references, which are either cited by the sequence annotation or cite the sequence in their text, and indirect references, which contain gene symbols of the given sequence. A few additional issues make the task even more complicated: (1) symbols may have aliases; and (2) one sequence may have several relatives that should also be taken into account, including redundant sequences (e.g. protein and gene sequences) and close homologs. MedBlast addresses these issues and retrieves the related articles of a given sequence automatically. MedBlast uses BLAST to extend homology relationships, precompiled species-specific thesauruses (a useful semantics technique from natural language processing, NLP) to extend alias relationships, and the EUtilities toolset to search and retrieve the corresponding articles for each sequence from PubMed. MedBlast takes a sequence in FASTA format as input. The program first uses BLAST to search the GenBank nucleic acid and protein non-redundant (nr) databases to extend to the homologous and corresponding nucleic acid and protein sequences. Users can input BLAST results directly, but it is recommended to input the results for both the protein and nucleic acid nr databases. Only hits with low e-values are kept as relatives, because low-similarity hits often do not contain specific information. Very long sequences (e.g. 100 kb), which are usually genomic sequences, are also discarded, as they do not contain specific direct references. Users can adjust these parameters to meet their own needs.
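To make the workflow above concrete, the sketch below chains the same public building blocks MedBlast relies on (BLAST, e-value filtering, and NCBI E-Utilities) using Biopython. It is a rough illustration under stated assumptions, not MedBlast's own code; the e-mail address, e-value cutoff and length cutoff are placeholders, and a protein FASTA input is assumed.

```python
# A rough sketch of a MedBlast-style workflow (not the original MedBlast code):
# BLAST a protein FASTA sequence against NCBI nr, keep low e-value hits, then use
# NCBI E-Utilities (elink) to pull PubMed articles linked to each hit.
# Requires Biopython and network access; the e-mail address and cutoffs are placeholders.

from Bio import Entrez, SeqIO
from Bio.Blast import NCBIWWW, NCBIXML

Entrez.email = "you@example.org"  # hypothetical; NCBI asks for a contact address

def related_pubmed_ids(fasta_path, evalue_cutoff=1e-20, max_length=100_000):
    record = SeqIO.read(fasta_path, "fasta")
    if len(record.seq) > max_length:      # very long (genomic) sequences are skipped
        return []

    # search the protein nr database (MedBlast also searches the nucleotide nr database)
    result = NCBIWWW.qblast("blastp", "nr", str(record.seq))
    blast = NCBIXML.read(result)

    pmids = set()
    for alignment in blast.alignments:
        if alignment.hsps[0].expect > evalue_cutoff:
            continue                      # keep only close relatives (low e-values)
        # elink maps a protein record (accession) to PubMed articles cited in its annotation
        handle = Entrez.elink(dbfrom="protein", db="pubmed", id=alignment.accession)
        for linkset in Entrez.read(handle):
            for linksetdb in linkset.get("LinkSetDb", []):
                pmids.update(link["Id"] for link in linksetdb["Link"])
    return sorted(pmids)
```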
Proper citation: MedBlast (RRID:SCR_008202)
https://wiki.med.harvard.edu/SysBio/Megason/GoFigure
GoFigure is a software platform for quantitating complex 4d in vivo microscopy data in high throughput at the level of the cell. A prime goal of GoFigure is the automatic segmentation of nuclei and cell membranes and the temporal tracking of them across cell migration and division to create cell lineages. GoFigure v2.0 is a major new release of the software package for quantitative analysis of image data. The research focuses on analyzing cells in intact, whole zebrafish embryos using 4d (xyzt) imaging, which tends to make automatic segmentation more difficult than with 2d or 2d+time imaging of cells in culture. This resource has developed an automatic segmentation pipeline that includes ICA-based channel unmixing, membrane/nuclear channel subtraction, Gaussian correlation, shape models, and level-set-based variational active contours. GoFigure was designed to meet the challenging requirements of in toto imaging, a technology under development in which the aim is to track all the cell movements and divisions that form structures during embryonic development of zebrafish and to quantitate protein expression and localization on top of this digital lineage. For in toto imaging, GoFigure uses zebrafish embryos in which the nuclei and cell membranes have been marked with two different color fluorescent proteins to allow cells to be segmented and tracked. A transgenic line in a third color can be used to mark protein expression and localization using a genetic approach developed by this resource, called FlipTraps, or using traditional transgenic approaches. Embryos are imaged using confocal or 2-photon microscopy to capture high-resolution xyzt image sets used for cell tracking. The GoFigure GUI will provide many tools for visualization and analysis of bioimages. Since fully automatic segmentation of cells is never perfect, GoFigure will provide easy-to-use tools for semi-automatically and manually adding, deleting, and editing traces in 2d (figures: xy, xz, or yz), 3d (meshes: xyz), 4d (tracks: xyzt) and 4d+cell division (lineages). GoFigure will also provide a number of views into complex image data sets, including 3d XYZ and XYT image views, tabular list views of traces, histograms, and scattergrams. Importantly, all these views will be linked together to allow users to explore their data from multiple angles. Data will be easily sorted and color-coded in many ways to explore correlations in higher-dimensional data. The GoFigure architecture is designed to allow additional segmentation, visualization, and analysis filters to be plugged in. Sponsors: GoFigure is developed by Harvard University. THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.
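As a simplified illustration of one stage in this kind of pipeline, the sketch below performs seeded watershed segmentation of a nuclear channel with scikit-image. It is not GoFigure's actual segmentation code (which adds ICA unmixing, channel subtraction, shape models, and level sets); the smoothing sigma and peak spacing are arbitrary placeholders.

```python
# A simplified illustration of one step in this kind of pipeline -- seeded watershed
# segmentation of a nuclear channel -- not GoFigure's actual code. Assumes a 3D (zyx)
# nuclear-channel volume in `nuclei`; requires numpy, scipy and scikit-image.

import numpy as np
from scipy import ndimage as ndi
from skimage.filters import gaussian, threshold_otsu
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

def segment_nuclei(nuclei, sigma=2.0, min_distance=5):
    smoothed = gaussian(nuclei, sigma=sigma)            # suppress noise
    mask = smoothed > threshold_otsu(smoothed)          # foreground (nuclei) voxels
    distance = ndi.distance_transform_edt(mask)         # distance to background
    peaks = peak_local_max(distance, min_distance=min_distance, threshold_abs=1)
    markers = np.zeros(mask.shape, dtype=int)
    markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)
    return watershed(-distance, markers, mask=mask)     # one integer label per nucleus
```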
Proper citation: Harvard Medical School, Department of Systems Biology: The Megason Lab - GoFigure Software (RRID:SCR_008037)
http://connectomics.org/viewer
Extensible, scriptable, Pythonic software tool for visualization and analysis in structural neuroimaging research on many spatial scales. Employing the Connectome File Format, diverse data such as networks, surfaces, volumes, tracks and metadata are handled and integrated. The field of connectomics research benefits from recent advances in structural neuroimaging technologies on all spatial scales, and the need for software tools to visualize and analyze the emerging data is urgent. The ConnectomeViewer application was developed to meet the needs of basic and clinical neuroscientists, as well as complex network scientists, providing an integrative, extensible platform to visualize and analyze connectomics data. With the Connectome File Format, interlinking different data types such as hierarchical networks, surface data and volumetric data is easy and may provide new ways of analyzing and interacting with data. Furthermore, ConnectomeViewer readily integrates with:
* ConnectomeWiki: a semantic knowledge base representing connectomics data at a mesoscale level across various species, allowing easy access to relevant literature and databases.
* ConnectomeDatabase: a repository to store and disseminate Connectome files.
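ConnectomeViewer is scriptable from Python. As a generic illustration of the kind of network analysis applied to connectome data, rather than ConnectomeViewer's own scripting API, the sketch below builds a weighted graph from a connectivity matrix with networkx and computes a few standard measures.

```python
# A generic illustration of the kind of network analysis applied to connectome data
# (this uses networkx, not ConnectomeViewer's own scripting API).

import numpy as np
import networkx as nx

def analyze_connectome(weights, region_names):
    """weights: square numpy array of connection strengths between brain regions."""
    graph = nx.from_numpy_array(weights)
    graph = nx.relabel_nodes(graph, dict(enumerate(region_names)))
    return {
        "degree": dict(graph.degree(weight="weight")),        # weighted node degree
        "clustering": nx.clustering(graph, weight="weight"),  # local clustering
        "density": nx.density(graph),                         # overall connectivity
    }

# Toy three-region example
w = np.array([[0, 1.0, 0.5], [1.0, 0, 0.2], [0.5, 0.2, 0]])
print(analyze_connectome(w, ["A", "B", "C"]))
```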
Proper citation: ConnectomeViewer: Multi-Modal Multi-Level Network Visualization and Analysis (RRID:SCR_008312)
The ARCHER project is built upon the prototype software developed by the DART (http://dart.edu.au) and ARROW (http://arrow.edu.au) projects to produce a robust set of software tools. These tools:
- may be customised to suit the needs of diverse research areas
- automate the collection and management of instrument-generated data
- enable the repository storage of research data and associated metadata
- enable collection and tagging of research data in a collaborative environment, and
- provide these capabilities in a secure end-to-end process.
ARCHER developed "production-ready" software tools, operating in a secure environment, to assist researchers to:
- collect, capture and retain large data sets from a range of different sources, including scientific instruments
- deposit data files and data sets to eResearch storage repositories
- populate these eResearch data repositories with associated metadata
- permit data set annotation and discussion in a collaborative environment, and
- support next-generation methods for research publication, dissemination and access.
Proper citation: Australian ResearCH Enabling enviRonment (RRID:SCR_008390)
http://www.affymetrix.com/support/developer/powertools/apt_archive.affx
Affymetrix Power Tools (APT) are a set of cross-platform command-line programs that implement algorithms for analyzing and working with Affymetrix GeneChip arrays. APT programs are intended for power users who prefer programs that can be used in scripting environments and are sophisticated enough to handle the complexity of extra features and functionality. APT provides a platform for developing and deploying new algorithms without waiting for GUI implementations. This resource is supported by Affymetrix, Inc.
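Because APT programs are ordinary command-line executables, they are easy to drive from scripts. The sketch below wraps apt-probeset-summarize (RMA summarization) from Python; the analysis string, CDF file and CEL paths are placeholders, and the exact flags should be checked against the APT documentation for the version you have installed.

```python
# A minimal sketch driving apt-probeset-summarize (RMA summarization) from a Python
# script via subprocess. The analysis string, CDF file and CEL paths are placeholders;
# check the flags against the APT documentation for your installed version.

import subprocess
from pathlib import Path

def run_rma(cel_dir, cdf_file, out_dir):
    cel_files = sorted(str(p) for p in Path(cel_dir).glob("*.CEL"))
    cmd = [
        "apt-probeset-summarize",
        "-a", "rma",      # analysis: RMA background correction, quantile norm, median polish
        "-d", cdf_file,   # chip description file for the array type
        "-o", out_dir,    # output directory for the summarized signal matrix
        *cel_files,
    ]
    subprocess.run(cmd, check=True)

# run_rma("cel/", "HG-U133_Plus_2.cdf", "apt_out/")  # hypothetical paths
```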
Proper citation: Affymetrix Power Tools (RRID:SCR_008401)
THIS RESOURCE IS NO LONGER IN SERVICE, documented August 23, 2016. Map improvement server that returns a bias minimized, 6-fold averaged map generated from a model and diffraction data (with optional preceding Molecular Replacement). It does not build or repair the model for you (yet). For automated model building, you need to install a local copy of CCP4 and ARP/wARP (aka wARP&Trace), RESOLVE, MAID, or TEXTAL.
Proper citation: TB Consortium Bias Removal Server (RRID:SCR_008425)
http://www.broad.mit.edu/cancer/software/genecluster2/gc2.html
THIS RESOURCE IS NO LONGER IN SERVICE, documented on July 17, 2013. A software package for analyzing gene expression and other bioarray data, giving users a variety of methods to build and evaluate class predictors, visualize marker lists, cluster data and validate results. GeneCluster 2.0 greatly expands the data analysis capabilities of GeneCluster 1.0 by adding supervised classification, gene selection, class discovery and permutation test methods. It includes algorithms for building and testing supervised models using weighted voting (WV) and k-nearest neighbor (KNN) algorithms, a module for systematically finding and evaluating clustering via self-organizing maps, and modules for marker gene selection and heat map visualization that allow users to view and sort samples and genes by many criteria. It enhances the clustering capabilities of GeneCluster 1.0 by adding a module for batch SOM clustering, and also includes a marker gene finder based on a KNN analysis and a visualization module. GeneCluster 2.0 is a stand-alone Java application and runs on any platform that supports the Java Runtime Environment version 1.3.1 or greater.
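As a generic, present-day illustration of the kind of supervised KNN class prediction and marker gene selection GeneCluster 2.0 performs, the sketch below uses scikit-learn on a toy expression matrix. It is not GeneCluster's Java implementation, and the gene counts and neighbor settings are arbitrary.

```python
# A generic illustration of KNN class prediction with marker gene selection on an
# expression matrix (scikit-learn, not GeneCluster's own Java implementation).

import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 500))   # 40 samples x 500 genes (toy data)
y = np.repeat([0, 1], 20)        # two phenotype classes

# marker gene selection followed by a k-nearest-neighbor class predictor
model = make_pipeline(SelectKBest(f_classif, k=20), KNeighborsClassifier(n_neighbors=3))
scores = cross_val_score(model, X, y, cv=5)
print("cross-validated accuracy:", scores.mean())
```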
Proper citation: GeneCluster 2: An Advanced Toolset for Bioarray Analysis (RRID:SCR_008446)
http://www.biobankcentral.org/resource/wwibb.php
THIS RESOURCE IS NO LONGER IN SERVICE, documented on March 27, 2013. Web-based portal to connect all the constituencies in the global biobank community. The project seeks to increase the transparency and accessibility of the scientific research process by connecting researchers with an additional source of funding - microinvestments received from the broader online community. In exchange for these public investments, researchers will maintain research logs detailing the play-by-play progress made in their project, as well as publish all of their data in a public database under a science commons license. These research projects, in turn, will serve to continually update a research-based, neuroscience-based human brain and body curriculum. Biobanks are the meeting point of two major information trends in biomedical research: the generation of huge amounts of genomic and other laboratory data, and the electronic capture and integration of patient clinical records. They comprise large numbers of human biospecimens supplemented with clinical data. Biobanks, when implemented effectively, can harness the power of both genomic and clinical data and serve as a critical bridge between basic and applied research, linking laboratory to patient and getting to cures faster. As science and technology leaders work to address the many challenges facing U.S. biobanks (logistical, technical, ethical, financial, intellectual property, and IT), BioBank Central will serve as an accurate and timely source of knowledge and news about biorepositories and their role in research and drug development. The Web site also provides a working group venue, patient and public education programs, and a forum for international collaboration and harmonization of best practices.
Proper citation: BioBank Central (RRID:SCR_008645)
http://rgd.mcw.edu/rgdCuration/?module=portal&func=show&name=nuro
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on May 12, 2023. Portal that provides researchers with easy access to data on rat genes, QTLs, strain models, biological processes and pathways related to neurological diseases. This resource also includes dynamic data analysis tools.
Proper citation: Rat Genome Database: Neurological Disease Portal (RRID:SCR_008685)
Software tool for data sharing, incorporating blogs and spatial registration of data. Mainly used with geological data sets. This Virtual Research Environment (VRE) aims to combine the capabilities of two existing technologies that have already seen wide adoption among scientists:
- The Godiva2 data visualization system provides a means for scientists to browse interactively, in a "Google Maps-like" fashion, through large environmental datasets, including numerical model outputs and high-resolution satellite imagery, using only a web browser.
- The LabBlog is a web-based blogging tool specifically designed for the practising scientist to record, disseminate and evaluate their research. The blog can also be used as a collaboration tool that allows secure discussion between colleagues. Although initially designed for laboratory chemists, the LabBlog is being adapted in this project to meet the needs of environmental scientists.
The BlogMyData VRE will allow scientists to explore data visually using Godiva2, then make comments about features in the data on a blog. Colleagues will discover these blog entries and offer further information, providing answers to research questions through comments. Through RSS and GeoRSS feeds, colleagues, investigators and other interested parties can be notified of research activity, and scientists can discover hitherto-unknown colleagues working with similar data in similar geographic regions. Sponsors: BlogMyData is a collaboration between the Reading e-Science Centre and the University of Southampton and is one of the JISC VRERI projects.
Proper citation: BlogMyData (RRID:SCR_008697)
http://openii.sourceforge.net/
OpenII (pronounced open-eye-eye) is a freely downloadable, open source information integration (II) tool suite. It includes 1) an extensible, plug-and-play platform for II tools and 2) several tools that assist with common integration tasks, including fully- or semi-automated support in the following scenarios:
- An integration engineer building a data warehouse must determine how diverse component data schemas map to the schema of the warehouse.
- An XML document that conforms to one schema needs to be converted into an equivalent document that conforms to a second (different) schema.
- To support data exchanges, a community needs to create a shared data model based on the models of its members. When a new member joins, the community needs to identify promising data exchange partners and determine to what extent its shared model needs to be extended. Similarly, a chief information officer must identify data integration opportunities and make level-of-effort estimates after an acquisition or merger.
To support these scenarios, OpenII provides a schema repository into which diverse data models can easily be imported. It also provides tools that 1) assist with identifying semantic correspondences across data models (Harmony), 2) compare a set of data models against a common reference model (Proximity), 3) visually organize a set of data models into clusters of related data models (Affinity), and 4) establish a common data model for a set of inter-related data models (Unity). Why should you use OpenII? Here are some reasons:
- OpenII is the only open-source platform for information integration tools. OpenII and its source code are freely available under the Apache 2.0 license, so you are free to borrow, extend or resell any portion of OpenII.
- The OpenII schema and mapping repository is based on a neutral modeling language, so all of the OpenII tools can be used regardless of the underlying modeling language. For example, Harmony can identify correspondences among an XML schema, a relational database, and an OWL ontology. By comparison, most commercial tools are tied to a particular modeling language.
- OpenII is based on the Eclipse framework, so the environment is already familiar to many programmers. Non-programmers can choose, instead, to use OpenII off-the-shelf without needing to first install Eclipse.
- OpenII is fully extensible. If needed components do not exist, they can be readily added. For example, adding a new importer or exporter is a straightforward task that can be completed in only a few hours. Moreover, each of the tools supports the introduction of new algorithms, and programmers familiar with the Eclipse environment can add new views with moderate effort.
Sponsors: This resource is supported by the MITRE Corporation.
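To illustrate the schema-matching problem that Harmony addresses, the toy sketch below scores candidate correspondences between two attribute lists by lexical similarity. This is not OpenII's matching algorithm, which combines several matchers and keeps the integration engineer in the loop; the schema names and threshold are made-up examples.

```python
# A toy illustration of the schema-matching task Harmony addresses: scoring candidate
# correspondences between two attribute lists by lexical similarity. This is NOT
# OpenII's algorithm; the schema names and threshold are made-up examples.

from difflib import SequenceMatcher
from itertools import product

def candidate_matches(schema_a, schema_b, threshold=0.5):
    """Return (attribute_a, attribute_b, score) triples above the similarity threshold."""
    matches = []
    for a, b in product(schema_a, schema_b):
        score = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if score >= threshold:
            matches.append((a, b, round(score, 2)))
    return sorted(matches, key=lambda m: -m[2])

warehouse = ["patient_id", "birth_date", "diagnosis_code"]
source = ["PatientID", "DateOfBirth", "DxCode"]
print(candidate_matches(warehouse, source))
```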
Proper citation: Open Information Integration (RRID:SCR_008699)
THIS RESOURCE IS NO LONGER IN SERVICE, documented on July 16, 2013. The International Observatory on Neuro-Information is a central source of knowledge, research and data on the skills and issues involved in applying neuroscience to the information sciences. It is an initiative of the Documentation Sciences Foundation, based in Spain, which aims to gather information, express opinions, prepare documents, conduct comparative research, support and promote policy-making, evaluate trends, and take other appropriate action relating to neuroscience and its application to the information sciences (libraries, archives, documentation centers), and to how the traditional knowledge of the information sciences can bring expertise in data visualization and retrieval techniques, records management, quality assurance and usability to neuroscience. The Observatory may work together, or in agreement, with other national or international organizations pursuing similar or compatible aims.
Proper citation: International Observatory on Neuro-Information (RRID:SCR_008690)
https://t1dexchange.org/pages/
Provides access to resources T1D researchers need to conduct clinical studies. Data sets from their clinic registry are openly available, as are new study results. They also offer use of the T1D Discovery Tool, which allows users to search different fields from registry data, and the T1D Exchange Biobank, which offers specimen types such as serum, plasma, white blood cells, DNA, and RNA.
Proper citation: T1D Exchange (RRID:SCR_014532)
http://factominer.free.fr/index.html
R software package for multivariate analysis that takes into account different types of data structure. Data can be organized into groups of variables, groups of individuals, or a hierarchy of variables.
Proper citation: FactoMineR (RRID:SCR_014602)
http://www.heka.com/downloads/downloads_main.html#down_tida
Software used to acquire physiological data from HEKA patch clamp amplifiers and HEKA interfaces.
Proper citation: TIDA (RRID:SCR_014582)
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on November 13, 2025. Facility that provides database development and management and bioinformatic network building services by utilizing on-site hardware and software. Three members of the facility are available to assist researchers with advanced bioinformatics and biostatistics analysis to help put data into biological context across various disease areas, create testable hypotheses, and understand the biology of the process. The bulk of support includes connecting functional genomic data with pathways and networks, connecting gene/protein expression and disease state, and consultations on statistical aspects of the research with the team statistician.
Proper citation: Sanford Burnham Prebys Medical Discovery Institute Bioinformatics and Data Management Facility (RRID:SCR_014868)
http://www.oas.samhsa.gov/nsduh.htm
NSDUH is the primary source of statistical information on the use of illegal drugs, alcohol, and tobacco by the U.S. civilian, noninstitutionalized population aged 12 or older. Conducted by the Federal Government since 1971, the survey collects data through face-to-face interviews with a representative sample of the population at the respondent's place of residence. Correlates in OAS reports include the following: age, gender, pregnancy status, race/ethnicity, education, employment, geographic area, frequency of use, and association with alcohol, tobacco, and illegal drug use. NSDUH collects information from residents of households and noninstitutional group quarters (e.g., shelters, rooming houses, dormitories) and from civilians living on military bases. The survey excludes homeless persons who do not use shelters, military personnel on active duty, and residents of institutional group quarters, such as jails and hospitals. Most of the questions are administered with audio computer-assisted self-interviewing (ACASI). ACASI is designed to provide the respondent with a highly private and confidential mode for responding to questions in order to increase the level of honest reporting of illicit drug use and other sensitive behaviors. Less sensitive items are administered by interviewers using computer-assisted personal interviewing (CAPI). The 2010 NSDUH employed a State-based design with an independent, multistage area probability sample within each State and the District of Columbia. The eight States with the largest population (which together account for about half of the total U.S. population aged 12 or older) were designated as large sample States (California, Florida, Illinois, Michigan, New York, Ohio, Pennsylvania, and Texas) and had a sample size of about 3,600 each. For the remaining 42 States and the District of Columbia, the sample size was about 900 per State. The design oversampled youths and young adults; each State's sample was approximately equally distributed among three age groups: 12 to 17 years, 18 to 25 years, and 26 years or older.
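For orientation, the stated per-state targets imply a national sample on the order of 67,500 respondents; a quick check of that arithmetic (target sizes only, not completed interviews):

```python
# Back-of-the-envelope check of the sample-size targets described above
# (approximate targets, not the number of completed interviews).
large_states = 8 * 3600   # CA, FL, IL, MI, NY, OH, PA, TX at ~3,600 each
other_states = 43 * 900   # remaining 42 states plus the District of Columbia at ~900 each
print(large_states + other_states)  # ~67,500 respondents nationwide
```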
Proper citation: National Survey on Drug Use and Health (RRID:SCR_007031)
http://physionet.org/physiobank/
Archive of well-characterized digital recordings of physiologic signals and related data for use by the biomedical research community. PhysioBank currently includes databases of multi-parameter cardiopulmonary, neural, and other biomedical signals from healthy subjects and patients with a variety of conditions with major public health implications, including sudden cardiac death, congestive heart failure, epilepsy, gait disorders, sleep apnea, and aging. The PhysioBank Archives now contain over 700 gigabytes of data that may be freely downloaded. PhysioNet is seeking contributions of data sets that can be made freely available in PhysioBank. Contributions of digitized and anonymized (deidentified) physiologic signals and time series of all types are welcome. If you have a data set that may be suitable, please review PhysioNet's guidelines for contributors and contact them.
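PhysioBank records can also be read programmatically with the open-source wfdb Python package distributed by PhysioNet. A minimal sketch, assuming network access, that pulls the first ten seconds of record 100 from the MIT-BIH Arrhythmia Database (pn_dir 'mitdb'):

```python
# Reading a PhysioBank record over the network with the wfdb Python package.
# Record 100 and the 'mitdb' directory refer to the MIT-BIH Arrhythmia Database;
# 3600 samples is ten seconds at that database's 360 Hz sampling rate.

import wfdb

record = wfdb.rdrecord("100", pn_dir="mitdb", sampto=3600)   # signal data
ann = wfdb.rdann("100", "atr", pn_dir="mitdb", sampto=3600)  # reference beat annotations

print(record.sig_name, record.fs)   # channel names and sampling frequency
print(record.p_signal.shape)        # (samples, channels) array in physical units
print(len(ann.sample), "annotated beats in the first ten seconds")
```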
Proper citation: Physiobank (RRID:SCR_006949)
This project encompasses development of novel biological network analysis methods and infrastructure for querying biological data in a semantically enabled format, and aims to create a semantic interactome model. Research within the BioMANTA project focuses on computational modelling and analysis, primarily using Semantic Web technologies and Machine Learning methods, of large-scale protein-protein interaction and compound activity networks across a wide variety of species. A range of information such as kinetic activity, tissue expression, subcellular localization and disease state attributes will be included in the resulting data model. Protein interactions are a fundamental component of biological processes. Many proteins are functional only in multimeric complexes, or require interaction partners to achieve their correct localisation or function. For this reason, the study of protein-protein interaction (PPI) networks has become an area of growing interest in computational biology. Through the use of Semantic Web technologies such as the Resource Description Framework (RDF) and the Web Ontology Language (OWL), interaction data are modelled to create a knowledge representation in which meaning is vested in the ontology rather than in instances of data. Stochastic and computational intelligence methods are applied to these data to infer high-coverage networks, and semantic inferencing is used to infer previously unknown and meaningful pathways. Major project components:
- The BioMANTA Ontology: an OWL DL ontology incorporating the PSI-MI Ontology, the NCBI Taxonomy, and elements of the BioPAX ontology and the Gene Ontology (describing subcellular localisation). This allows existing ontologies to be re-used, reducing the overheads associated with knowledge acquisition during ontology development, and makes it possible to integrate existing public data annotated in these formats.
- Data conversion and semantic protein integration: a set of software components that convert protein-protein interaction databases (DIP, MPact, IntAct, etc.) from PSI-MI XML to RDF compliant with the BioMANTA ontology. These components make the interaction datasets (and, more generally, any PSI-MI XML data) semantically available for querying and inference within BioMANTA.
- An RDF triple store based on RDF molecules and the MapReduce architecture: a proof-of-concept RDF triple store using RDF molecules and Hadoop scale-out architectures. Regular RDF graphs are deconstructed into RDF molecules, which are distributed over compute nodes in the MapReduce architecture and subsequently recombined to form equivalent RDF graphs. This approach makes distributed SPARQL querying and reasoning over RDF triple stores possible.
- A quantitative framework to integrate networks extracted from independent data sources (gene expression, subcellular localization, and ortholog mapping): the model is multi-layer, with a first layer based on decision trees, one built on each dataset independently. Tree nodes are split using Shannon's entropy (mutual information); the decisions of the independent trees are integrated using logistic regression, and the parameters are optimised using maximum likelihood.
Sponsors: This resource is supported by Pfizer Global Research and Development, the Institute for Molecular Bioscience (IMB), and the University of Queensland, Australia.
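As a generic illustration of the kind of SPARQL querying that RDF-encoded interaction data supports, the sketch below uses rdflib on a tiny in-memory graph. The ex: predicate names and triples are hypothetical placeholders, not BioMANTA ontology terms, and this is not the project's Hadoop-backed triple store.

```python
# A generic rdflib sketch of SPARQL querying over RDF-encoded protein-protein
# interaction data. The ex: namespace, predicates and triples are hypothetical
# placeholders, not BioMANTA ontology terms or its Hadoop-backed triple store.

from rdflib import Graph, Namespace, Literal

EX = Namespace("http://example.org/ppi/")  # placeholder namespace
g = Graph()
g.add((EX.P53, EX.interactsWith, EX.MDM2))
g.add((EX.P53, EX.subcellularLocation, Literal("nucleus")))
g.add((EX.MDM2, EX.subcellularLocation, Literal("nucleus")))

# Find interaction partners that share a subcellular location with P53
query = """
PREFIX ex: <http://example.org/ppi/>
SELECT ?partner ?loc WHERE {
    ex:P53 ex:interactsWith ?partner .
    ex:P53 ex:subcellularLocation ?loc .
    ?partner ex:subcellularLocation ?loc .
}
"""
for partner, loc in g.query(query):
    print(partner, loc)
```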
Proper citation: BioMANTA (RRID:SCR_007177)