SciCrunch Registry is a curated repository of scientific resources, with a focus on biomedical resources, including tools, databases, and core facilities - visit SciCrunch to register your resource.
https://code.google.com/p/simrare/
Stand-alone executable software with a user-friendly graphical interface, implemented in Python/C++, for rare variant association studies. It is designed as a unified simulation framework for evaluating association methods, including novel methods, in an unbiased and easy manner across a broad range of biological contexts. It consists of three modules: a variant data simulator, a genotype/phenotype generator, and an association method evaluator. SimRare generates variant data for gene regions using forward-time simulation, which incorporates realistic population demographic and evolutionary scenarios. For phenotype data, it can generate both case-control and quantitative traits. The phenotypic effects of variants can be detrimental, protective, or non-causal. SimRare has a graphical user interface that allows easy entry of genetic and phenotypic parameters. Simulated data can be written to external files in a standard format. Novel association methods implemented in R can be imported into SimRare, which has built-in functions to evaluate the performance of a new method and compare it visually with currently available ones in an unbiased manner.
Proper citation: SimRare (RRID:SCR_005226)
https://github.com/ruping/Breakpointer
A fast tool for locating sequence breakpoints from the alignment of single-end (SE) reads produced by next-generation sequencing (NGS). It uses a heuristic method to search for local mapping signatures created by insertions/deletions (indels) or more complex structural variants (SVs). With current NGS single-end sequencing data, the regions output by Breakpointer mainly contain the approximate breakpoints of indels and a limited number of large SVs. Notably, Breakpointer can uncover breakpoints of insertions that are longer than the read length. Breakpointer can also find breakpoints of many variants located in repetitive regions. The regions can be used not only as extra support for SV predictions by other tools (such as those using the split-read method), but also as a database for searching for variants that might be missed by other tools. Breakpointer is a command-line tool that runs under Linux. Breakpointer takes advantage of two local mapping features of single-end reads that arise as a consequence of indels/SVs: 1) non-uniform read distribution (depth skewness) and 2) misalignments at the boundaries of indels/SVs. These features are summarized as the breakpoint signature. Breakpointer proceeds in three stages to capture this signature. It is implemented in C++ and Perl. Input is the file or files containing alignments of single-end reads against a reference genome (in .BAM format). Output is the predicted regions containing potential breakpoints of SVs (in .GFF format). To read .BAM files, Breakpointer requires the bamtools API, which users should install beforehand.
Proper citation: Breakpointer (RRID:SCR_005254)
http://www.genoscope.cns.fr/externe/gmorse/
Software that uses RNA-Seq short reads to build de novo gene models. First, candidate exons are built directly from the positions of the reads mapped on the genome (without any ab initio assembly of the reads), and all the possible splice junctions between those exons are tested against unmapped reads. The testing of junctions is directed by the information available in the RNA-Seq dataset rather than by a priori knowledge about the genome. Exons can thus be chained into stranded gene models.
Proper citation: G-Mo.R-Se (RRID:SCR_005273)
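The first step in the G-Mo.R-Se entry above, building candidate exons from the positions of mapped reads, can be illustrated with a minimal sketch. It assumes that a candidate exon is simply a maximal stretch of the genome covered by overlapping or adjacent reads; the function name and the half-open interval convention are hypothetical and are not taken from G-Mo.R-Se itself.

```python
from typing import List, Tuple

def merge_read_intervals(reads: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Merge half-open read intervals [start, end) into covered segments.

    Assumption for illustration: each merged segment is treated as one
    candidate exon. This is not G-Mo.R-Se's actual procedure.
    """
    merged: List[List[int]] = []
    for start, end in sorted(reads):
        if merged and start <= merged[-1][1]:
            # Read overlaps or touches the current segment: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            # Gap in coverage: start a new segment.
            merged.append([start, end])
    return [(s, e) for s, e in merged]

candidate_exons = merge_read_intervals(
    [(100, 150), (120, 170), (400, 450), (160, 210)]
)
print(candidate_exons)
```

A real pipeline would also apply a coverage threshold and then test junctions between segments against unmapped reads, as the entry describes.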
Software mining pipeline guided by a Bayesian principle to detect single nucleotide polymorphisms, insertions, and deletions by comparing high-throughput pyrosequencing reads with a reference genome of related organisms. The pipeline has been extended to identify and visualize large structural variations, including insertions, deletions, inversions, and translocations.
Proper citation: inGAP (RRID:SCR_005261)
http://sv.gersteinlab.org/pemer/
Software package that provides a computational framework with simulation-based error models for inferring genomic structural variants from massive paired-end sequencing data. The package is composed of three modules: PEMer workflow, SV-Simulation, and BreakDB. PEMer workflow is sensitive software for detecting SVs from paired-end sequence reads. SV-Simulation randomly introduces SVs into a given genome and generates simulated paired-end reads from the novel genome.
Proper citation: PEMer (RRID:SCR_005263)
https://code.google.com/p/phantompeakqualtools/
Software package that computes quick but highly informative enrichment and quality measures for ChIP-seq/DNase-seq/FAIRE-seq/MNase-seq data. It can also be used to obtain robust estimates of the predominant fragment length or characteristic tag shift values in these assays.
Proper citation: phantompeakqualtools (RRID:SCR_005331)
http://research.cs.wisc.edu/wham/
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on February 28, 2023. High-throughput sequence alignment tool that aligns short DNA sequences (reads) to the whole human genome at a rate of over 1500 million 60 bp reads per hour, which is one to two orders of magnitude faster than the leading state-of-the-art techniques. Feature list for the current version (v 0.1.5) of WHAM: * Supports paired-end reads * Supports up to 5 errors * Supports alignments with gaps * Supports quality scores for filtering invalid alignments and sorting valid alignments * Finds ALL valid alignments * Supports multi-threading * Supports rich reporting modes * Supports SAM format output
Proper citation: WHAM (RRID:SCR_005497)
http://www-personal.umich.edu/~jianghui/seqmap/
A software tool for mapping large numbers of oligonucleotides to the genome. It is designed to find all the places in a genome where an oligonucleotide could potentially come from. SeqMap can efficiently map tens of millions of short sequences to a genome of several billion nucleotides. During mapping, several mutations as well as insertions/deletions of nucleotide bases in the sequences can be tolerated and, furthermore, detected. Various input and output formats are supported, as well as many command-line options for tuning almost every step in the mapping process. A typical mapping can be done in a few hours on an ordinary PC.
Proper citation: SeqMap (RRID:SCR_005495)
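The idea in the SeqMap entry above, finding every genome position an oligonucleotide could come from while tolerating a few mutations, can be illustrated with a brute-force sketch. It handles substitutions only (no insertions/deletions) and scans every window, so it shows the matching criterion rather than SeqMap's own indexing strategy; the function name and parameters are hypothetical.

```python
from typing import List, Tuple

def find_matches(genome: str, oligo: str, max_mismatches: int) -> List[Tuple[int, int]]:
    """Return (position, mismatch_count) for every window of the genome
    that matches the oligo with at most max_mismatches substitutions.

    Illustration only: substitutions are counted, indels are not handled.
    """
    hits: List[Tuple[int, int]] = []
    n = len(oligo)
    for pos in range(len(genome) - n + 1):
        mismatches = 0
        for g_base, o_base in zip(genome[pos:pos + n], oligo):
            if g_base != o_base:
                mismatches += 1
                if mismatches > max_mismatches:
                    break  # too many mismatches, abandon this window
        else:
            # Inner loop finished without break: window is a hit.
            hits.append((pos, mismatches))
    return hits

hits = find_matches("ACGTTACGAAACGT", "ACGT", max_mismatches=1)
print(hits)
```

This scan costs time proportional to genome length times oligo length per query, which is why practical mappers build an index first.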
http://www.genome.umd.edu/jellyfish.html
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025. A software tool for fast, memory-efficient counting of k-mers in DNA. A k-mer is a substring of length k, and counting the occurrences of all such substrings is a central step in many analyses of DNA sequence. Jellyfish counts k-mers quickly by using an efficient encoding of a hash table and by exploiting the compare-and-swap CPU instruction to increase parallelism. Jellyfish is a command-line program that reads FASTA and multi-FASTA files containing DNA sequences. It outputs its k-mer counts in a binary format, which can be translated into a human-readable text format using the jellyfish dump command.
Proper citation: Jellyfish (RRID:SCR_005491)
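The k-mer definition in the Jellyfish entry above can be made concrete with a short sketch. It uses an ordinary single-threaded Python `Counter`, not Jellyfish's compact hash table or compare-and-swap parallelism, and the function name is hypothetical; skipping windows that contain non-ACGT characters is an assumption made for this example.

```python
from collections import Counter

def count_kmers(sequence: str, k: int) -> Counter:
    """Count every substring of length k in a DNA sequence.

    Illustration of the definition only; windows containing characters
    other than A, C, G, T (for example N) are skipped by assumption.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    counts: Counter = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts

counts = count_kmers("ACGTACGT", 3)
for kmer, n in sorted(counts.items()):
    print(kmer, n)
```

A sequence of length n contains n - k + 1 windows, so the eight-base example yields six 3-mers in total.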
http://mrfast.sourceforge.net/
Software designed to map short reads generated with the Illumina platform to reference genome assemblies in a fast and memory-efficient manner. Currently Supported Features: * Output in SAM format * Indels up to 8 bp (4 bp deletions and 4 bp insertions) * Paired-end mapping ** Discordant option to generate mapping file ready for VariationHunter to detect structural variants. * One end anchored (OEA) map locations for novel sequence insertion detection with NovelSeq * Matepair library mapping (long inserts with RF orientation). Planned Features: * Multithreading
Proper citation: mrFAST (RRID:SCR_005487)
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on May 3, 2023. A software program designed to accurately map sequence data obtained from next-generation sequencing machines (specifically Solexa/Illumina) back to a genome of any size. By using the posterior probability of mapping a given read to a specific genomic location, it accounts for repetitive reads by distributing them across several regions in the genome. In addition, the output of the program is created in such a way that it can be easily viewed with other free and readily available programs. Several benchmark data sets were created with spiked-in duplicate regions, and GNUMAP was able to account for these duplicate regions more accurately.
Proper citation: GNUMAP (RRID:SCR_005482)
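The idea in the GNUMAP entry above, distributing a repetitive read across several candidate locations according to posterior probability, can be sketched as follows. The sketch assumes a uniform prior over candidate locations, so each posterior is just the alignment likelihood divided by the sum over candidates; the function names, location labels, and likelihood values are hypothetical and do not reflect GNUMAP's actual scoring model.

```python
from collections import defaultdict
from typing import Dict

def posterior_weights(likelihoods: Dict[str, float]) -> Dict[str, float]:
    """Normalise per-location alignment likelihoods into posteriors.

    Assumes a uniform prior over the candidate locations.
    """
    total = sum(likelihoods.values())
    if total <= 0:
        raise ValueError("at least one likelihood must be positive")
    return {loc: lik / total for loc, lik in likelihoods.items()}

def add_read(coverage: Dict[str, float], likelihoods: Dict[str, float]) -> None:
    """Add one read to the coverage table as fractional counts."""
    for loc, weight in posterior_weights(likelihoods).items():
        coverage[loc] += weight

coverage: Dict[str, float] = defaultdict(float)
# A repetitive read that aligns to two places, one three times more likely.
add_read(coverage, {"chr1:100": 3.0, "chr5:2000": 1.0})
# A unique read that aligns to a single place.
add_read(coverage, {"chr1:100": 1.0})
print(dict(coverage))
```

Each read contributes a total weight of one, so a repetitive read is neither discarded nor counted in full at every location.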
http://samstat.sourceforge.net/
C software program for displaying sequence statistics for next generation sequencing. Works with large fasta, fastq and SAM/BAM files.
Proper citation: SAMStat (RRID:SCR_005432)
http://dna.leeds.ac.uk/methylviewer/
A simple integrated software tool for handling MAP (methyltransferase accessibility protocol) and MAP-IT (MAP individual templates) footprinting projects. It can process sequence data (*.txt, *.ab1 and *.scf) derived from the use of up to four different DNA methyltransferases.
Proper citation: MethylViewer (RRID:SCR_005448)
http://epigenome.usc.edu/publicationdata/bissnp2011/
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025. A software package based on the Genome Analysis Toolkit (GATK) map-reduce framework for genotyping and accurate DNA methylation calling in bisulfite-treated massively parallel sequencing (Bisulfite-seq, NOMe-seq, RRBS, and any other bisulfite-treated sequencing) with the Illumina directional library protocol. It has the following key features: * Calls and summarizes methylation for any cytosine context provided (CpG, CHH, CHG, GCH, etc.); * Works for single-end and paired-end data; * Accurate variant detection, with base quality recalibration and indel calling in bisulfite sequencing; * Based on a Java map-reduce framework, allowing multi-threaded computing; cross-platform; * Multiple output formats: detailed VCF files, a CpG haplotype reads file for mono-allelic methylation analysis, and simplified bedGraph, wig, and bed formats for visualization in the UCSC genome browser and the IGV browser. Bis-SNP uses Bayesian inference with locus-specific methylation probabilities and the bisulfite conversion rate of different cytosine contexts (not only CpG, CHH, and CHG in Bisulfite-seq, but also GCH etc. in other bisulfite-treated sequencing) to determine genotypes and methylation levels simultaneously.
Proper citation: Bis-SNP (RRID:SCR_005439)
http://code.google.com/p/distmap/
A user-friendly software pipeline designed to map short reads in a MapReduce framework on a local Hadoop cluster. It is designed to be easily implemented by researchers who do not have expert knowledge of bioinformatics. As it does not have any dependencies, it provides full flexibility and control to the user. The user can use any version of a compatible mapper and any reference genome assembly. There is no need to maintain the mapper, reference or DistMap source code on each of the slaves (nodes) in the Hadoop cluster, making maintenance extremely easy.
Proper citation: DistMap (RRID:SCR_005473)
http://www.well.ox.ac.uk/project-stampy
A software package for mapping short reads from Illumina sequencing machines onto a reference genome. It is recommended for most workflows, including those for genomic resequencing, RNA-Seq, and ChIP-seq. Stampy excels at mapping reads that contain sequence variation relative to the reference, in particular those containing insertions or deletions. For instance, it can map reads from a highly divergent species to a reference genome. Stampy achieves high sensitivity and speed by using a fast hashing algorithm and a detailed statistical model. Stampy has the following features: * Maps single, paired-end and mate pair Illumina reads to a reference genome * Fast: about 20 Gbase per hour in hybrid mode (using BWA) * Low memory footprint: 2.7 Gb shared memory for a 3Gbase genome * High sensitivity for indels and divergent reads, up to 10-15% * Low mapping bias for reads with SNPs * Well calibrated mapping quality scores * Input: Fastq and Fasta; gzipped or plain * Output: SAM, Maq's map file * Optionally calculates per-base alignment posteriors * Optionally processes part of the input * Handles reads of up to 4500 bases
Proper citation: Stampy (RRID:SCR_005504)
http://ngsview.sourceforge.net/
A generally applicable, flexible and extensible next-generation sequence alignment editor. The software allows for visualization and manipulation of millions of sequences simultaneously on a desktop computer, through a graphical interface.
Proper citation: NGSView (RRID:SCR_005637)
http://www.bioinformatics.babraham.ac.uk/projects/hicup/
A tool for mapping and performing quality control on Hi-C data.
Proper citation: HiCUP (RRID:SCR_005569)
An alignment, junction calling, and feature quantification pipeline specifically designed for Illumina RNA-Seq data.
Proper citation: RUM (RRID:SCR_008818)
https://github.com/armintoepfer/QuasiRecomb/releases
A jumping hidden Markov model that describes the generation of the viral quasispecies and a method to infer its parameters by analysing next generation sequencing data.
Proper citation: QuasiRecomb (RRID:SCR_008812)