SciCrunch Registry is a curated repository of scientific resources, with a focus on biomedical resources, including tools, databases, and core facilities - visit SciCrunch to register your resource.
http://cortexassembler.sourceforge.net/index_cortex_var.html
A tool for genome assembly and variation analysis from sequence data. You can use it to discover and genotype variants on single or multiple haploid or diploid samples. If you have multiple samples, you can use Cortex to look specifically for variants that distinguish one set of samples (e.g. phenotype=X, cases, parents, tumour) from another set of samples (e.g. phenotype=Y, controls, child, normal). cortex_var features:
* Variant discovery by de novo assembly, with no reference genome required
* Support for multicoloured de Bruijn graphs: load multiple samples into the same graph in different colours and find variants that distinguish them
* Calling of SNPs, indels, inversions, complex variants, and small haplotypes
* Highly accurate variant calling: the authors' paper reports base-pair-resolution validation of entire alleles (rather than just breakpoints) of SNPs, indels and complex variants by comparison with fully sequenced (and finished) fosmids. The authors describe this as a level of validation beyond that demanded of any other variant caller they are aware of, and state that cortex_var is currently the most accurate variant caller for indels and complex variants.
* Alignment of a reference genome to a graph, which can then be used to call variants
* Support for comparing cases/controls or phenotyped strains
* Typical memory use: 1 high-coverage human in under 80 GB of RAM, 1000 yeasts in under 64 GB of RAM, 10 humans in under 256 GB of RAM
Proper citation: cortex var (RRID:SCR_005081)
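The multicoloured graph idea in the cortex_var entry can be illustrated with a minimal sketch. It is not cortex_var's implementation: it only records which sample colours each k-mer occurs in, ignores graph edges, reverse complements and sequencing errors, and all function names and the input format are hypothetical.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every k-mer of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_coloured_graph(samples, k):
    """Map each k-mer to the set of sample colours in which it occurs.

    samples: dict of colour name -> list of read strings
             (a hypothetical input format chosen for this sketch).
    """
    graph = defaultdict(set)
    for colour, reads in samples.items():
        for read in reads:
            for kmer in kmers(read, k):
                graph[kmer].add(colour)
    return graph


def private_kmers(graph, colour):
    """Return, sorted, the k-mers seen only in the given colour."""
    return sorted(kmer for kmer, colours in graph.items() if colours == {colour})


# Two toy samples that differ by a single base (T versus A at position 5).
samples = {
    "case": ["ACGTTGCA"],
    "control": ["ACGTAGCA"],
}
graph = build_coloured_graph(samples, k=4)
print(private_kmers(graph, "case"))
print(private_kmers(graph, "control"))
```

The k-mers private to one colour cluster around the differing base, which is the kind of signal a colour-aware variant caller would follow up.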
http://www.physics.rutgers.edu/~anirvans/SOPRA/
Software tool to exploit the mate pair/paired-end information for assembly of short reads from high throughput sequencing platforms, e.g. Illumina and SOLiD.
Proper citation: SOPRA (RRID:SCR_005035)
http://www.baseclear.com/landingpages/basetools-a-wide-range-of-bioinformatics-solutions/sspacev12/
A stand-alone software program for scaffolding pre-assembled contigs using paired-read data. Main features are: a short runtime, multiple library input of paired-end and/or mate pair datasets and possible contig extension with unmapped sequence reads.
Proper citation: SSPACE (RRID:SCR_005056)
http://meringlab.org/software/hpc-clust/
A set of tools designed to cluster large numbers (>1 million) of pre-aligned nucleotide sequences. It clusters sequences using the Hierarchical Clustering Algorithm (HCA). Three cluster metrics are currently implemented: single-linkage, complete-linkage, and average-linkage. Four sequence distance functions are also implemented: identity (gap-gap counting as a match), nogap (gap-gap being ignored), nogap-single (like nogap, but consecutive gap-nogap positions count as a single mismatch), and tamura (distance is calculated with the knowledge that transitions are more likely than transversions). One advantage of HCA over other algorithms is that, instead of producing only the clustering at a given threshold, it produces the set of merges occurring at each threshold. The clusters can then be reported very quickly for any threshold with little extra computation. This approach also makes it possible to plot how the number of clusters varies with the clustering threshold without running the clustering independently for each threshold. Another feature of the HPC-CLUST implementation is that the single-, complete-, and average-linkage clusterings can be computed in a single run with little overhead.
Proper citation: HPC-CLUST (RRID:SCR_005052)
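The "set of merges" idea in the HPC-CLUST entry can be sketched as follows. This is a toy illustration, not HPC-CLUST's code: it covers only the nogap distance and single-linkage, compares all pairs (so it would not scale to millions of sequences), and every function name is an assumption made for the example.

```python
from itertools import combinations


def nogap_distance(a, b):
    """Fraction of mismatching columns, skipping columns where both have a gap."""
    cols = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not cols:
        return 0.0
    return sum(1 for x, y in cols if x != y) / len(cols)


def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def single_linkage_merges(seqs, dist):
    """Return the merges (distance, i, j) in increasing order of distance."""
    pairs = sorted(
        (dist(seqs[i], seqs[j]), i, j)
        for i, j in combinations(range(len(seqs)), 2)
    )
    parent = list(range(len(seqs)))
    merges = []
    for d, i, j in pairs:
        ri, rj = find(parent, i), find(parent, j)
        if ri != rj:  # the pair joins two different clusters
            parent[rj] = ri
            merges.append((d, i, j))
    return merges


def clusters_at(merges, n, threshold):
    """Replay the stored merges up to a threshold; no distances are recomputed."""
    parent = list(range(n))
    for d, i, j in merges:
        if d > threshold:
            break
        parent[find(parent, j)] = find(parent, i)
    groups = {}
    for x in range(n):
        groups.setdefault(find(parent, x), []).append(x)
    return sorted(groups.values())


seqs = ["ACGT-A", "ACGT-A", "ACGA-A", "TTTT-T"]  # pre-aligned toy sequences
merges = single_linkage_merges(seqs, nogap_distance)
for t in (0.1, 0.5, 0.9):
    print(t, clusters_at(merges, len(seqs), t))
```

Because the merge list is computed once, reporting clusters at a new threshold only replays that list, which is why the number of clusters can be plotted against the threshold cheaply.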
http://plaza.ufl.edu/xywang/Mpick.htm
Modularity-based clustering software for Operational Taxonomic Unit (OTU) picking of 16S rRNA sequences. The algorithm does not require a predetermined cut-off level, and the authors' simulation studies suggest that it is superior to existing methods that require specified distance or variance levels to define OTUs.
Proper citation: M-pick (RRID:SCR_004995)
http://plaza.ufl.edu/sunyijun/ES-Tree.htm
Software for hierarchical clustering analysis of millions of 16S rRNA pyrosequences in quasi-linear time.
Proper citation: ESPRIT-Tree (RRID:SCR_005045)
http://cran.r-project.org/web/packages/MBCluster.Seq/index.html
Software to cluster genes based on a Poisson or negative binomial model for RNA-Seq or other digital gene expression (DGE) data.
Proper citation: MBCluster.Seq (RRID:SCR_005079)
http://www.biomedcentral.com/1471-2105/13/189
An algorithm to use optical map information directly within the de Bruijn graph framework to help produce an accurate assembly of a genome that is consistent with the optical map information provided. AGORA takes as input two data structures: OpMap, an ordered list of fragment sizes representing the optical map; and Edges, a list of de Bruijn graph edges with their corresponding sequences.
Proper citation: AGORA (RRID:SCR_005070)
https://github.com/AlexeyG/GRASS
A generic algorithm for scaffolding next-generation sequencing assemblies.
Proper citation: GRASS (RRID:SCR_005071)
http://www.bioinf.boku.ac.at/pub/MapAl/
A software tool for RNA-Seq expression profiling that builds on the established programs Bowtie and Cufflinks. Incorporating "gene models" already at the alignment stage almost doubles the number of transcripts that can be measured reliably.
Proper citation: MapAl (RRID:SCR_004938)
http://sourceforge.net/apps/mediawiki/amos/index.php?title=Bambus2
Scaffolding software that addresses some of the challenges encountered when analyzing metagenomes. Scaffolding is the task of ordering and orienting contigs by incorporating additional information about their relative placement along the genome. While most other scaffolders are closely tied to a specific assembly program, Bambus accepts the output from most current assemblers and gives the user great flexibility in choosing the scaffolding parameters. In particular, Bambus can accept contig linking data other than that provided by mate pairs. Such sources of information include alignment to a reference genome (Bambus can directly use the output of MUMmer), physical mapping data, or information about gene synteny.
Proper citation: Bambus (RRID:SCR_005068)
http://www.comp.hkbu.edu.hk/~chxw/software/G-BLASTN.html
A GPU-accelerated nucleotide alignment tool based on the widely used NCBI-BLAST. It can produce exactly the same results as NCBI-BLAST, and it also has very similar user commands. It also supports a pipeline mode, which can fully utilize the GPU and CPU resources when handling a batch of medium to large sized queries.
Proper citation: G-BLASTN (RRID:SCR_005062)
https://sites.google.com/site/jingyijli/SLIDE.zip
Software package that takes exon boundaries and RNA-Seq data as input to discern the set of mRNA isoforms that are most likely to be present in an RNA-Seq sample. It is based on a linear model with a design matrix that models the sampling probability of RNA-Seq reads from different mRNA isoforms. To tackle the model unidentifiability issue, SLIDE uses a modified Lasso procedure for parameter estimation. Compared with deterministic isoform assembly algorithms (e.g., Cufflinks), SLIDE considers the stochastic aspects of RNA-Seq reads in exons from different isoforms and thus has increased power to detect novel isoforms. Another advantage of SLIDE is its flexibility in incorporating other transcriptomic data such as RACE, CAGE, and EST into its model to further increase isoform discovery accuracy. SLIDE can also work downstream of other RNA-Seq assembly algorithms to integrate newly discovered genes and exons. Besides isoform discovery, SLIDE subsequently uses the same linear model to estimate the abundance of the discovered isoforms.
Proper citation: SLIDE (RRID:SCR_005137)
http://sourceforge.net/projects/viralfusionseq/
A versatile high-throughput sequencing (HTS) tool for discovering viral integration events and reconstructing fusion transcripts at single-base resolution. It combines soft-clipping information, read-pair analysis, and targeted de novo assembly to discover and annotate viral-human fusion events. A simple yet effective empirical statistical model is used to evaluate the quality of fusion breakpoints. Few user-defined parameters are required.
Proper citation: VFS (RRID:SCR_005138)
https://github.com/tk2/RetroSeq
A tool for discovery and genotyping of transposable element variants (TEVs) (also known as mobile element insertions) from next-gen sequencing reads aligned to a reference genome in BAM format. The goal is to call TEVs that are not present in the reference genome but are present in the sample that has been sequenced. RetroSeq can also be used to locate any class of viral insertion in any species for which whole-genome sequencing data and a suitable reference genome are available. RetroSeq is a two-phase process. The first is the read-pair discovery phase, in which discordant mate pairs are detected and assigned to a TE class (Alu, SINE, LINE, etc.) using the annotated TE elements in the reference and/or alignment with Exonerate to the supplied library of viral sequences.
Proper citation: RetroSeq (RRID:SCR_005133)
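To make "discordant mate pairs" in the RetroSeq entry concrete, here is a minimal sketch under stated assumptions. It is not RetroSeq's logic: the `ReadPair` record, the `max_insert` cut-off of 500 bp, and the two rules used (mates on different chromosomes, or mates farther apart than expected) are illustrative choices, and a real tool would read these fields from a BAM file.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """Mapped positions of the two mates of one read pair (hypothetical record)."""
    name: str
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int


def is_discordant(pair, max_insert=500):
    """Flag a pair whose mates map to different chromosomes or too far apart.

    max_insert is an assumed cut-off for this sketch, not a RetroSeq default.
    """
    if pair.chrom1 != pair.chrom2:
        return True
    return abs(pair.pos2 - pair.pos1) > max_insert


pairs = [
    ReadPair("p1", "chr1", 1000, "chr1", 1300),   # concordant
    ReadPair("p2", "chr1", 2000, "chr7", 52000),  # mates on different chromosomes
    ReadPair("p3", "chr2", 500, "chr2", 90500),   # mates too far apart
]
discordant = [p.name for p in pairs if is_discordant(p)]
print(discordant)
```

In a discovery phase like the one described, pairs flagged this way would be the candidates whose unexpected mate is then compared against known TE or viral sequences.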
https://github.com/cwhelan/cloudbreak
Software providing a Hadoop-based genomic structural variation (SV) caller for Illumina paired-end DNA sequencing data. It contains a full pipeline for aligning data in the form of FASTQ files using alignment pipelines that generate many possible mappings for every read, in the Hadoop framework. It then contains Hadoop jobs for computing genomic features from the alignments, and for calling insertion and deletion variants from those features.
Proper citation: Cloudbreak (RRID:SCR_005097)
http://yost.genetics.utah.edu/software.php
A software analysis pipeline for mapping mutations using RNA-seq that works without parental strain information, without requiring a pre-existing SNP map of the organism, and without the erroneous assumption that recombination occurs at the same frequency across the genome. In addition, it compensates for the considerable amount of noise in RNA-seq datasets, and it simultaneously identifies the region where the mutation lies and generates a list of putative coding region mutations in the linked genomic segment. MMAPPR can use RNA-seq datasets from isolated tissues or whole organisms, which are often generated for phenotypic analysis and gene network analysis in novel mutants.
Proper citation: MMAPPR (RRID:SCR_005092)
http://www.omicsoft.com/fusionmap/
An efficient fusion aligner that aligns reads spanning fusion junctions directly to the genome without prior knowledge of potential fusion regions. It detects and characterizes fusion junctions at base-pair resolution. FusionMap can be applied to detect fusion junctions in both single- and paired-end datasets from either gDNA-Seq or RNA-Seq studies. FusionMap runs under both Windows and Linux (requiring Mono) environments. Although it can run on a 32-bit machine, a 64-bit machine with 8 GB of RAM or more is recommended. If you have an ArrayStudio license, you can run the fusion detection through its GUI.
Proper citation: FusionMap (RRID:SCR_005242)
http://www.cs.helsinki.fi/en/gsa/traph/
A software tool for transcript identification and quantification with RNA-Seq. The method has a two-fold advantage: on the one hand, it translates the problem into an established one in the field of network flows, which can be solved in polynomial time with different existing solvers; on the other hand, it is general enough to encompass many of the previous proposals under the least sum of squares model.
Proper citation: Traph (RRID:SCR_005119)
http://www.raetschlab.org/suppl/rquant
Software for quantitative detection of alternative transcripts with RNA-Seq data. The method, based on quadratic programming, estimates biases introduced by experimental settings and is thus a powerful tool to reveal and quantify novel (alternative) transcripts.
Proper citation: rQuant (RRID:SCR_005150)