What is in samtools?

2021-06-07

What is in samtools?

Samtools is a set of utilities that manipulate alignments in the SAM (Sequence Alignment/Map), BAM, and CRAM formats. It converts between these formats, sorts, merges, and indexes alignment files, and can quickly retrieve reads from any region. Samtools is designed to work on a stream.
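The operations listed above map onto individual samtools subcommands. The following is a minimal sketch that only builds the command lines for a convert, sort, index, and region-query workflow; the function name `samtools_pipeline`, the file names, and the region string are hypothetical and chosen for illustration.

```python
import shlex


def samtools_pipeline(sam_path, region):
    """Return samtools command lines for convert -> sort -> index -> query.

    File names below are placeholders, not part of any real dataset.
    """
    bam = "example.unsorted.bam"
    sorted_bam = "example.sorted.bam"
    return [
        # Convert SAM text to compressed BAM.
        ["samtools", "view", "-b", "-o", bam, sam_path],
        # Sort the alignments and write them to a new file.
        ["samtools", "sort", "-o", sorted_bam, bam],
        # Build an index so that regions can be fetched quickly.
        ["samtools", "index", sorted_bam],
        # Retrieve only the reads overlapping the requested region.
        ["samtools", "view", sorted_bam, region],
    ]


if __name__ == "__main__":
    for cmd in samtools_pipeline("example.sam", "chr1:1000-2000"):
        print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run` on a machine where samtools is installed. Because samtools works on streams, the conversion and sorting steps are often piped together rather than writing the intermediate file.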

Does SAMtools merge sort?

The samtools merge command does not sort its inputs. It merges multiple already-sorted alignment files, producing a single sorted output file that contains all the input records and maintains the existing sort order. The output file can be specified via -o.
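To illustrate what merging already-sorted inputs means, here is a small sketch using Python's `heapq.merge`. It is not how samtools is implemented; the records, the tuple layout (reference index, position, read name), and the function name `merge_sorted` are assumptions made for the example.

```python
import heapq

# Hypothetical records: (reference_index, position, read_name).
# Each input list is already sorted by (reference_index, position).
file_a = [(0, 100, "r1"), (0, 500, "r4"), (1, 50, "r5")]
file_b = [(0, 250, "r2"), (0, 300, "r3"), (1, 10, "r6")]


def merge_sorted(*inputs):
    """Merge coordinate-sorted inputs into one coordinate-sorted list."""
    # heapq.merge assumes every input is already sorted by the same key
    # and never re-sorts the records inside a single input.
    return list(heapq.merge(*inputs, key=lambda rec: (rec[0], rec[1])))


merged = merge_sorted(file_a, file_b)

if __name__ == "__main__":
    for record in merged:
        print(record)
```

If one of the inputs were not sorted, the merged result would not be sorted either, which is why the input files must share the same sort order before merging.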

How long does SAMtools sort take?

We compared the sorting speed of SAMtools and sambamba on a 25 GB unsorted BAM file. Sambamba was 2x faster than SAMtools. The following violin plot shows that SAMtools took 20 minutes, while sambamba sorted the same file in 10 minutes.

What is SAMtools bioinformatics?

SAMtools is a library and software package for parsing and manipulating alignments in the SAM/BAM format. It is able to convert from other alignment formats, sort and merge alignments, remove PCR duplicates, generate per-position information in the pileup format (Fig.

How much memory does samtools sort use?

While running SAMtools, we provisioned only 45 GB (1.5 GB for each of the 30 threads), so one should specify only 80-90% of available memory to SAMtools. Sambamba used close to the 45 GB we specified for the first 5 minutes, then dropped to 2 GB.
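The memory figure above follows from the fact that the -m option of samtools sort sets memory per thread, so the total is roughly the per-thread value multiplied by the thread count (-@). A minimal sketch of that arithmetic, with the hypothetical helper names `per_thread_memory_mb` and `sort_command`, and a `fraction` parameter standing in for the 80-90% headroom mentioned above:

```python
def per_thread_memory_mb(total_gb, threads, fraction=1.0):
    """Split a total memory budget (in GB) into a per-thread value in MB.

    `fraction` scales the budget down, e.g. 0.85 to leave headroom.
    """
    budget_mb = total_gb * 1024 * fraction
    return int(budget_mb // threads)


def sort_command(in_bam, out_bam, threads, mem_mb):
    """Build a samtools sort command line; file names are placeholders."""
    return [
        "samtools", "sort",
        "-@", str(threads),     # number of sorting/compression threads
        "-m", f"{mem_mb}M",     # approximate memory per thread
        "-o", out_bam,
        in_bam,
    ]


if __name__ == "__main__":
    # 45 GB spread over 30 threads gives 1536 MB (1.5 GB) per thread.
    mem = per_thread_memory_mb(45, 30)
    print(mem)
    print(" ".join(sort_command("unsorted.bam", "sorted.bam", 30, mem)))
```

Actual memory use can exceed the requested value, so treating the result as an approximate target rather than a hard limit is the safer reading.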

What do SAM files look like?

The SAM format consists of a header and an alignment section. The binary equivalent of a SAM file is a Binary Alignment Map (BAM) file, which stores the same data in a compressed binary representation. SAM files can be analysed and edited with the software SAMtools.
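As a rough illustration of the two sections, the sketch below splits a tiny, made-up SAM text into header lines (which start with "@") and alignment lines (tab-separated, with 11 mandatory fields). The sample records and the function name `split_sam` are invented for the example; real files should be read with samtools or a dedicated library.

```python
# A made-up SAM snippet: two header lines followed by two alignments.
SAM_TEXT = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:248956422\n"
    "read1\t0\tchr1\t100\t60\t5M\t*\t0\t0\tACGTA\tIIIII\n"
    "read2\t16\tchr1\t150\t60\t5M\t*\t0\t0\tTTGCA\tIIIII\n"
)

# The 11 mandatory alignment fields, in order.
FIELDS = [
    "QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
    "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL",
]


def split_sam(text):
    """Return (header_lines, alignments) parsed from SAM text."""
    header, alignments = [], []
    for line in text.splitlines():
        if not line:
            continue
        if line.startswith("@"):
            header.append(line)
        else:
            cols = line.split("\t")
            # Keep only the mandatory fields; optional tags are ignored here.
            alignments.append(dict(zip(FIELDS, cols[:11])))
    return header, alignments


if __name__ == "__main__":
    header, alignments = split_sam(SAM_TEXT)
    print(len(header), "header lines")
    for aln in alignments:
        print(aln["QNAME"], aln["RNAME"], aln["POS"], aln["CIGAR"])
```

All field values are kept as strings in this sketch; converting FLAG, POS, and MAPQ to integers would be a natural next step.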

Is SAMtools multithreaded?

There are at least two different multi-threading implementations in Samtools: Heng Li’s and Nils Homer’s. The latter appears to be more efficient because it multi-threads decoding too, but it is less clear how to limit it to, say, 500% CPU, since it uses the same number of threads for encoding as for decoding.