Scientists Assess Error Modes in Sequencing Platforms and Find SMRT Sequencing ‘Least Biased’
Monday, August 19, 2013
A paper from scientists at the Broad Institute reports a rigorous study of bias across all major sequencing platforms. In “Characterizing and measuring bias in sequence data,” published in Genome Biology, lead author Michael Ross and his colleagues report that SMRT® Sequencing on the PacBio® sequencer is the “least biased” in coverage of all the technologies studied.
The authors assessed sequences for coverage bias (how uniformly reads are distributed across the genome) and error bias (incorrect base calls at a given position). For coverage bias, they report that PacBio performed best in extreme GC content (both GC-rich and GC-poor) and suggest this may be related to the lack of an amplification step in the sequencing process. Regarding error bias, the scientists describe error rates that shift with genome sequence; GC-rich or homopolymer regions, for example, tended to change the rate of errors for each platform. “In general, the sequence context dependence of error rates varied considerably from technology to technology,” they write.
Ross et al. acknowledge that each platform’s bias changes as the technology develops, but note that at the time of their work, “single-molecule data from Pacific Biosciences” had “the clear edge.”
The paper summarizes the comparison with a few statistics: “We note that the single statistic of relative coverage for the GC ≥ 85% motif provided a suitable assay for bias on R. sphaeroides, with Pacific Biosciences scoring 0.87 (best), Illumina 0.60 and Ion Torrent 0.10 (worst), while GC ≥ 75% did not clearly distinguish between Illumina and Pacific Biosciences data. The GC ≤ 10% motif was similarly useful for P. falciparum, with Pacific Biosciences scoring 0.89 (best), Illumina 0.58, and Ion Torrent 0.39 (worst). For these data, the (AT)15 motif also stood out, with Pacific Biosciences at 0.85, Illumina at 0.43, and Ion Torrent at 0.11. Importantly, just these few statistics provided a meaningful readout on the performance of the different technologies.”
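For readers who want a feel for what a relative coverage figure such as 0.87 or 0.10 means, here is a minimal Python sketch. It assumes relative coverage is the mean read depth over bases inside a motif divided by the mean read depth over the whole reference, so that 1.0 indicates no bias and values near 0 indicate the motif is nearly missing from the data. The function names, the sliding-window approach, and the default 100-base window are illustrative assumptions, not details taken from the paper; per-base depth is assumed to have been computed already by some other tool.

```python
def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def relative_coverage(reference, coverage, window=100, gc_min=0.85):
    """Mean depth inside GC-rich windows divided by mean depth overall.

    reference: reference sequence as a string
    coverage:  per-base read depth, same length as reference
    window:    window length in bases (an assumed value, not from the paper)
    gc_min:    minimum GC fraction for a window to count as the motif
    Returns None if no window meets the GC threshold.
    """
    if len(reference) != len(coverage):
        raise ValueError("reference and coverage must have the same length")

    # Mark every base that falls inside at least one qualifying window.
    in_motif = [False] * len(reference)
    for start in range(len(reference) - window + 1):
        if gc_fraction(reference[start:start + window]) >= gc_min:
            for i in range(start, start + window):
                in_motif[i] = True

    motif_depths = [d for d, flagged in zip(coverage, in_motif) if flagged]
    if not motif_depths:
        return None

    mean_all = sum(coverage) / len(coverage)
    mean_motif = sum(motif_depths) / len(motif_depths)
    return mean_motif / mean_all


if __name__ == "__main__":
    # Toy data: a GC-only stretch flanked by AT repeats, sequenced at half depth.
    ref = "AT" * 10 + "GC" * 10 + "AT" * 10
    depth = [20] * 20 + [10] * 20 + [20] * 20
    print(relative_coverage(ref, depth, window=10, gc_min=1.0))
```

The same shape of calculation would apply to a GC-poor motif by flipping the comparison to a maximum GC fraction. The nested loop is written for clarity rather than speed; a genome-scale version would use a running GC count instead.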