Found a link to the sequencing of 99 Ebola virus genomes (go to IHUB M.B.): "The team sequenced 99 Ebola virus genomes from 78 people in Sierra Leone, who were diagnosed with Ebola in late May and mid-June. Sadly, five of over 50 co-authors of the paper lost their lives to Ebola virus before the paper was published." !!
After Release of 20 New Genomes, 100K Pathogen Project Now Kicking PacBio Sequencing into Higher Gear
July 30, 2013
By Molika Ashford
... "He said PacBio sequencing has been "fantastic" for the team so far, yielding "nice
.... reports on the sequencing of 99 Ebola virus genomes from infected patients in
.... Get back to me when there's mobile sequence analysis."
(Sorry, the article is from GenomeWeb; access requires a paid subscription.)
Friday, October 10, 2014-- ASHG 2014: A New Look at the Human Genome with Long-Read Sequencing
Scientists around the world are getting ready for the annual meeting of the American Society of Human Genetics taking place October 18-22 at the San Diego Convention Center. We’re looking forward to a number of excellent presentations and posters, and are delighted to see that many of them will focus on applying Single Molecule, Real-Time (SMRT®) Sequencing to human studies.
(For link, go to PACB website, click on blogs)
On short-read versus long-read sequencing
Short-read sequencing technologies still maintain the advantage in terms of throughput, says Schadt, but there are a variety of important genomic features that cannot be characterized without long-read sequencing, such as long tandem repeats, bigger structural variations, and focal variants important in cancer.
“I definitely think [short-read] technologies were tuned for certain problems and had certain advantages that enabled this big advance, but they are absolutely not hitting the entire problem like we need it hit,” he told Mendelspod.
Cancer is a main area of study for which Schadt believes long-read sequencing is needed, in order to understand the complicated genomic features driving the tumor cells. And outside of human applications he called out plant genomics. “Plant genomes are so complicated and so flooded with repeat sequences, their only hope is to have long-read data,” he said.
The quality of PacBio sequencing is 'beyond compare'
Schadt noted that early misconceptions about the type of error profiles seen in single-molecule data erroneously led people to believe the data was of lower quality. He explained how the errors are random and can easily be washed out with a modest amount of coverage, whereas other next-generation sequencing technologies have systematic errors that cannot be removed.
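The point about random versus recurring errors can be sketched with a small simulation. This is only a toy under stated assumptions, not anyone's actual pipeline: reads are pre-aligned and equal in length, errors are substitutions only (real single-molecule errors also include insertions and deletions), and the 15% error rate, 30x coverage, and "every 100th position" error sites are made-up illustration values.

```python
import random
from collections import Counter

BASES = "ACGT"

def random_error_read(truth, error_rate, rng):
    """Copy `truth`, miscalling each base independently with probability `error_rate`."""
    read = []
    for base in truth:
        if rng.random() < error_rate:
            read.append(rng.choice([b for b in BASES if b != base]))
        else:
            read.append(base)
    return "".join(read)

def systematic_error_read(truth, bad_positions):
    """Copy `truth`, miscalling the same positions the same way in every read."""
    return "".join(
        BASES[(BASES.index(base) + 1) % 4] if i in bad_positions else base
        for i, base in enumerate(truth)
    )

def consensus(reads):
    """Majority vote per column across equal-length, pre-aligned reads."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))

def mismatches(a, b):
    """Number of positions where two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def run_demo(length=2000, coverage=30, error_rate=0.15, seed=0):
    rng = random.Random(seed)
    truth = "".join(rng.choice(BASES) for _ in range(length))
    bad_positions = set(range(0, length, 100))  # hypothetical recurring error sites

    random_reads = [random_error_read(truth, error_rate, rng) for _ in range(coverage)]
    biased_reads = [systematic_error_read(truth, bad_positions) for _ in range(coverage)]

    return {
        "single_random_read": mismatches(truth, random_reads[0]),
        "random_consensus": mismatches(truth, consensus(random_reads)),
        "systematic_consensus": mismatches(truth, consensus(biased_reads)),
    }

if __name__ == "__main__":
    for name, errors in run_demo().items():
        print(f"{name}: {errors} mismatches")
```

In this toy, a single noisy read carries hundreds of mismatches, yet the majority vote removes them, while the errors that repeat in every read survive the vote no matter how much coverage is added.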
At Mount Sinai they used PacBio® technology to sequence the human genome and saw “very dramatic improvements in the quality of the de novo assemblies, revealing features that have never been seen before.” He said he believes this type of sequencing will become the standard. “The quality of that PacBio data is just beyond compare.”
For full story, go to PACB website, click on blogs.
This is the best PART of the article: PacBio data is “really high quality” and “as good or better than Illumina and Sanger.” (Go to PACB website, click on blogs)!!
The challenge with short reads
Geraghty explained that sequencing fosmids with short-read technology is cumbersome when it comes to stitching together the reads. Data analysis and finishing “became a roadblock that the Illumina short-read technology wouldn’t let us get beyond,” he said, noting that the finishing process takes 30 minutes to an hour per fosmid, prohibitive for any modest-scale effort. Geraghty marveled that he has received 40 kb reads from PacBio, meaning a whole fosmid can be sequenced in one piece.
PacBio is ready to handle the challenge
Geraghty said that with recent technology improvements, PacBio data is “really high quality” and “as good or better than Illumina and Sanger,” noting that his group has compared all three technologies with the same sequences. “It opens up a whole new possibility,” he said, because previously “you simply weren’t getting all of the data. People were using statistics to impute missing data and so on, and it simply doesn’t work.”
Should PacBio be used for all major sequencing projects?
Geraghty thinks so, noting that a resource such as the 1000 Genomes Project would be upgraded significantly with PacBio data for complex regions such as MHC and KIR. He said that if you look at these regions in the 1000 Genomes data you will find “a mass of confusion” because those regions are highly repetitive and contain a large amount of copy number and allelic variation, making it difficult or impossible to assemble the data correctly with short reads.
“Any large human genome sequencing projects just using short-read technology are not going to acquire usable data for these complex regions, it’s as simple as that,” he said. For complex regions, “you’ll need long-read data,” he said. “The long-read data will give you really what everybody has been after all along without realizing it. It will give you the phase and the detail on the polymorphism in these highly polymorphic regions.”
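Why repetitive regions trip up short reads can be shown with a toy genome. This is a minimal sketch with made-up sizes (a 500-base repeat present twice, 1,000-base unique flanks, a 100-base "short" read and a 1,000-base "long" read), not a model of MHC or KIR: a read that sits entirely inside the repeat fits in two places, while a read that spans the repeat plus unique flank fits in only one.

```python
import random

def count_occurrences(genome, read):
    """Count every (possibly overlapping) exact occurrence of `read` in `genome`."""
    count, start = 0, genome.find(read)
    while start != -1:
        count += 1
        start = genome.find(read, start + 1)
    return count

def build_toy_genome(seed=1, flank=1000, repeat_len=500):
    """Three unique flanks separated by two identical copies of one repeat."""
    rng = random.Random(seed)

    def rand_seq(n):
        return "".join(rng.choice("ACGT") for _ in range(n))

    repeat = rand_seq(repeat_len)
    left, middle, right = rand_seq(flank), rand_seq(flank), rand_seq(flank)
    genome = left + repeat + middle + repeat + right
    return genome, len(left), repeat_len

genome, repeat_start, repeat_len = build_toy_genome()

# A 100-base read taken from inside the first repeat copy.
short_read = genome[repeat_start + 200 : repeat_start + 300]
# A 1,000-base read covering the whole first copy plus 250 unique bases on each side.
long_read = genome[repeat_start - 250 : repeat_start + repeat_len + 250]

short_placements = count_occurrences(genome, short_read)
long_placements = count_occurrences(genome, long_read)
print("short read placements:", short_placements)  # ambiguous: fits both copies
print("long read placements:", long_placements)    # unique: anchored by the flanks
```

The short read cannot say which copy it came from, which is the kind of ambiguity that makes assembly and phasing of repeat-rich regions difficult without reads long enough to reach unique sequence.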
The future is bright (PACBIO Blogs)
Breakthrough study discovers six changing faces of ‘global killer’ bacteria
Issued by University of Leicester Press Office on 30 September 2014
"Every ten seconds a human being dies from pneumococcus infection, making it the leading cause of serious illness across the globe." The research, which involved the University of Adelaide and scientists from Pacific Biosciences, has for the first time shown a genetic switch that allows this bacterium to randomly change its characteristics into six alternative states.
Now that scientists have determined the methylation profiles with the PacBio® platform, it should be possible for other scientists to accurately assign the pathogen to its specific phase. “Future studies must recognize the potential for switching between these heretofore undetectable, differentiated pneumococcal subpopulations in vitro and in vivo,” the authors note. “We believe these findings represent a new paradigm in gene regulation in bacteria and therefore are of great significance to the infectious disease field.”
(Go to PACB website, click on blogs; also see IHUB for another link.)
Monday, September 22, 2014 -- Maryland Scientists Produce High-Quality, Cost-Effective Genome Assembly of Loa loa Roundworm Using SMRT Sequencing
A paper just released in BMC Genomics details what authors call “the most complete filarial nematode assembly published thus far at a fraction of the cost of previous efforts.” The project was performed using the PacBio® RS II DNA Sequencing System by scientists at the University of Maryland School of Medicine’s Institute for Genome Sciences and the Laboratory of Parasitic Diseases at the National Institute of Allergy and Infectious Diseases. A comparison of short-read sequence data, short- and long-read hybrid data, and long-read-only data found that PacBio data used on its own outperformed other assemblies that included short-read sequence. The final assembly was produced with HGAP2 and polished with Quiver. It includes 96.4 Mbp in 2,250 contigs and covers about 9% more of the genome than a previous draft assembly, in 85% fewer contigs and starting with 80% less DNA, the authors note.
(for link go to IHUB PACB M.B.)
This class has been running since 2010-11, on the following principle:
- autumn semester: isolate bacterial DNA, sequence using Illumina, assemble;
- spring semester: close assembly gaps, annotate genome.
As anyone following genomics knows, the times they are a’changing again and again, so this is less and less state-of-the-art. So we have decided to try a new course plan this year, taking advantage of the progress in bacterial genome sequencing with long PacBio reads.
Our new principle is, hopefully:
- autumn semester: isolate bacterial DNA, sequence using PacBio, assembly trivial, annotate genome;
- spring semester: RNA-seq under 2 growth conditions, experiments, Illumina sequencing and bioinformatics analysis.
“Hopefully” because PacBio on bacteria is not yet routine, depending on the genus and the growth conditions. We are thus trying two different bacteria, a Pseudomonas which has a cool story for the RNA-seq part, and a Caulobacter which has been shown to work with PacBio. Preliminary studies on the Pseudomonas are somewhat discouraging for the PacBio sequencing, but we will still try, with adaptations of the protocol. We will also keep the possibility of reverting to Illumina sequencing plus assembly, but we would like to avoid that (if Caulobacter is plan B, this is plan C).
And of course, we have never done RNA-seq with master students, so this year will be a new adventure, comparable to our first course in 2010. Stressful and exciting.
One Response to "A new adventure: PacBio sequencing and RNA-seq in the classroom"
Winship Herr says:
September 18, 2014 at 14:31
This is a super cool course. My first Master First-step project was to sequence 100 nucleotides of the lac operon
With Illumina, the dominant player in the NGS market, claiming this year that they’ve reached that target with their HiSeq X Ten system, it’s fair to stop and ask just what has been achieved. What do you get for that $1,000? And furthermore, where does NGS go from here?
Beginning next week, we're launching a new series, The Rise of Long Read Sequencing.
I first heard “long read” sequencing differentiated from “short read” in an interview with Mike Hunkapiller, CEO of Pacific Biosciences, last year. I had asked him the obvious question about how he expects to compete with Illumina, and he responded saying that “short read technologies” had serious drawbacks.
“Wait a minute,” I remember thinking at the time, “did Mike just dismiss Illumina’s technology outright? And what are these long reads he’s talking about?”
(Very interesting article)!! For link, go to IHUB PACB M.B.
Illumina and Ion Torrent technologies have read lengths up to a few hundred base pairs, while Sanger sequencing covers several hundred. In contrast, Pacific Biosciences’ technology has average reads of about 8,500 bases. Some users have reached tens of thousands of bases. Its RS II system costs about $700,000.
Pacific Biosciences’ single-molecule real-time sequencing is a sequencing-by-synthesis approach that doesn’t use an amplified set of DNA fragments and doesn’t require stopping and starting the reaction to add reagents and image results. Reactions on individual DNA molecules are tracked in real time across 150,000 nanoscale wells where isolated polymerases read the DNA and incorporate fluorescently tagged nucleotides. Because detection occurs only at the bottom of the wells, the background noise from the other reactions is reduced.
Stability of the sequencing process depends in large part on the polymerase. Pacific Biosciences has modified a simple bacteriophage enzyme, slowing it down so that it incorporates about three bases per second and its detector can keep up. To prevent inadvertent photo damage that could stop the process, the company has put a protective scaffold on the enzyme.
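As a rough back-of-the-envelope check on what "about three bases per second" means in practice, the arithmetic can be sketched as below. It assumes an idealized polymerase that never pauses, and it borrows the 8,500-base average read and the 40 kb read length mentioned elsewhere in this post; the function name is just for illustration.

```python
BASES_PER_SECOND = 3  # "about three bases per second", per the article

def minutes_to_read(read_length_bases, bases_per_second=BASES_PER_SECOND):
    """Idealized time for one polymerase to synthesize a read, ignoring pauses."""
    return read_length_bases / bases_per_second / 60

average_read_minutes = minutes_to_read(8_500)   # average read length quoted in the article
long_read_minutes = minutes_to_read(40_000)     # a 40 kb read

print(f"8,500-base read: about {average_read_minutes:.0f} minutes")
print(f"40,000-base read: about {long_read_minutes / 60:.1f} hours")
```

So under these assumptions an average read takes the better part of an hour of continuous synthesis, and the longest reads take several hours, which is why enzyme stability and protection from photo damage matter so much.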
Although fast and cheap sequencing will yield much useful knowledge, it has come at a price because of the shorter read lengths, Korlach argues. Pacific Biosciences “wanted to build a technology first and foremost that gives the highest quality of sequence information,” he says.
The 10-year-old company launched its first sequencer in 2011 and has since improved its chemistry, detection, and throughput. On target for 70% sales growth this year, to about $47 million, Pacific Biosciences has installed more than 100 systems and has a market share of a few percent. Its business has seen “a nice boost as the platform ...” (For link go to IHUB)
Posted August 14, 2014-- New Results
Assembling Large Genomes with Single-Molecule Sequencing and Locality Sensitive Hashing
Konstantin Berlin, Sergey Koren, Chen-Shan Chin, James Drake, Jane M Landolin, Adam M Phillippy
We report reference-grade de novo assemblies of four model organisms and the human genome from single-molecule, real-time (SMRT) sequencing. Long-read SMRT sequencing is routinely used to finish microbial genomes, but the available assembly methods have not scaled well to larger genomes. Here we introduce the MinHash Alignment Process (MHAP) for efficient overlapping of noisy, long reads using probabilistic, locality-sensitive hashing. Together with Celera Assembler, MHAP was used to reconstruct the genomes of Escherichia coli, Saccharomyces cerevisiae, Arabidopsis thaliana, Drosophila melanogaster, and human from high-coverage SMRT sequencing. The resulting assemblies include fully resolved chromosome arms and close persistent gaps in these important reference genomes, including heterochromatic and telomeric transition sequences. For D. melanogaster, MHAP achieved a 600-fold speedup relative to prior methods and a cloud computing cost of a few hundred dollars. These results demonstrate that single-molecule sequencing alone can produce near-complete eukaryotic genomes at modest cost.
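For readers wondering what "locality-sensitive hashing" buys here, the general MinHash idea can be sketched in a few lines. This is a generic textbook-style illustration, not the MHAP code: the k-mer size of 8, the 128 hash functions, the use of seeded BLAKE2b hashes, and the error-free toy reads are all assumptions made for the example.

```python
import hashlib
import random

def kmers(seq, k):
    """Set of all length-k substrings of `seq`."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def hash_kmer(kmer, seed):
    """Deterministic 64-bit hash of a k-mer; each seed acts as a separate hash function."""
    data = f"{seed}:{kmer}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

def minhash_sketch(seq, k=8, num_hashes=128):
    """Keep only the minimum hash value per hash function: a fixed-size summary of the read."""
    kmer_set = kmers(seq, k)
    if not kmer_set:
        raise ValueError("sequence is shorter than k")
    return [min(hash_kmer(km, seed) for km in kmer_set) for seed in range(num_hashes)]

def estimated_jaccard(sketch_a, sketch_b):
    """Fraction of matching minima, which estimates the Jaccard similarity of the k-mer sets."""
    matches = sum(a == b for a, b in zip(sketch_a, sketch_b))
    return matches / len(sketch_a)

def exact_jaccard(seq_a, seq_b, k=8):
    a, b = kmers(seq_a, k), kmers(seq_b, k)
    return len(a & b) / len(a | b)

rng = random.Random(1)
genome = "".join(rng.choice("ACGT") for _ in range(600))
read_a, read_b = genome[:400], genome[200:]   # two reads overlapping by 200 bases
unrelated = "".join(rng.choice("ACGT") for _ in range(400))

sketch_a, sketch_b = minhash_sketch(read_a), minhash_sketch(read_b)
overlap_estimate = estimated_jaccard(sketch_a, sketch_b)
unrelated_estimate = estimated_jaccard(sketch_a, minhash_sketch(unrelated))

print(f"overlapping reads: estimated {overlap_estimate:.2f}, exact {exact_jaccard(read_a, read_b):.2f}")
print(f"unrelated reads:   estimated {unrelated_estimate:.2f}")
```

The appeal is that two reads can be compared through 128 numbers each instead of through full alignment, so likely overlaps can be shortlisted cheaply before any expensive alignment step is run.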
The most recent version of this article [btu392] was published on 2014-07-17 (Published by Oxford University Press)
proovread: large-scale high accuracy PacBio correction through iterative short read consensus
Motivation: Today, the base code of DNA is mostly determined through sequencing by synthesis as provided by the Illumina sequencers. Although highly accurate, resulting reads are short, making their analyses challenging. Recently, a new technology, Single Molecule Real-Time (SMRT) sequencing, was developed which could address these challenges as it generates reads of several thousand bases. But, their broad application has been hampered by a high error rate. Therefore, hybrid approaches which use high quality short reads to correct erroneous SMRT long reads have been developed. Still, current implementations have great demands on hardware, work only in well-defined computing infrastructures and reject a substantial amount of reads. This limits their usability considerably, especially in the case of large sequencing projects.
Results: Here we present proovread, a hybrid correction pipeline for SMRT reads, which can be flexibly adapted on existing hardware and infrastructure from a laptop to a high performance computing cluster. On genomic and transcriptomic test cases covering Escherichia coli, Arabidopsis thaliana and human, proovread achieved accuracies up to 99.9% and outperformed the existing hybrid correction programs. Furthermore, proovread corrected sequences were longer and the throughput was higher. Thus, proovread combines the most accurate correction results with an excellent adaptability to the available hardware. It will therefore increase the applicability and value of SMRT sequencing.
FREE: Workshop, Lunch & Tour of Arizona Genomics Institute to see PacBio® RSII!!!
Rare Opportunity - Register Now - Seating is Limited!
Thursday, August 14th, 2014
BIO5 Institute - Keating Bldg, Rm 103
Sequencing with Long Reads: New and Upcoming Applications of PacBio® SMRT® Technology
Jonas Korlach, Chief Scientific Officer, Pacific Biosciences
Targeted PacBio® Sequencing: BAC Libraries, Physical Maps, Platinum Sequencing
Rod A. Wing, Director, Arizona Genomics Institute (AGI)
Microbial Genome Sequencing
David Baltrus, Assistant Professor, School of Plant Sciences & Microbial Sciences
Sequencing Large Eukaryotic Genomes
Yeisoo Yu, Sequencing Group Leader, Arizona Genomics Institute
Dave Kudrna, BAC/EST Resource & Physical Mapping Center Group Leader
Full-Length Transcript Sequencing with PacBio® Iso-Seq™ Method: Going Beyond Short Read Assembly
Jonas Korlach, Chief Scientific Officer, Pacific Biosciences