High-quality assemblies aren’t just for bacteria. “We’ve shown recently that we can assemble a fungal genome of 20 or 30 megabases with four or eight SMRT Cells and get only 10 or 20 contigs — which often represents the number of chromosomes in the genome,” Montpetit says.
Read the full case study to learn more about how the Innovation Centre has deployed SMRT Sequencing, their shift from hybrid to PacBio-only assemblies, and how they differentiate their bioinformatics analysis service. (For the link, go to the PACB website and click on Blogs.)
At the center, SMRT Sequencing has been used in diverse research areas. Some examples include generation of high-quality assemblies in microbial sequencing, analysis of long, repetitive genomic regions, and sequencing of full-length human gene isoforms. Microbial sequencing encompasses a number of applications, including biotech industry efforts to improve microbial biofermentation and microbiome studies, from environmental remediation projects on Alberta tar sands to veterinary research on microbes present in cattle rumen.
In the two years they’ve been running the SMRT Sequencing platform, the Innovation Centre scientists have seen remarkable progress in what they have been able to achieve. Continued improvements in read lengths — partly due to new reagent kits from PacBio and partly due to more streamlined sample prep protocols developed at the center — have already made a major difference.
One major step was achieving complete bacterial sequencing and assembly in less than a day, a feat that may enable the core facility to serve as a rapid response center for organizations that study pathogen outbreaks and other urgent problems. In 2013, tests conducted with researchers at the Canadian Food Inspection Agency and other government agencies demonstrated that the Innovation Centre scientists could sequence a sample and fully assemble the genome and plasmid elements — all in 20 hours or less.
Indeed, the Innovation Centre team is routinely able to deliver affordable, high-quality, finished genomes. “A single bacterial genome, a library prep, and two SMRT Cells of sequencing — which is generally a little bit overkill — is less than $1,000,” Dewar says. “More and more often, we are getting a completely closed, finished-quality genome for that.” (part 2 of 3)
Innovation Centre in Quebec Uses SMRT Sequencing for
Cost-Effective, Complete Microbial Genomes
At the McGill University and Génome Québec Innovation Centre, many projects conducted in the sequencing core facility fall under the umbrella of life sciences rather than biomedical research. To the scientists responsible for making the core facility operate as smoothly as possible, that makes a world of difference.
“When you’re in the life sciences in addition to human biomedical [research], you’re out there in the world of things that haven’t been sequenced before, or haven’t been sequenced particularly well,” says Ken Dewar, a principal investigator at the Innovation Centre.
To navigate this type of uncharted territory, scientists at the center rely on long-read sequencing from their PacBio® RS II platform to cost-effectively close microbial genomes, traverse repeat-heavy genomic regions, and perform full-length transcript sequencing. By leveraging the dramatically increased read lengths PacBio sequencing provides, they have driven down costs and improved completeness of their assemblies.
At the core facility, Alexandre Montpetit is dedicated to running the next-generation sequencing platforms. His primary affiliation is with Génome Québec, and he has an adjunct appointment at McGill. He and his colleagues have been champions of long-read sequencing for years, so when PacBio unveiled its platform with industry-leading read length, it was an obvious choice for the center to adopt the technology.
“We’ve always had a focus on sequencing things for the first time or assembling genomes for the first time, not for the thousand-and-first time,” Dewar says. “PacBio was a natural fit.”
(Part 1 of 2)
We hypothesized that closed-circular plasmid DNA will not receive the SMRTbell adaptor
molecules that provide a priming site for the sequencing reaction, thereby mitigating loss of
target DNA without contributing significantly to the sequencing output (Picture 1). The plasmid
carriers are inexpensive and can be prepared in bulk. In addition to employing the use of a
plasmid carrier during library construction, we have optimized the conditions of the final
preparation steps of the libraries for sequencing, including sequencing primer annealing,
polymerase binding, and MagBead binding. To maximize potential sequencing yield, we also reuse
the MagBead-bound complex in subsequent sequencing runs. With the use of a circular
plasmid carrier, optimized library preparation conditions, and re-use of the MagBead-bound
complex, we demonstrate this method is capable of producing comparable, unbiased,
per-SMRT Cell sequencing yields from 1000-fold less starting material compared to the standard
PacBio library preparation protocols.
Picture 1. Principle of the low-input library preparation method. SMRTbell adaptors will ligate
to linear DNA inserts of interest, but not to closed-circular plasmid DNA that is added as a
carrier to the sample.
Materials and Methods
The 2kb Low-Input Template Preparation and Sequencing protocol can be found on the
Pacific Biosciences website in the Shared Protocols section of the SMRT Community Sample
[Picture 1 labels: polished linear DNA, adaptor ligation; closed-circular DNA, no adaptor ligation]
Here we describe how this method can be used to produce sequencing yields comparable to those generated from standard input amounts, but by using 1000-fold less starting material. (Click on Preview PDF)
----------------
Introduction
In just the last few years, the development of second-generation sequencing (SGS) and
third-generation sequencing (TGS) platforms, and the applications they enable, has driven the
development of genomics,
(For a link to this, go to the IHUB MB.) This
requirement for relatively large amounts of starting material can be a significant impediment to
the sequencing of samples with limited amounts of DNA such as needle biopsy material, forensic
or ChIP-seq samples, microorganisms refractory to growth in synthetic media, or when searching
for rare sequence variants in unamplified nucleic acid samples. This is especially true when
preparing unamplified libraries for single-molecule sequencing using the PacBio RS II
sequencer. Unlike all SGS technologies, which rely on PCR and/or clonal amplification of DNA
to generate thousands of copies of each template molecule for sequencing, PacBio library
preparation does not require amplification of the DNA template during library preparation.(7)
A previously described method utilized the PacBio RS sequencer for direct sequencing
from as little as one nanogram of input DNA.(2) The method employs the use of random hexamer
primers to anneal to the template DNA to provide the binding sites for the PacBio polymerase,
thereby bypassing library preparation altogether. The method was applied to sequencing of
ssDNA and dsDNA small genomes, with sequencing yields of mapped reads from less than a
hundred to a few thousand per SMRT Cell. With the decrease in time and cost associated with
library preparation, this method may be well-suited for rapid identification of infectious disease
agents. However, because the sequencing yield produced is only a few thousand reads per
SMRT Cell, application to larger or more complex genomes may be limited. Here we describe a
simple, amplification-free method capable of producing standard sequencing yields by utilizing
closed-circular plasmid DNA as a ‘carrier’ to minimize sample loss during library preparation.
March 26, 2014
(Lot more on IHUB)!!!
PacBio didn’t die. It may never threaten the dominance of Illumina (NASDAQ: ILMN) in genomics, and may never become profitable. But in Hunkapiller’s do-what-you-say-you’re-going-to-do-and-grind-it-out way, PacBio has improved. Its instrument now has a niche. Illumina is miles ahead on the factors that count most for customers—speed, cost, and sequencing throughput (bandwidth). But PacBio is making a name for its high-accuracy genomes, its ability to detect structural genetic variations (like RNA transcripts) that other tools can’t, and for creating high-quality genomes of small organisms like bacteria, viruses, and worms. Last fall, PacBio’s stock surged when it struck a deal with Roche to develop technology for the lucrative market to come in genomic diagnostics, where some of PacBio’s technical advantages might be more highly valued.
“It’s going to be very, very hard for anybody to take on Illumina,” says Keith Robison, a computational biologist with Cambridge, MA-based Warp Drive Bio, who uses the various sequencing instruments and writes the Omics! Omics! blog. “But PacBio is a pioneer in finding applications that don’t work well on Illumina.” If you are a scientist working in one of those areas, this is a big deal. Robison adds: “In my world, the microbial world, we want high-quality genomes, and PacBio is almost the only game in town.”
(It may take another 2-3 years, but the big bang will happen)!! GLTA
Wednesday, March 19, 2014 PacBio Blog--- Assessment of Highly Complex Alternative Splicing of Neurexins Performed with SMRT Sequencing
A new paper in the Proceedings of the National Academy of Sciences from the laboratories of Stephen R. Quake and Thomas C. Südhof (both at Stanford University) describes the direct, full-length transcript sequencing of RNA molecules that are essential to synapse formation in the mammalian brain. The team used Single Molecule, Real-Time (SMRT®) Sequencing to analyze full-length mRNAs from different members of the neurexin gene family and used that information to examine alternative splicing events.
For this study, researchers used the PacBio® platform to sequence transcripts generated by three neurexin genes in adult mice. “Read lengths of up to 30 kb enabled us to identify all of the splice combinations within a single transcript,” they report. With sequencing reads representing more than 25,000 full-length mRNAs, the team made several important discoveries. These include: a novel alternatively spliced exon; even higher isoform diversity than was anticipated; and the finding that splicing events seem to occur independently of one another. The team was able to map out the full transcript landscape for a neurexin gene, showing alternative splicing at all six canonical sites as well as at several noncanonical sites.
Being able to directly assess alternative splicing not only provided evidence for suspected isoform diversity, but also revealed “that neurexins are likely even more polymorphic than previously thought,” the team reports. Based on their observations from SMRT Sequencing, they calculated how many neurexin variants were possible in total. “We observed in this manner a minimal diversity of 1,159 isoforms for Nrxn1α, 1,120 isoforms for Nrxn3α, and a total of 152 isoforms for all three β-neurexins,” they write. “Thus, earlier estimates of 2,000–3,000 neurexin ... (For the full story, go to the PACB website and click on Blogs.)
A New Gold Standard for Accuracy in NGS: Mike Hunkapiller, PacBio
published by Ayanna Monteverdi on Wed, 02/26/2014 - 10:24
Mike Hunkapiller, CEO, Pacific Biosciences
Listen (4:58) What is the theme for 2014 at PacBio?
Listen (2:50) Are you working on a clinical sequencer?
Listen (6:55) What are your thoughts on regulation and diagnostics?
Listen (3:12) What was your reaction to the Oxford Nanopore data just released at AGBT?
Listen (6:40) PacBio runs becoming the gold standard in microbial sequencing
Mike Hunkapiller, the CEO of Pacific Biosciences, joins us again this year as part of our annual series on NGS. Last year, Mike stressed the importance of PacBio's SMRT(TM) sequencing to do longer reads than the competition--namely Illumina. He says PacBio will continue to stay focused on further improving read length and accuracy this year as well. In fact, he says that the PacBio technology is becoming "the new gold standard" for microbial sequencing.
What does Mike think of the first data released by Oxford Nanopore recently? And what are PacBio's plans for clinical sequencing? Join us in the second installment of NGS 2014.
(GO TO IHUB FOR LINK TO LISTEN)
Posted by Biome on 25th February 2014
Pacific Biosciences’ RS II machine, the only single-molecule sequencer on the market, does not really compete with Illumina or Ion in terms of throughput; its P5-C3 chemistry produces only 375Mb of sequence per run. The real strength of the RS II is its long reads: the average read being 8.5Kb, with the longest being in excess of 30Kb. Recently published read correction strategies remove many of the errors, and now the SMRT technology of Pacific Biosciences (or ‘PacBio’) seems the weapon of choice for finishing genomes or de novo sequencing of new genomes.
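To put the quoted throughput figures in perspective, here is a rough back-of-the-envelope sketch in Python. The 375 Mb per run and the 8.5 kb average read come from the excerpt above; the 5 Mb genome size is a hypothetical example of a bacterial genome, and the function names are made up for illustration.

```python
# Figures quoted in the excerpt above.
RUN_YIELD_BASES = 375_000_000   # 375 Mb of sequence per run
MEAN_READ_LENGTH = 8_500        # 8.5 kb average read length


def approx_reads_per_run(yield_bases=RUN_YIELD_BASES, mean_read=MEAN_READ_LENGTH):
    """Rough number of reads implied by total yield and average read length."""
    return round(yield_bases / mean_read)


def fold_coverage(genome_size, runs=1, yield_bases=RUN_YIELD_BASES):
    """Average depth of coverage: total sequenced bases divided by genome size."""
    return runs * yield_bases / genome_size


# Hypothetical 5 Mb bacterial genome (an assumption, not from the excerpt).
print(approx_reads_per_run())        # about 44 thousand reads
print(fold_coverage(5_000_000))      # 75.0-fold coverage from a single run
```

This is only arithmetic on averages; real yields vary by sample and library, and usable coverage after filtering would be lower.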
The future for Illumina’s competitors
It will be really interesting to see how Life Technologies responds to Illumina’s latest developments. Their key advantage is speed, with the Ion Torrent platforms carrying out the sequencing component in hours rather than days. However, the throughput and cost-per-base do not match current Illumina platforms, never mind the new ones. To remain a viable business, Life Technologies, and its Ion Torrent platforms, must respond.
Pacific Biosciences’ SMRT technology has evolved significantly too and has become an essential tool for those wishing to close genomes, or sequence de novo new genomes. Intriguingly, Roche, a global health-care company, announced an agreement with Pacific Biosciences to develop DNA sequencing products for clinical diagnostics. This is not a space that Pacific Biosciences have been in up until now, and it is difficult to see how their RS II system can compete with Illumina and Ion Torrent in the clinic. Because of this, rumors of a new (benchtop?) PacBio machine abound on social media.
(more on IHUB)
Published on Feb 18, 2014
Edwin Hauw from Pacific Biosciences presents the PacBio® technology roadmap for 2014. What to expect: sample prep improvements for low-input DNA; a new chemistry for longer reads; assemblers for low coverage or diploid genomes; analysis tools for isoform sequencing, viral minor variant detection, and long-amplicon haplotype analysis; and much more.
(go to IHUB PACB MB for youtube link)
Kevin Davies @KevinADavies ·16 hrs ago
Isaac Ro (Goldman Sachs) declares @PacBio a winner #AGBT14: User feedback positive and a new, smaller/cheaper instrument could soon emerge
Gene Myers, who recently joined the Max Planck Institute for Molecular Cell Biology and Genetics in Dresden, Germany, said that PacBio long reads had reinvigorated his excitement about genome assembly with the promise of being able to produce reference-quality genomes. Myers has developed a tool called Dazzler (the Dresden Azzembler) that significantly accelerates the process of assembling PacBio sequence data. Dazzler works by scrubbing data prior to assembly in order to make the entire process more efficient; Myers reported a comparison of the human genome data set we just released showing a 36-fold speedup over BLASR. The tool can fully ...
CSO Jonas Korlach gave a talk showcasing the Iso-Seq™ method for full isoform characterization using SMRT Sequencing. He showed papers from the laboratories of Mike Snyder and Wing Wong, both at Stanford, who used PacBio long reads to fully analyze transcriptomes. Even in well-studied cell lines, Korlach noted, scientists were finding novel transcript isoforms and even novel genes thanks to information provided in these long reads. He also spoke about a metagenomics project looking at a mock human microbiome data set from NIAID, in which SMRT Sequencing was able to fully resolve more than half of the organisms in the community and get the rest into assemblies of a few contigs. The project also resolved all plasmids and yielded methylome data for the microbiome.
The evening session on genomic technologies development featured two more PacBio users. David Wheeler from Baylor’s Human Genome Sequencing Center presented sequence data for tumor/normal pairs; his group is generating 10x coverage for tumors and 5x for the matched normal tissue. He focused on structural rearrangements such as tandem duplications and said that many of these elements were driven by the movement of repeat regions around the genome. They could clearly be resolved using the PacBio technology. (For the full story, see the PacBio blog.)
The New York Times - By ANNE EISENBERG, FEB. 8, 2014 - The Path to Reading a Newborn’s DNA Map
What if laboratories could run comprehensive DNA tests on infants at birth, spotting important variations in their genomes that might indicate future medical problems? Should parents be told of each variation, even if any risk is still unclear? Would they even want to know?
New parents needn’t confront these difficult questions just yet. The more than four million babies born in 2014 in the United States will likely be screened in traditional ways — by public health programs that check for sickle cell anemia and several dozen other serious, treatable conditions. So far, DNA-based tests of infants play only a small part in screening.
But that may change in the next few years, as technology that can sequence and analyze the entire genome of a child becomes available, potentially detecting a range of inherited genetic conditions at birth. It’s the same type of analysis that now can tell adults — if they choose to ask for it — whether they are at high risk for a certain type of cancer, for example. As the technology becomes more sophisticated, it will inevitably expand into the world of newbor...
The first human genome was decoded only a decade ago, but genomics has already become a multibillion-dollar industry. If screening of newborns gains traction, established genomics companies that make DNA testing and analysis tools — companies like Illumina and Pacific Biosciences — may be beneficiaries.
Dr. Powell of the University of North Carolina wants the pilot projects to establish research data before consumer-directed programs offering comprehensive genome sequencing and analysis become widespread.
A version of this article appears in print on February 9, 2014, on page BU3 of the New York edition with the headline: The Path to Reading a Newborn’s DNA Map.
(more on IHUB)
Wednesday, February 12, 2014 - Data Release: ~54x Long-Read Coverage for PacBio-only De Novo Human Genome Assembly. First word cloud of tweets:
("54x", "Data", "Human", "Genome" are the buzzwords.)
(for links go to IHUB)
"PacBio also claimed that the use of PacBio long-read data to create de novo assemblies of human genomes using Hierarchical Genome Assembly Process (HGAP) in collaboration with Google has resulted in a 3.25 Gb assembly with a contig N50 of 4.38 Mb, and with the longest contig being 44 Mb. Compare this to total assembly size of 2.83 Gb and a contig N50 of 144 kb from the most recent reference-guided assembly using Illumina and BAC-clone finishing on the same sample" (Again: a contig N50 of 4.38 megabases, with the longest contig being 44 megabases!! Compare this to a total assembly size of 2.83 Gb and a contig N50 of 144 kilobases from the most recent reference-guided assembly using Illumina!!!)
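For anyone new to the metric: contig N50 is the contig length L such that contigs of length L or longer together cover at least half of the total assembly. A minimal Python sketch of that calculation follows; the contig lengths are made-up toy numbers, not the PacBio or Illumina data, and the function name is just for illustration.

```python
def contig_n50(lengths):
    """Return the N50 of a list of contig lengths.

    Walk the contigs from longest to shortest and return the length of the
    contig at which the running total first reaches half the assembly size.
    """
    if not lengths:
        raise ValueError("need at least one contig")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths in Mb (toy values for illustration only).
toy_contigs = [44, 20, 10, 8, 6, 4, 4, 2, 1, 1]
print(contig_n50(toy_contigs))  # 20: the 44 and 20 Mb contigs cover 64 of 100 Mb
```

A higher N50 means the assembly is in fewer, longer pieces, which is why 4.38 Mb against 144 kb is such a large difference.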
Sentiment: Strong Buy
In the Berkeley Drosophila Genome Project, the PacBio-only
assembly is a huge improvement over the reference genome, which is currently in its fifth iteration.
Researchers involved in the Berkeley Drosophila Genome Project have spent over 10 years working on
the reference genome using a combination of Sanger sequencing, BAC clones, and other manual and
labor-intensive approaches. Yet, using just one next-gen sequencing technology, and over just six weeks,
the PacBio technology was able to piece together regions that have proved particularly troublesome, like
heterochromatin and the Y chromosome, she said.
"There's been some persistent repeats that we couldn't get through, that [PacBio] did," "Having those very long reads allows you to get through large arrays of repeats."
Short-read sequencing technology is valuable for applications like identifying genes or fragments of
genes, and enables many genomes to be sequenced cost-effectively — but it doesn't give you the long-range
architecture, Bergman said.
The PacBio-only assembly also has some advantages over the hybrid PacBio/Illumina assembly,
One problem with error correction, he said, is that Illumina technology does not sequence well through
repetitive regions, so the Illumina-corrected reads in those repetitive regions are not as good. "You don't
really get the gain in the regions of the genome where you need them for the long-range assemblies," he
Genome Biology, estimating a cost of about $1,000
for de novo sequencing and assembly of microbes with PacBio technology. Additionally, the researchers
compared self-correction to hybrid correction and found that self-correction was often better in terms of
accuracy and contiguity.
Phillippy said that he expects these conclusions for microbial genomes to carry over to larger genomes,
especially as throughput and read lengths continue to increase, and the Drosophila genome is the first
evidence of that.
(Lots more on IHUB)
Publication date Jan 23, 2014 -- The standard of sequencing accuracy was set to 99.99% by the National Human Genome Research Institute (NHGRI) in 1998. While a single base-call for each position in a template may not achieve such accuracy, ...
While it is not necessary to assemble the sequence of an entire genome using such stringent requirements (e.g., the ALLORA assembler from Pacific Biosciences, Menlo Park, Calif., can use reads that only have 70% identity between each other), it remains preferable to construct inputs whose overlap can be detected with high identity before passing them to a third party assembler. Moreover, when there are repeats in a genome, it is also favorable to generate input that can clearly distinguish the different repeats. Finally, it is also preferable that some artifacts, e.g., chimeric reads and high quality region identification errors, due to sequencing reactions, can be filtered out before the assembly step.
In other words, SMRT® sequencing produces not only shorter fragments but also a number of longer ones. An alignment algorithm (e.g., as implemented in a program such as BLASR, from Pacific Biosciences, Menlo Park, Calif.) can be used to align all the reads to a longer read, thereby creating a mini-assembly for each long read.
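The "mini-assembly" idea can be illustrated with a toy Python sketch. It assumes the alignment step (done in practice by a program such as BLASR) has already placed each shorter read at a known offset on the long read, and that errors are substitutions only. The per-column majority vote is an assumption added here for illustration; the excerpt itself does not say how the pileup is used, and real reads also carry insertions and deletions.

```python
from collections import Counter


def toy_consensus(seed, placed_reads):
    """Majority-vote consensus over a long 'seed' read.

    seed: the long read as a string.
    placed_reads: list of (offset, sequence) pairs, i.e. shorter reads already
    aligned to the seed without gaps (a simplifying assumption).
    """
    # Each column starts with one vote from the seed read itself.
    columns = [Counter({base: 1}) for base in seed]
    for offset, seq in placed_reads:
        for i, base in enumerate(seq):
            pos = offset + i
            if 0 <= pos < len(seed):
                columns[pos][base] += 1
    return "".join(col.most_common(1)[0][0] for col in columns)


# Toy example: the seed has one wrong base (T at index 3), outvoted by three reads.
seed = "ACGTTCGA"
reads = [(0, "ACGA"), (2, "GATC"), (3, "ATCGA")]
print(toy_consensus(seed, reads))  # ACGATCGA
```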
(for link & full story,go To IHUB)
From USDA’s Agricultural Research Service, molecular biologist Sean Gordon discussed the need for long-read sequencing to map an organism’s transcriptome. His team analyzed the wood-decaying fungus Plicaturopsis crispa first with short reads and found that they were missing exons and other important information. “There is no path from short reads to accurate isoforms,” he said. They switched to SMRT Sequencing so they could observe, rather than infer, full-length transcripts. Gordon showed one particular gene to illustrate the success of the approach: with short-read sequencing, this gene was predicted to have six isoforms; with PacBio, the team observed and confirmed 118 isoforms instead. He also noted that generating a transcriptome from PacBio data does not require a reference genome. His team did have a reference for P. crispa, however, which they used to double-check the PacBio results and found them to be highly accurate. Gordon said that the long reads also enabled unexpected findings, such as abundant read-through transcription, in which multiple ORFs occurred in a transcript. (The recording is not available at this time.)
(for full story go to PACB website,click on BLOG)
(“We now see the technology as the future of genome sequencing. I don’t know why PacBio didn’t do this sooner,” said Gordon Gecko of the Satanic Investment Bank.)
Posted on January 29, 2014
Pacific Biosciences, the American DNA sequencing company, have announced typically predictable plans for their platform, the PacBio RS II.
The company announced on Tuesday that they would focus on “more reads, longer reads, and higher quality data”. Of course, this has been the goal of all sequencing technologies since the 1970s, and many researchers expressed surprise that PacBio seemed to have been following a different strategy up until now.
“With our unique 15% error rate, our aim was to produce a technology that would serve only a small set of niche markets”, sources at the company said. “Therefore, we produced just a few thousand really poor quality, relatively short reads” they continued.
All of this changed in 2012, when a new CEO and Chief Scientific Officer joined the company, and introduced the decades-old strategy of actually producing something useful.
“We came in and we took a look at the data, and we asked what the strategy was” said Michael Caterpiller, CEO. “The board thought that their really #$%$ error rate gave them a unique niche in the market, and that niche needed to be defended. Some in the company actually thought we should increase the error rate, and blow everyone else out of the market in terms of poor quality data” he continued.
“So we thought – why not introduce the strategy of every other sequencing company and increase throughput, read length and quality?” continued Jonas Coolback, chief scientific officer. “It was a revolutionary idea – all of a sudden, we’d made PacBio relevant again. We started producing what researchers had wanted since the very beginning”. (for full story go to IHUB for link)