
Pacific Biosciences of California, Inc. Message Board

  • paulieme60 Jul 11, 2013 4:43 PM


    Case Study (June 2013) COD GENOME ASSEMBLY:
    LONG READS OFFER UNIQUE INSIGHT
    Scientists at the University of Oslo’s Centre for Ecological and
    Evolutionary Synthesis (CEES) applied long PacBio® reads to
    a genome that was proving particularly difficult to assemble.
    Today, sequencing problems associated with the Atlantic
    cod genome are a thing of the past — and researchers are
    using their new assembly as the foundation for a major
    resequencing effort that’s just getting started.
    Recent work using multi-kilobase sequence reads
    generated from Single Molecule, Real-Time (SMRT®)
    technology is enabling a dramatically improved genome
    assembly for cod, an economically important fish species.
    In many ways the cod genome seemed like a puzzle that
    might never be fully solved, but the Pacific Biosciences®
    sequencing platform made significant inroads — and
    just in time, as the team of researchers working on cod
    recently received funding to resequence 1,000 more of
    them. Being able to base these new efforts on a reliable
    genome assembly will make future results far more

    • "The team’s initial attempt at this
      extracted DNA from the same cod
      used for the original study and ran
      mate-pair sequencing with the
      Illumina® platform. That data helped
      explain why the genome was so
      fragmented, though it could not fix
      the problem. The cod sequenced
      was from the wild population, a
      normal diploid fish whose marked
      heterozygosity — sequence
      differences between maternal and
      paternal chromosomes — seemed to
      be causing issues during assembly.
      “Besides the SNPs that you would
      normally expect, we see large
      differences over hundreds of bases —
      sometimes even kilobases — either
      missing from the other chromosome,
      or causing differences in regions
      when we align them,” Nederbragt
      explains. “This confuses assembly
      Another problem for the assembly
      was the presence of many short
      tandem repeats (STRs). “They’re
      so long that they’re longer than
      the Illumina reads,” Nederbragt
      says, noting that these regions are
      challenging for Sanger and 454
      sequencing as well. “We estimate
      that 10 to 20 percent of the gaps are
      flanked, and probably spanned, by
      those sequences.”

      • 1 Reply to paulieme60
      • "What the cod team really needed was
        sequence data long enough to span
        these regions of heterozygosity and
        STRs. Their big break came in 2012
        when the Oslo center acquired the
        PacBio RS.

        Building a Better Reference
        As they tested out the new
        instrument, Nederbragt and his
        colleagues ran their default cod
        sample to get a sense of the PacBio
        performance with DNA they already
        knew very well. “When we looked
        at these PacBio reads mapping
        to the assembly, we saw them
        crossing large gaps of even multiple
        kilobases,” he says. It was a moment
        the team had been anticipating for
        years. “I could see that the problem
        of STRs and heterozygosity could
        be addressed by this technology,”
        Nederbragt adds.
        Indeed, the multi-kilobase reads
        from the PacBio RS confirmed what
        the team had suspected all along:
        that these short tandem repeats
        were preventing other sequencing
        technology from getting through
        gaps, and that the proliferation
        of these regions was causing a
        fragmented assembly. For the first
        time, the team actually had data
        indicating that its theory about the
        heterozygosity problem was correct:
        the reads showed long stretches
        of different sequence flanked by
        sequences that matched each other,
        indicating heterozygous regions."
