One of our favorite sessions at last week’s PAG meeting focused on a major sequencing effort to understand the coffee genome. The presentation, from scientists at Cenicafé (Colombia’s National Coffee Research Center), highlighted a new project designed to characterize elements of the coffee genome that might help breeders create strains better suited to a changing climate.
Coffee production has been hard hit in recent years: a coffee leaf rust epidemic in Latin America, for instance, has cost more than $1 billion. The plant has recently become more susceptible to insects and diseases as a result of climate change, the scientists noted.
They teamed up with PacBio for long-read sequencing of Coffea arabica cv. Caturra, an allotetraploid organism whose genome clocks in at about 1.3 Gb. We were delighted to see that they used our BluePippin automated DNA sizing platform to generate the longest possible PacBio reads. They produced about 60x coverage and built the first assembly of this genome.
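For a rough sense of scale, the figures above can be turned into a back-of-the-envelope estimate of how much sequence 60x coverage of a 1.3 Gb genome represents. This is only a sketch: the genome size and coverage are the approximate values quoted in the talk, while the mean read length is a hypothetical number chosen for illustration and was not reported.

```python
import math

def bases_needed(genome_size_bp, coverage):
    """Total bases required to reach a given average coverage."""
    return genome_size_bp * coverage

def reads_needed(genome_size_bp, coverage, mean_read_length_bp):
    """Approximate read count, rounded up to a whole read."""
    return math.ceil(bases_needed(genome_size_bp, coverage) / mean_read_length_bp)

GENOME_SIZE_BP = 1_300_000_000   # about 1.3 Gb, as quoted for C. arabica
COVERAGE = 60                    # about 60x, as quoted
MEAN_READ_LENGTH_BP = 10_000     # hypothetical value, for illustration only

total_bases = bases_needed(GENOME_SIZE_BP, COVERAGE)
total_reads = reads_needed(GENOME_SIZE_BP, COVERAGE, MEAN_READ_LENGTH_BP)
print(f"{total_bases / 1e9:.0f} Gb of sequence, roughly {total_reads:,} reads")
```

Under these assumptions the project would involve on the order of 78 Gb of raw sequence; a different mean read length changes the read count but not the total.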
The scientists plan to validate the assembly using a high-quality assembly of C. eugenioides, the diploid maternal ancestor of C. arabica. That genome assembly consists of sequence data from Roche 454 platforms as well as Illumina’s Moleculo technology.
The team is hopeful this work will make a difference in plant breeding to yield a hardier, healthier coffee plant. As Alvaro Gaitan and his colleagues wrote in their session abstract, the work “should dramatically improve our understanding of coffee genetics and genomics providing direct applications to breeders for climate change adaptation.”
This year’s PAG meeting featured the usual treats of interesting organisms being sequenced (koala!), reports from great plant and animal projects, and cool new technology approaches. While we found all of it fascinating, the theme of the meeting for us was large DNA. From the sessions we attended to the queries we heard most often from scientists visiting our booth, there was more interest than ever in sequencing with very long fragments of DNA.
We’ve been working with large DNA for several years now, and many of our customers use their BluePippins to prepare libraries of the largest possible fragments for sequencing on the PacBio platform. This technology pairing has yielded excellent results, in many cases even doubling the average read length generated by the sequencer.
Scientists were interested in a variety of other methods for using large DNA as well. One example is BioNano Genomics, which offers a genome mapping tool that allows researchers to explore structural variation and large genomic elements. There was also a lot of talk about 10X Genomics, which just announced a molecular barcoding product that can be used with short-read sequencers to view the long-range information that typically can’t be resolved by that data alone.
There were also some library prep tools geared toward analysis of large DNA fragments. Lucigen and Dovetail Genomics both offer prep kits for the generation of mate-pair libraries, and both show excellent results for increasing the contig N50 numbers that can be produced from typical sequencing workflows.
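For readers less familiar with the metric, contig N50 is the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the assembly size. The sketch below shows one common way to compute it; the contig lengths are made-up numbers for illustration and do not come from any of the products mentioned.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half
        # of the total assembly length defines the N50.
        if running >= half_total:
            return length

# Made-up contig lengths (arbitrary units), before and after
# long-range information joins the two largest contigs.
before = [80, 70, 50, 40, 30, 20, 10]
after = [150, 50, 40, 30, 20, 10]

print(n50(before), n50(after))
```

In this toy example the total assembly length is unchanged, but joining two contigs moves the N50 upward, which is why the metric is often used to summarize gains in contiguity.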
So why all this interest in large DNA? We believe that after years of generating genome assemblies with short-read sequencers alone, scientists have realized that they are not completely capturing important genomic elements such as copy number variants and repetitive elements. These new services and products all help scientists analyze genome biology more fully by using their existing data or by adding a new layer of information to help resolve these regions. It’s an exciting time for the field as we now have the ability to go back to draft genome assemblies and significantly improve their quality to benefit global communities of researchers.
San Diego, here we come! We’re getting ready for the annual International Plant and Animal Genome conference, a stellar event that highlights some of the latest and greatest work happening in the agbio realm. PAG is the one conference we attend every year where human research never makes an appearance. The spotlight is on the plants and animals, as well as the dedicated communities of researchers studying them.
This year we will once again be co-sponsoring a grant program for “The Most Interesting Genome in the World” with Pacific Biosciences. Scientists who submit a proposal for their favorite genome have the chance to get that organism sequenced on the PacBio long-read sequencing platform. Find out more at www.pacb.com/smrtgrant.
We’re pleased to see that so many PacBio customers have adopted our BluePippin automated DNA sizing platform to take full advantage of long-read sequencing. A recent paper from a team of scientists, published in the journal Scientific Data, presents PacBio sequence data for five organisms: Escherichia coli, Saccharomyces cerevisiae, Neurospora crassa, Arabidopsis thaliana, and Drosophila melanogaster. Libraries for each genome were size selected using BluePippin, and the data were publicly released via NCBI’s Sequence Read Archive. We’re sure this effort will get noticed by PAG attendees!
If you’ll be at the meeting in San Diego next week, please stop by booth 228 to meet the Sage Science team and learn more about how automated DNA size selection can help improve your data. We’d love to say hello!
The holiday season always triggers some nostalgia, and here at Sage Science headquarters we’re thinking about what a big year 2014 has been for us. We hit some big milestones, including our 1,000th Pippin customer, moving to larger office space to fit our expanding operations, and our first foray into the proteomics market. We launched two major products this year, and we’re really excited about both: the SageELF, which performs whole-sample fractionation for DNA or proteins, and the PippinHT, a high-throughput version of our automated DNA sizing platform that can handle as many as 24 samples in a run.
It has been really gratifying to see just how many applications our clever customers are tackling with their Pippin instruments. We detailed several of these applications in a blog series on how people are pairing Pippin with Illumina NGS platforms, and we also got some great new app notes about using BluePippin with PacBio’s sequencer. We were proud to see the first customer poster highlighting work on the new SageELF.
We attended a lot of conferences this year, and the takeaway from all of them is that genomic studies are scaling more rapidly than even the most optimistic researchers might have predicted. There’s been tremendous growth in study size, notable expansion in the kinds of organisms being sequenced, and traction for genomic technologies beyond the traditional community into areas like histocompatibility typing and more. Based on the momentum, we are confident that next year holds even more awe-inspiring progress toward goals such as battling cancer, understanding the genetic basis of rare and common diseases, and influencing the microbiome to improve human health. You can look back at specific conference coverage for these meetings: ASHG, Beyond the Genome, ASMS, SFAF, ASM, ABRF, AGBT, and the PacBio user group meetings (spring and fall).
There were so many terrific publications from Sage customers this year. We’re impressed by all of them, but if you only have a few minutes, these are not to be missed:
• Evan Eichler’s effort to improve the human reference genome
• Proof-of-principle showing that genome editing can shorten a pathogenic repeat expansion into non-pathogenic range
• ABRF’s evaluation of RNA-seq platforms
• NHGRI’s analysis of antibiotic-resistant disease transmission at the NIH Clinical Center
Finally, we had the privilege of showcasing some Pippin customers and their terrific work this year. Check them out:
Thanks for a great 2014, and happy holidays from all of us at Sage!
If you’re performing isoform sequencing on the PacBio platform, check out this new protocol on DevNet. PacBio recommends size fractionation of cDNAs into four pooled fractions using our SageELF. There’s also an optional step in the protocol for larger libraries to use SageELF for the removal of shorter fragments prior to sequencing.
We’re glad to see the new protocol. Scientists are already doing impressive work with long PacBio reads to more accurately assess transcriptomes, and it’s great to know that our instrument can help people achieve insightful results.
To get a better sense of how the instruments function in a pipeline, check out this poster from researchers at the University of Washington and PacBio. It illustrates a gene expression study of various human cell types, yielding some transcripts longer than 10 kb.