As we start to make lists of New Year’s resolutions (with bets on how long they’ll last), it’s the perfect time to take a moment and absorb the themes and highlights of 2017. In our corner of the world, that means impressive advances around HMW DNA and long-read sequencing; novel biological insights about infectious disease, cancer detection, and more; plus new and improved sample prep methods for DNA sequencing.
For us, one of the breakthroughs of the year came in the form of CATCH, or Cas9-assisted targeting of chromosome segments. This method from Yuval Ebenstein and collaborators allows users to target large or complex regions of the genome for cost-effective sequencing. The innovation is based on CRISPR, taking advantage of the precise cutting activity of the RNA-guided Cas9 enzyme to snip out the region of interest. The original method, which relied on gel electrophoresis, was improved by swapping in the SageHLS instrument for a more streamlined and automated process with excellent recovery.
If you polled the Sage team about the best part of our jobs, it would be unanimous: getting to know our customers! This year, we had the honor of profiling great work from Anna Selmecki at Creighton University, who is using BluePippin to boost library recovery for investigations into genome instability in fungi, and the Broad Institute’s Michelle Cipicchio, who helps optimize methods before they are put into production. She has been using the PippinHT platform to get the best results from the 10x Genomics Chromium system.
Of course, we also spend plenty of time keeping up with the literature, especially the growing number of preprints. One of our favorite studies this year came from scientists in Brazil who reconstructed the transmission path of a recent chikungunya outbreak in their country. The team’s budget was tight, but their results show just how much can be accomplished with creativity and a little help. We were also particularly impressed by a preprint from UK scientists who demonstrated that size selection can significantly improve results from circulating tumor DNA studies, with implications for liquid biopsies in general. And the field of long-read sequencing continued to heat up, with lots of advances including this great comparison of PacBio and Oxford Nanopore platforms for transcriptome analysis.
As always, we enjoyed hearing from luminaries in the genomics field through Mendelspod interviews this year. If you missed the podcasts with Mark Akeson, Deanna Church, Yuval Ebenstein, or Evan Eichler, we recommend carving out some time to listen.
From all of us at Sage Science, we wish you and yours a healthy and happy holiday season.
We always keep our eyes peeled for interesting new research from scientists using Sage Science automated DNA size selection instruments, and several recent preprints caught our attention. Here’s a look:
Authors: Liang Gong, Chee-Hong Wong, Wei-Chung Cheng, et al.
Scientists from The Jackson Laboratory for Genomic Medicine and China Medical University in Taiwan teamed up to detect structural variants in breast cancer genomes using a custom-built pipeline called Picky. They chose nanopore sequencing to generate long reads, identifying SVs with excellent sensitivity and specificity and finding that repetitive DNA was the primary source of cancer-related variation. This approach could prove useful in efforts to assess genome stability in a tumor over time. The team used BluePippin to size-select 12 Kb libraries prior to nanopore sequencing.
Authors: Jonas Korlach, Gregory Gedman, Sarah Kingan, et al.
In this work, scientists seeking to improve upon short-read genome assemblies for two birds deployed long-read PacBio sequencing to generate new diploid assemblies. The effort yielded assemblies with megabase-sized contigs, with a 150-fold improvement in contiguity for the zebra finch genome and a 200-fold improvement for Anna’s hummingbird. Since the birds are both models for vocal learning, the higher level of completeness, correction of previous misassemblies, and more accurate gene sequences will be important for many future studies. The team used BluePippin to size-select libraries for zebra finch and hummingbird prior to sequencing.
Authors: Devang Mehta, Matthias Hirsch-Hoffmann, Andrea Patrignani, et al.
Scientists developed a new method for deeply sequencing viruses that can accurately represent populations with high levels of homology across genomes. They incorporated long-read sequencing with random circular amplification enrichment and a novel de-concatenation protocol, validating their results in a large population of geminiviruses. BluePippin was used for size selection prior to the enrichment step and again during library preparation for sequencing.
Authors: Derrick Thrasher, Bronwyn Butcher, Leonardo Campagna, et al.
Continuing with the avian theme, researchers used ddRAD-seq to analyze as many as 600 SNPs from up to 240 members of a population, validated in this case with a study of Malurus lamberti and other bird species. By comparing results to microsatellite markers, they report that the ddRAD-seq method “results in substantially improved power to discriminate among potential relatives and considerably more precise estimates of relatedness coefficients.” The pipeline they present, which relies on BluePippin for DNA fragment sizing, can be used with any other bird species, and with other organisms as well.
If you’ve ever relied on the human reference genome, don’t miss this podcast with assembly pioneer Deanna Church. Mendelspod’s Theral Timpson interviews the genome informatics expert who made her name as an integral part of the reference project at the National Center for Biotechnology Information. Today, she’s Senior Director of Applications at 10x Genomics, where she’s working on everything from haplotyping to single-cell genomics.
In the podcast, Church offers insight into various efforts to improve the quality of the human reference genome, as well as a look at robust new work to characterize structural variation. She talks about the importance of phasing for structural variant detection, which explains why her NCBI team was so adamant about moving toward a haplotype-aware genome assembly instead of using “averaged-out alleles,” she says. Short reads can be especially problematic for this use because they can’t always clearly distinguish between two alleles of a heterozygous variant. Using problematic alignments could lead to a confounded analysis, she adds, “because you’re mixing the reads from those two genotypes.”
Church also calls for better integration of variant findings. “As a community we’ve had a very individual variant-centric view of genome analysis,” she tells Timpson, contending that viewing variants and their interactions with each other more holistically would provide much-needed information for genome interpretation efforts. She notes that combining technologies, an approach showcased in a recent preprint she co-authored with the Human Genome Structural Variation Consortium, is essential for a holistic approach. To that end, linked-read technology like 10x’s is a great complement to other methods. Church says linked reads enable de novo assembly and haplotype reconstruction at scale; customers have already published impressive demonstrations of this type of work.
Sample prep came up in the discussion as well. “You definitely want to try to optimize for longer molecules,” Church says about 10x technology, noting that recommended protocols are in place and under development for a range of sample types. (We’re pleased to be included in 10x protocol recommendations.)
Church also spoke about single-cell genomics, an area she is eager to explore. “Single cell is obviously one of the most exciting ways to think about doing science these days because it just allows us to get this level of resolution that’s not accessible with bulk,” she says, suggesting that this approach will be especially useful for understanding developmental biology. In some ways, she adds, the state of single-cell genomics reminds her of the early days of the Human Genome Project: there’s widely recognized potential, but the path forward isn’t completely clear yet.
It’s a great discussion, and we hope you have time to listen!
We had a blast at ASHG last week, and wanted to thank all the attendees who stopped by our booth. We were delighted to meet you all!
If you couldn’t make it to ASHG, here’s what you missed: two running themes dominated the sessions and conversations, namely mega-scale studies and the shift toward studying variants more complex than SNPs.
It wasn’t so long ago that a 100-person study would have led to an impressive talk at ASHG. But this year, speakers routinely cited studies with tens of thousands, or even hundreds of thousands, of participants. From the Million Veteran Program to the Estonian Biobank, these programs are adding so much to genetic databases that scientists are finally getting a handle on complex hereditary traits such as height. Amid these studies, though, was a continuing push to better represent more ethnic groups to achieve real diversity in publicly available databases. We wholeheartedly support those efforts. Without breaking down the barriers facing underrepresented groups, we will never achieve precision medicine for everyone.
Another shift came in variant discovery and analysis. More and more, scientists are pushing past SNPs to focus on larger structural variants. The community’s initial focus on SNPs was guided by technology (we could spot single-nucleotide changes, so that’s what we looked for), but with improvements to sequencing and other analysis tools using high molecular weight DNA, it is now possible to detect structural variants more comprehensively and reliably. These variants have already been demonstrated to cause diseases, and it was evident at ASHG that finding and cataloging them is a major priority for the genetics field to better understand genome function.
Thanks again for catching up with us in Orlando, and we’re already looking forward to ASHG 2018 in San Diego!