As 2016 draws to a close, we’re taking a look back to capture some of the highlights of the year before it disappears into the blur of previous years.
One of the most exciting advances this year came from the realm of cell-free DNA. Whether it’s for tracking signs of cancer or detecting tiny signals from a fetus, accessing these circulating DNA fragments is allowing scientists to make some real progress in clinical applications. Because of the rarity of fragments of interest amid all the other circulating DNA, size selection has proven to be an extremely useful tool for isolating the target fragments for analysis. We worked with Rubicon Genomics this year on a protocol to enrich for cell-free DNA and reported results in a poster at the AGBT Precision Health meeting.
We also enjoyed seeing the science community continue to build on restriction-site associated DNA sequencing methods. From the explosion of new tools in this area, it may well have been the year of RAD! From a method to expand RAD-seq utility to low-quality DNA such as that in museum samples to an optimized protocol for plants, the approach has been embraced in areas of broad interest. We recapped several of the new tools and methods for RAD-seq.
2016 was also marked by the release of new and better human genome sequences. As sequencing and analysis technologies become more affordable and accurate, groups around the world are aiming for reference-grade assemblies for their populations. These projects have been made possible in part by landmark efforts from the Genome in a Bottle Consortium to improve the quality and reliability of variants called from these genomes. Two of the most impressive human genome assemblies released this year were the Chinese and Korean genomes. Both used an array of sequencing and other technologies and quickly became some of the highest-quality human assemblies ever generated.
We enjoyed some milestones here at Sage Science too. Our newest instrument, the PippinHT sizing platform, was cited in its first publication. Meanwhile, we continued development of our next tool, the SageHLS system for generating high molecular weight libraries. We described it in this blog and released more details at ASHG. The official launch is imminent, so stay tuned!
It was also an honor to see that NGS and related companies continued to make use of Sage products to optimize results. Here are some examples and recommendations we noticed this year:
Finally, we caught up with some customers to profile their great work. If you missed them, check them out now:
Hamid Ashrafi, North Carolina State University – blueberry breeding
Bruce Kingham, University of Delaware – genomics core facility
David Moraga Amador, University of Florida – ddRAD-seq and long-read sequencing
And now it’s on to 2017. From all of us here at Sage, we wish you a happy new year!
We’ve written before about the shift toward NGS-based technologies for the HLA typing market. HLA typing is used for everything from understanding autoimmune and infectious diseases to matching organ transplants to recipients.
But the HLA locus really makes scientists and clinicians work for their answers. This region of the genome is one of the most polymorphic, with more than 14,000 recognized HLA alleles so far. Affordable, high-throughput NGS platforms offer an appealing alternative for labs responsible for typing these genes.
In a paper published in the December 2016 issue of Clinical Chemistry, scientists from The Children’s Hospital of Philadelphia describe an NGS-based HLA typing workflow that uses a kit from Omixon to report relevant class I alleles. We were pleased to see that our Pippin Prep system proved valuable in the pipeline; the team used it to select DNA fragments between 650 bp and 1300 bp prior to sequencing on an Illumina MiSeq.
“Generation of Full-Length Class I Human Leukocyte Antigen Gene Consensus Sequences for Novel Allele Characterization” comes from lead author Peter Clark, senior author Dimitri Monos, and collaborators. In it, they describe their evaluation of the Omixon Holotype HLA assay using samples from 50 individuals. “HLA genotyping results and fully phased consensus sequences were successfully generated for all 50 participants using the Omixon Twin algorithm (300 total alleles) and were found to be concordant with SBT/SSP genotyping results,” the authors report.
Interestingly, the team found that 7.7% of samples featured novel alleles and predicted that this discovery rate means “there are likely to be numerous yet undiscovered alleles of unknown significance.” They add that “full-length gene characterization is paramount for unambiguous HLA genotyping and facilitates a deeper understanding of HLA gene polymorphisms and the eventual role they may play in the immune response.”
Mendelspod’s recent interview with Marco Marra of the BC Cancer Agency and the University of British Columbia is well worth a listen. In the podcast, Marra describes his team’s use of genome and transcriptome sequencing for patients whose cancer is considered incurable.
Marra first captured attention in this area in 2009 when he reported his lab’s use of whole genome sequencing to inform treatment decisions for a patient with a rare adenocarcinoma. Genome and transcriptome analysis revealed that the tumor was driven by the RET oncogene. The patient, for whom there had been no clear therapy option, was treated with a RET inhibitor that was in clinical trials at the time — and the tumor shrank significantly.
Since then, Marra has parlayed that individual project into a pilot study of how whole genome sequencing could be expanded to other cancer patients. That study was broadened again in 2014, and to date his team has analyzed some 400 people who essentially have no other options for treatment. The scientists look for all sorts of mutation types, from SNPs to structural variants and more. One major challenge has been off-label use of drugs: in many cases, genome analysis points to a therapy that’s not indicated for the patient’s type of cancer, and gaining access to the therapy is hit or miss. As the cancer genomics program has expanded, Marra wrestles with questions like, “What is the meaning of having whole genome analysis that points you to a particular agent that you can’t get?” As he told interviewer Theral Timpson, “These are deep conversations that are happening within our environment and probably elsewhere.”
Marra is also keeping a close eye on how clinicians apply information from the genomic analysis. Doctors who just get a report of mutations tend to be less comfortable incorporating that data into treatment decisions. But a weekly conference that allows physicians, scientists, bioinformaticians, and pathologists to walk through case studies often prompts useful interdisciplinary discussions and frequently leads to increased implementation of genomic results, he said.
If you’ve got a little time, we highly recommend listening!
We normally wait until papers come out in scientific journals before reporting on them here, but there are so many great preprints featuring Sage Science tools that we couldn’t resist pointing them out. (On a side note, the rising number of biology-focused papers posted as preprints is a terrific trend. We’re thrilled to see the peer-review process becoming more transparent and results getting out to the community faster.)
Here are quick recaps of several preprints, all available through bioRxiv.
Erwin Datema, Raymond J.M. Hulzink, Lisanne Blommers, Jose Espejo Valle-Inclan, Nathalie Van Orsouw, Alexander H.J. Wittenberg, Martin De Vos
Posted: November 1, 2016
In this paper, Keygene scientists used Oxford Nanopore sequencing technology to analyze the fungal pathogen Rhizoctonia solani, generating a highly contiguous 54 Mb assembly. The team focused on optimizing methods for handling high molecular weight DNA to produce the longest sequencing reads, using BluePippin’s high-pass mode to remove smaller DNA fragments. According to the paper, this approach allows the lab to generate a low-cost eukaryotic fungal genome assembly within a week.
Conrad P.D.T. Gillett, Andrew J. Johnson, Iain Barr, Jiri Hulcr
Posted: September 12, 2016
In this preprint, scientists from the University of Florida and the University of East Anglia evaluated a sequencing-based approach to monitoring biodiversity in a region using dung beetles. Since these beetles regularly consume vertebrate dung, the contents of their intestines can reveal quite a bit about animals in the area. They sequenced samples from 10 species of dung beetles collected from a savanna region in southern Africa, and then compared the mitochondrial DNA results against public databases. Results matched animals expected in the area, such as zebra, cattle, goat, and wildebeest. DNA libraries were size-selected using the SageELF system followed by sequencing on an Illumina NextSeq.
Fabio Zanini, Johanna Brodin, Jan Albert, Richard Neher
Posted: September 25, 2016
Researchers at Stanford University, the Karolinska Institute, and the Max Planck Institute collaborated in this effort to establish more accurate and reliable methods for deep sequencing of viral genomes without the amplification biases and sequencing errors that often occur. In a study focused on sequencing populations of HIV-1, the team adjusted the standard sequencing workflow to reduce artifacts and errors. One of those changes involved replacing bead-based size selection with BluePippin sizing, which yielded a more uniform size distribution to meet the insert size needed by the MiSeq platform. With this approach, the scientists were able to detect rare mutations down to 0.2% and to avoid PCR recombination.
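To make that 0.2% detection limit concrete, here is a minimal sketch of how a minor-variant frequency is computed from read counts at a single position of a deeply sequenced sample. This is a conceptual illustration only; the function name and the coverage numbers are ours, not from the preprint.

```python
# Hypothetical sketch: estimating a minor-variant frequency from read
# counts at one position of a deep-sequenced viral sample.

def variant_frequency(base_counts, alt_base):
    """Fraction of reads supporting alt_base at this position."""
    total = sum(base_counts.values())
    return base_counts.get(alt_base, 0) / total if total else 0.0

# At 50,000x coverage, 100 reads supporting the alternate base
# correspond to a variant present at 0.2% of the population.
counts = {"A": 49_900, "G": 100}
print(f"{variant_frequency(counts, 'G'):.3%}")  # 0.200%
```

The point of reducing PCR and size-selection artifacts, as the authors did, is that at these depths even a handful of error reads would swamp a real variant this rare.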
Gao Shan, Xiaoxuan Tian, Yu Sun, Zhenfeng Wu, Zhi Cheng, Pengzhi Dong, Bingjun He, Jishou Ruan, Wenjun Bu
Posted: October 6, 2016
This work from scientists at Nankai University and Tianjin University of Traditional Chinese Medicine focuses on mitochondrial biology. They used the Iso-Seq method to generate “the first full-length human mitochondrial transcriptome from the MCF7 cell line based on the PacBio platform.” As part of the study, the team used transcriptome data publicly released by PacBio, for which size selection was performed on a SageELF to create six binned libraries.
The Oxford Nanopore team has been speaking recently about their use of our BluePippin automated size selection system for optimizing the read length obtained from nanopore sequencers. For anyone interested in the Oxford platforms who hasn’t seen this information, here’s a quick recap.
As we’ve seen with PacBio, the other long-read platform, single-molecule sequencers tend to produce reads as long as the fragments fed to them. Naturally, users interested in maximizing the read lengths of these systems want to feed them only the longest possible fragments. The simplest and most effective way to do that is what we call high-pass sizing, or selecting all DNA fragments longer than a certain size threshold during the sample prep process.
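The logic of high-pass sizing can be sketched in a few lines. This is purely a conceptual illustration of the selection criterion, not anything the instrument runs; the cutoff and fragment lengths are made up.

```python
# Conceptual sketch of high-pass size selection: discard every fragment
# shorter than a chosen cutoff so only long molecules reach the sequencer.
# Fragment lengths (in bp) and the cutoff are illustrative.

def high_pass(lengths_bp, cutoff_bp):
    """Keep only fragment lengths at or above the cutoff."""
    return [length for length in lengths_bp if length >= cutoff_bp]

library = [800, 4_500, 12_000, 30_000, 55_000]
print(high_pass(library, cutoff_bp=10_000))  # [12000, 30000, 55000]
```

On a physical instrument the same idea is applied during electrophoresis rather than in software, but the outcome is the same: the short fragments never make it into the sequencing library.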
For the MinION and PromethION sequencers from Oxford Nanopore, the company recommends BluePippin sizing for various protocols. This library prep workflow for both sequencing systems uses BluePippin to eliminate shorter fragments; one example of outcomes shows a whopping 255 kb read from an E. coli experiment. There’s a similar rationale for recommending BluePippin for de novo whole genome assembly with the MinION system. And this protocol demonstrates how automated sizing fits into a sequence-capture approach for library prep prior to nanopore sequencing.
We’re delighted that BluePippin is showing such utility for nanopore sequencing. If you’re an Oxford Nanopore customer who doesn’t already have access to one of these instruments, contact us to learn how BluePippin can make a difference in your pipeline.