A new podcast from Mendelspod features an interesting interview with Barrett Bready, CEO of electronic mapping firm Nabsys, who emphasizes the growing need to incorporate structural variation data into genome studies.
In the discussion, Bready describes his company’s platform, which relies on voltage-powered, solid-state nanodetectors to generate map-level information. Each nanodetector can cover 1 million bases per second, Bready said, and can be multiplexed for a highly scalable system. It’s a “really high-speed, highly scalable way of getting structural information,” he added.
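As a rough back-of-the-envelope illustration of what multiplexing could mean at that quoted rate, the sketch below estimates idealized mapping time for a human-sized genome. Only the 1 million bases per second figure comes from the interview. The genome size (about 3.1 Gb), the 30x coverage target, the detector counts, and the assumption that throughput scales linearly with no idle time are all illustrative assumptions, not Nabsys specifications.

```python
def mapping_time_seconds(genome_size_bp, coverage, n_detectors,
                         rate_bp_per_s=1_000_000):
    """Idealized time to scan genome_size_bp * coverage bases.

    Assumes every detector runs continuously at rate_bp_per_s and that
    throughput scales linearly with detector count (an assumption).
    """
    total_bases = genome_size_bp * coverage
    return total_bases / (n_detectors * rate_bp_per_s)


# Hypothetical inputs: ~3.1 Gb human genome, 30x coverage.
GENOME_BP = 3_100_000_000
COVERAGE = 30

one_detector = mapping_time_seconds(GENOME_BP, COVERAGE, n_detectors=1)
hundred_detectors = mapping_time_seconds(GENOME_BP, COVERAGE, n_detectors=100)

print(f"1 detector:    {one_detector / 3600:.1f} hours")
print(f"100 detectors: {hundred_detectors / 60:.1f} minutes")
```

Real-world throughput would depend on sample loading, detector duty cycle, and data quality, none of which this sketch models.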
Bready noted that the genomics community has realized the need for long-range information, estimating that known structural variants now make up about 60 Mb of the human genome, a number that has increased rapidly in the last few years even as the amount of sequence attributed to single-nucleotide variants has stayed the same. Nabsys aims to democratize access to structural information by producing a cost-effective mapping tool for routine analysis of these large variants.
This information will complement short-read data, Bready said. Short-read sequencing necessarily sacrifices assembly contiguity because DNA must be cut into small fragments before it is sequenced. The Nabsys platform works with high molecular weight DNA to capture extremely long-range information. He also said that electronic mapping offers more value than optical mapping technologies.
Beta testing for the new platform is expected to begin early next year.
A new preprint from the Hoekstra lab at Harvard makes great use of the double digest RAD-seq protocol to better understand reproductive barriers and speciation in closely related species of mice. Since it was the Hoekstra lab that gave us the ddRAD-seq method, we took notice when this preprint became available.
The paper comes from Hopi Hoekstra and Emily Delaney, a Harvard grad student who is now a postdoctoral fellow at the University of California, Davis. In “Sexual imprinting and speciation in two Peromyscus species,” the scientists describe how sexual imprinting, typically a learned trait, contributes to sexual isolation of Peromyscus leucopus, the white-footed mouse, and P. gossypinus, the cotton mouse.
One area of interest at the start of this project was determining the genetic or learned mechanisms underlying sexual isolation. The scientists “used genomic data to first assess hybridization in the wild and conclusively found that the two species remain genetically distinct in sympatry despite rare hybridization events,” they report. “We find that these mating preferences are learned in one species but may be genetic in the other: P. gossypinus sexually imprints on its parents, but innate biases or social learning affects mating preferences in P. leucopus.”
The study involved using ddRAD-seq to analyze 376 mice. In that workflow, the team used Pippin Prep to select fragments ranging from 265 bp to 335 bp. Libraries were sequenced with the Illumina platform.
“Our study supports an emerging view that sexual imprinting could be vital to the generation and maintenance of sexual reproductive barriers,” the authors conclude. “Examining the role of sexual imprinting in similar cases of speciation driven by sexual reproductive barriers will continue to expand our understanding of the role of behavior in speciation.”
At the Broad Institute, scientist Michelle Cipicchio is part of the technology development team responsible for optimizing new methods or sample types before they’re implemented on the organization’s industrial-scale exome and whole-genome sequencing pipeline. Recently, she’s been working with the Chromium platform from 10x Genomics, and part of getting it ready for production involved implementing the PippinHT for automated DNA size selection.
The technology development team is focusing on whole genome analysis with the Chromium platform. To put the workflow through its paces, they’re running a pilot project on 450 whole blood samples for scientists conducting a large schizophrenia study.
Cipicchio began working with automated DNA size selection from Sage Science at the recommendation of 10x Genomics. “The first step in the 10x process requires the longest DNA molecules that you can acquire,” she says. Since the Broad often uses legacy samples that have gone through multiple freeze/thaw cycles, her team doesn’t have the luxury of expecting high-quality, intact DNA. “For 10x, these long molecules are really necessary and most of our samples don’t have a ton of that kind of material,” Cipicchio adds. She began using BluePippin to remove smaller fragments prior to library construction. The team evaluated four samples with and without Pippin size selection and found that they were consistently able to get longer phasing data with automated size selection. To ramp up capacity so all 450 samples can be run with size selection prior to Chromium processing, the team upgraded to the higher-throughput PippinHT platform.
Optimization work for the workflow is still underway. Cipicchio and the team have run about 100 of the 450 samples so far, so they have lots more opportunities to polish and perfect the protocol before it’s ready for production mode.
The Sage Science team was delighted to attend and co-sponsor PacBio’s annual East Coast user group meeting in Baltimore last week, particularly since there was a half-day session devoted to our favorite subject: sample prep.
There were plenty of customer presentations during the sample prep workshop, and it was great to see so many PacBio users deploying BluePippin, PippinHT, or SageELF in their sequencing workflows. Melissa Laird Smith from the Icahn School of Medicine at Mount Sinai may have put it best when she told attendees that the two most important components for PacBio sample prep are upfront quality control and size selection. The QC step, of course, evaluates sample quality and quantity to ensure that long-read sequencing is viable. Size selection allows users to really make use of their PacBio platforms by eliminating shorter fragments and letting the sequencer focus on the longest fragments available. The reads from those long fragments are often used as seed reads to anchor assemblies, making them critical for achieving optimal contiguity. Smith said her team uses BluePippin or PippinHT to select either 10 kb – 50 kb or 20 kb – 50 kb ranges, depending on the sample.
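To make the effect of those size windows concrete, here is a minimal sketch that applies the two ranges Smith mentioned to a list of fragment lengths and compares the mean length before and after. The fragment lengths are invented for illustration, and the `size_select` helper is hypothetical. It models size selection as a simple hard cutoff, which is an idealization of what an instrument does.

```python
from statistics import mean


def size_select(fragment_lengths_bp, min_bp, max_bp):
    """Keep only fragments whose length falls inside [min_bp, max_bp]."""
    return [n for n in fragment_lengths_bp if min_bp <= n <= max_bp]


# Hypothetical fragment lengths in bp (illustrative values only).
library = [1_500, 4_000, 8_000, 12_000, 22_000, 35_000, 48_000, 60_000]

# The two windows mentioned in the talk: 10-50 kb and 20-50 kb.
selected_10kb = size_select(library, 10_000, 50_000)
selected_20kb = size_select(library, 20_000, 50_000)

print("unselected mean:", mean(library))
print("10-50 kb mean:  ", mean(selected_10kb), selected_10kb)
print("20-50 kb mean:  ", mean(selected_20kb), selected_20kb)
```

In this toy library, the tighter window keeps fewer fragments but raises the mean length, which mirrors the trade-off between yield and read length that depends on the sample.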
Sonny Mark, a field application scientist manager at PacBio, also took the opportunity to introduce attendees to the SageHLS extraction and purification instrument we launched earlier this year. Designed expressly for the kind of high molecular weight DNA that single-molecule systems require, the SageHLS platform should be a nice fit for long-read sequencing pipelines. Users simply load their samples (up to four at a time) and the instrument extracts or purifies DNA fragments as long as 2 Mb. The fragments are automatically sorted by size into six collection bins. We anticipate that this product will work well for scientists studying structural rearrangements, copy number variation, haplotype phasing, and other applications for which HMW DNA is advantageous.
During the rest of the user group meeting, we thoroughly enjoyed learning about so many impressive results users have generated with their PacBio systems, from reference-grade genome assemblies to in-depth annotations. Congratulations to everyone who contributed!
A recently shared preprint demonstrates the effectiveness of size selection for nanopore sequencing, relying on the PippinHT automated DNA sizing platform for high-throughput pipelines.
“Mapping And Phasing Of Structural Variation In Patient Genomes Using Nanopore Sequencing” comes from lead author Mircea Cretu Stancu and collaborators at University Medical Center Utrecht, the University of Torino, and other institutions. In it, the scientists report results from using an Oxford Nanopore MinION to sequence the genomes of two patients with congenital abnormalities, with a focus on structural variant (SV) detection. “Long-read sequencing is breaking ground for the discovery of SVs at an unprecedented scale and depth,” they write. The team used the PippinHT system to size-select DNA libraries for the second patient prior to sequencing.
The effort, which produced the first known whole human diploid genome assemblies using the MinION, was a success. “We were able to extract all known de novo breakpoint junctions for Patient1, even at relatively low coverage,” the scientists report. For the second patient, the sequence data revealed more complexity for many breakpoint junctions. “We observed that 33.3% of the high confidence set of SVs observed in the Nanopore data could not be found in matching Illumina sequencing data, despite the use of six different variant calling methods,” they add.
The authors note that “these results highlight the feasibility to sequence clinical human samples in real-time on a low-cost device.”