At Creighton University in Omaha, Neb., Dr. Anna Selmecki’s lab explores various fungal species to understand genome instability, pathogenesis, and the acquisition of drug resistance. For these investigations, her team relies heavily on whole genome sequencing, using both the Illumina MiSeq platform and Oxford Nanopore sequencers.
However, Selmecki and her team encountered two major obstacles with their library preparation pipeline. A bead-based size-selection step was decreasing their yield, and even with size selection, the MiSeq was still generating very short reads. Using AMPure magnetic beads for sizing, “we always found that we lost a huge percentage of the library,” Selmecki recalls. Even when a Bioanalyzer reported that the library fragment size was in the desired range, sequencing reads were consistently shorter than expected.
While both problems stemmed from the sizing step, switching to commonly used manual gel excision was not an option. “From previous experience, I knew that cutting bands out of a gel is horrible and you still lose a lot of your library that way,” Selmecki says. She remembered from her days at the Dana-Farber Cancer Institute that colleagues had raved about an automated size selection instrument from Sage Science.
So Selmecki brought in the BluePippin sizing platform and solved both problems. Recovery is significantly better, and more precise size selection removes the small fragments that had been leading to shorter-than-anticipated MiSeq reads. “The Pippin cleaned that up a lot, ensuring that we’re only amplifying pieces that are much larger,” she says. Using BluePippin for size selection followed by bead-based purification, Selmecki and her team can easily select for insert sizes of 600 bp to 1.2 kb for their paired-end sequencing pipeline. “We found we got better coverage across the genome,” she adds.
Selmecki’s team is already planning to expand the use of its BluePippin instrument to other molecular biology techniques, such as molecular cloning and library preparations for Oxford Nanopore sequencing. “We’re just doing everything on the Pippin,” she says.
“If people are noticing really uneven coverage across their genomes or they’re having trouble with yield during their library prep, I would recommend considering the Pippin,” Selmecki says.
If you haven’t heard about CATCH by now, it’s time to catch up. Short for Cas9-assisted targeting of chromosome segments, CATCH comes from the lab of Yuval Ebenstein at Tel Aviv University and was first reported in a Nature Communications paper.
Like so many scientists, Ebenstein found himself routinely having to generate whole-genome sequence data in order to study a single region that was too large to amplify easily with PCR. “You end up paying for all this data and eventually using a very small fraction of it,” he recalls. While there are several target-capture and enrichment methods, they all require knowledge of the sequence of interest. For Ebenstein, who was interested in highly repetitive DNA, those methods didn’t work.
He cast about for a new approach, and found inspiration in the burgeoning CRISPR field. “We came up with this idea that you can cut the flanking region with Cas9 and then use gel electrophoresis to extract only the fragment that you’re looking for,” he says. The method uses RNA-guided Cas9 to make two cuts that excise the specific region of interest, followed by a size-separation step to remove off-target fragments. It’s geared toward genomic regions that are 50 kb or larger. Together with Ting Zhu and Chunbo Lou from Tsinghua University, the team began generating custom BACs by combining CATCH with Gibson assembly to cut the desired piece of DNA and clone it into a vector in a streamlined process.
Since then, Ebenstein’s lab and many others using CATCH have been broadening the range of applications. It’s particularly attractive for third-gen sequencing platforms; because they typically have lower throughput, “it’s especially beneficial to only probe what you’re interested in and not waste your sequencing depth on regions that are not of interest,” he says. “This is the power of CATCH: no matter how complex the region or what structural variations are in it, if you know the flanking region, you can fish it out and analyze it.”
An early drawback with the CATCH protocol was its use of gel electrophoresis, which Ebenstein refers to as “a prehistoric technology.” Size selection is essential for the method, but users must perform the very cumbersome pulsed-field gel electrophoresis technique. That’s where the SageHLS instrument came in. “Sage basically eliminates all of that,” Ebenstein says. The automated platform handles everything inside the gel, and collects size fractions without needing a visible band. “The recovery is phenomenal,” he adds. “You can use a very low amount of starting material and you still get a meaningful amount of DNA for further analysis.”
The protocol for using the SageHLS instrument with CATCH (something we refer to as HLS-CATCH) is still undergoing optimization, with Ebenstein’s team putting the new platform through its paces.
In the meantime, the community continues to push ahead with CATCH. It is already in development in several labs for studies of plants, which have highly repetitive DNA. Ebenstein and others are working to make the protocol robust for use in human genetics as well, targeting important genes such as BRCA1 and BRCA2. He says that the SageHLS instrument will likely be an important factor in those efforts.
How can you tell if CATCH is right for you? Ebenstein has a simple rule: “If you can PCR it, PCR it,” he says. “If you can’t, then you probably need CATCH if you don’t want to go bankrupt.”
Today is DNA Day, and we’re taking the opportunity to support the humane treatment of DNA. After all these years of harshly shearing these molecules and fragmenting them down to just a few hundred bases, can’t we agree that there are nicer ways to treat them? (Yeah, we know that for some applications, you really do need teeny tiny pieces of DNA. We get it.)
For many applications — particularly long-read sequencing and long-range technologies such as optical mapping — it’s actually better to leave DNA as intact as possible. Just a little gentle cleaving, and you wind up with extremely long DNA fragments that produce optimal results. By preserving these molecules as much as possible, we can detect large structural variants, phase distant SNPs, accurately count copy numbers, and much more.
Large input DNA is responsible for major advances in genomics, such as the most contiguous assemblies yet for humans and other mammals. These reference-grade assemblies have been tremendously useful for filling in blanks left by previous sequencing attempts using short reads, allowing scientists to discover new genomic elements — including entire genes — that had been missed with other approaches.
High molecular weight libraries are also being used for newer interrogations of the regions of DNA that touch when the molecule is folded in the nucleus. After decades of only studying DNA in linear order, we’re getting amazing new insights from approaches like proximity ligation mapping. Discoveries like this tell us that DNA is probably harboring even more fascinating secrets, and we just need to find the right ways of asking questions.
And perhaps it all begins with better treatment of your DNA molecules! We hope you’re celebrating DNA Day today. From all of us at Sage, happy HMW DNA to you!
Last week’s annual meeting of the American Association for Cancer Research offered some great perspectives on innovation in oncology, both in the clinic and in academic labs.
We were particularly impressed by former Vice President Joe Biden’s update on the Cancer Moonshot initiative, which he characterized as a bright spot in bringing people together and pushing research forward. For example, Amazon offered to host some extremely large cancer databases, and in less than a year the information has been accessed 80 million times. Biden’s optimism about the effort turned to frustration with the new administration’s proposed cuts to research funding. He said the “Draconian cuts” would be a massive setback, though he expressed doubt that the proposed budget would pass Congress.
The association also gave out some prestigious awards, such as the AACR Award for Lifetime Achievement in Cancer Research to Mina Bissell. The Lawrence Berkeley National Laboratory scientist has been a pioneer in breast cancer research, delivering some of the earliest findings that cells lose their native expression patterns when cultured in different conditions (she also discovered that cells can “remember” their original profile when native microenvironment conditions are restored). As strong supporters of improved sample prep, we see Bissell as a champion for the kind of reproducible research practices that are essential to life science.
One of the most exciting technical advances came from a team at Johns Hopkins University, where scientists developed an error-correction method for NGS results from liquid biopsies targeting cell-free DNA. The approach boosts accuracy with ultra-high-coverage sequencing. We’re excited about this work because it dovetails nicely with our new SageHLS instrument for purification of extremely large DNA molecules, such as entire genes associated with cancer. In beta tests, scientists have successfully purified the BRCA1 and BRCA2 genes using the platform with the CATCH method (Cas9-assisted targeting of chromosome segments).
We congratulate all the scientists who presented at AACR and made such a strong showing for the terrific recent advances in cancer research!
Chris Boles is Chief Scientific Officer of Sage Science, where he’s been helping the R&D team develop the new SageHLS (that’s short for HMW Library System), a platform designed to rapidly purify high molecular weight DNA directly from samples. We caught up with him to learn more about it.
Q: What’s so important about having high molecular weight DNA?
A: Working with extremely long DNA has become a lost art in the life science community. Back in the early days of the Human Genome Project, every lab had to think about this as they worked with recombinant BACs, fosmids, Southern blots, and so on. But beginning with PCR in the early ’90s and continuing with short-read NGS since the early 2000s, life scientists have had the tools to do amazing things without the need for HMW DNA. Now, researchers are tackling repetitive genomic regions, long-range structural variation, and long-range phasing, and the need for high-quality, high molecular weight DNA has resurfaced.
Q: What are some of the long-range technologies that the SageHLS platform could be used with?
A: Really any system that requires DNA that is hundreds of kilobases to megabases in size. These include long-read sequencing platforms like PacBio or Oxford Nanopore, optical mapping technologies such as those from Bionano Genomics or Genomic Vision, and other long-range linkage analysis methods like the ones from 10x Genomics or Dovetail Genomics. The SageHLS can improve input DNA quality and size for all of these systems.
Q: What kind of sample prep is required before loading the SageHLS?
A: In general, very little. We have focused initially on sample types that 1) are important for biomedical research, and 2) work well in SageHLS. These include white blood cells from whole blood, tissue culture cells, or bacterial cultures. For these sample types, only a few brief centrifugation steps are necessary to wash the cells and resuspend them in an isotonic gel loading buffer. The cassette reagents do the hard work of lysis and purification without mixing or shearing the HMW DNA.
Q: How does the new system work?
A: Users load their samples into a gel. Then the platform automatically performs cell lysis and contaminant removal. This happens very quickly, leaving megabase-sized DNA stuck in the agarose. Next, the DNA is lightly cleaved with a non-specific nuclease and retrieved from the gel through an automated elution process.
Q: What kind of results have you gotten from the SageHLS internally?
A: From mammalian WBC and tissue culture cells, we routinely obtain DNA ranging in size from 200 kilobases to 2 megabases. From input cell loads containing about 10 µg of DNA (about 1.5 million human cells), we recover 1 to 3 µg of DNA of this size, which is sufficient for even DNA-hungry applications like optical mapping.
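For readers who want to sanity-check those figures, here is a minimal Python sketch of the arithmetic. It assumes roughly 6.6 pg of DNA per diploid human cell, a commonly cited approximation that does not come from the interview, and the function names are purely illustrative.

```python
# Back-of-envelope check of the input and recovery figures quoted above.
# Assumption (not from the interview): a diploid human cell carries
# roughly 6.6 pg of genomic DNA.
PG_DNA_PER_DIPLOID_HUMAN_CELL = 6.6


def input_dna_ug(cell_count, pg_per_cell=PG_DNA_PER_DIPLOID_HUMAN_CELL):
    """Estimate total genomic DNA (micrograms) in a given number of cells."""
    return cell_count * pg_per_cell / 1e6  # 1 microgram = 1e6 picograms


def recovery_fraction(recovered_ug, loaded_ug):
    """Fraction of the loaded DNA that comes back out of the cassette."""
    return recovered_ug / loaded_ug


loaded = input_dna_ug(1_500_000)        # cell count quoted in the interview
low = recovery_fraction(1.0, 10.0)      # 1 ug recovered from a 10 ug load
high = recovery_fraction(3.0, 10.0)     # 3 ug recovered from a 10 ug load
print(f"{loaded:.1f} ug loaded; recovery {low:.0%} to {high:.0%}")
```

Under that assumption, 1.5 million cells works out to about 9.9 µg of input DNA, in line with the "about 10 µg" quoted, and recovering 1 to 3 µg corresponds to roughly 10% to 30% of the load.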
Q: You’re already working on new uses for the SageHLS platform. Can you give us a sneak peek?
A: There are several improvements that should happen fairly quickly after launch. Next up is a process we call HLS-CATCH, which uses CRISPR/Cas9 technology to excise and isolate a genomic fragment of interest in a targeted fashion. We’re also working on several methods for making NGS libraries directly in the HLS cassettes so that we can integrate DNA extraction directly with NGS library construction. It will be interesting to learn from customers what else they want to do with the system.
Check out the SageHLS product page to learn more about the platform.