The genomics core facility at the University of Delaware has set itself apart from other service providers by being among the first to adopt new sequencing technologies. The strategy has been a success: today, the facility serves customers around the world, hailing from research and nonprofit institutes, federal agencies, and even foreign governments. While projects range from microbial to human and everything in between, agrigenomic studies are especially popular for users looking to improve growth and disease resistance among crops and livestock.
Bruce Kingham, who runs the genomics core lab, has also focused on adopting state-of-the-art tools to keep the sequencers running happily. Size selection has been essential for delivering optimal results to his user base, from the facility’s first Illumina NGS platform in 2007 to the PacBio single-molecule sequencing system. It was the acquisition of the Illumina GA that spurred his team to offer library prep as a service, for which they invested in the Pippin Prep for automated DNA size selection. “That allowed us to not only get a very focused size for the libraries that we were preparing, but more importantly it allowed us to start with a much smaller quantity of DNA,” he says. Prior techniques relied on inefficient fragmentation procedures and gel extraction to isolate the desired fragment size, resulting in a great deal of undesirable sample loss.
Today, Pippin sizing — now with BluePippin — continues to be important for Kingham’s Illumina workflow, including PCR-free projects. “Size selection has been critical because the PCR-free library preparation process can be prone to generating libraries that have a broader size range,” he says. “Illumina technology for a number of reasons does not like libraries that are broad in size.” From clustering efficiency to optical analysis, these sequencers perform best when fed libraries with tightly sized DNA fragments. For Illumina sequencing in general, Kingham says, “downstream analysis, including mapping or de novo assembly, is going to be more efficient and have more statistical significance if the size range of individual libraries is focused.”
For PacBio sequencing, Kingham’s team uses both BluePippin and SageELF for size selection. Because the BluePippin is so useful for eliminating small fragments and keeping the PacBio platform focused on generating the longest reads possible, it dramatically improves the quality of results. “With the volume of sequencing that we do, the BluePippin paid for itself in a couple of months,” Kingham says. By increasing average read length and N50 read length, BluePippin “lowers the cost of the data that needs to be generated to achieve a certain sequencing goal, such as the lowest number of contigs,” he adds. The lab uses SageELF for Iso-Seq protocols, where it significantly reduces the amount of input DNA required.
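For readers less familiar with the N50 read length metric Kingham mentions, here is a minimal sketch of how it can be computed and how removing short fragments shifts it. The read lengths and the 6 kb cutoff are hypothetical values chosen for illustration, not data from the Delaware facility.

```python
def read_n50(lengths):
    """Return the N50: the length L such that reads of length >= L
    together contain at least half of all sequenced bases."""
    if not lengths:
        raise ValueError("no read lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical library: many short fragments plus a few long ones (bp).
unselected = [1000] * 20 + [2000] * 10 + [8000, 10000, 12000, 20000]

# Assumed cutoff for illustration: drop everything under 6 kb.
size_selected = [n for n in unselected if n >= 6000]

n50_before = read_n50(unselected)      # 8000
n50_after = read_n50(size_selected)    # 12000
print(n50_before, n50_after)
```

In this toy example, the short fragments account for nearly half of the bases, so removing them raises the N50 from 8 kb to 12 kb.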
Looking ahead, Kingham sees increased demand from scientists for pairing genomics and proteomics data. It’s a trend that fits nicely at his home institute, which has a mission of promoting interdisciplinary research. To that end, his team has already begun evaluating the SageELF for use in protein fractionation. “That could be a welcome service, and I’m always looking for new services to provide,” Kingham says. “I want to see my instruments running as much as possible.”
Scientists at the University of Oregon have published a new method to detect PCR and sequencing errors that should help other researchers track rare SNPs with greater accuracy. PELE-seq, which gets our vote for best new protocol name, can be used with ddRAD-seq, targeted amplicon sequencing, and many other genotyping methods.
From lead author Jessica Preston and senior author Eric Johnson, “High-specificity detection of rare alleles with Paired-End Low Error Sequencing (PELE-Seq)” came out in BMC Genomics. The scientists embarked on this project to reduce the current error rate in NGS studies, which they peg at about 1% and say “leads to the generation of millions of sequencing errors in a single experiment.”
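To see how an error rate of about 1% adds up to millions of errors, a quick back-of-the-envelope calculation helps. The run size below (10 million reads of 100 bases each) is a hypothetical figure for illustration, not a number from the paper.

```python
def expected_errors(num_reads, read_length, error_rate):
    """Expected number of miscalled bases in a run, assuming every base
    has the same independent chance of being wrong."""
    total_bases = num_reads * read_length
    return round(total_bases * error_rate)


# Hypothetical run: 10 million reads, 100 bases each, 1% per-base error rate.
errors = expected_errors(num_reads=10_000_000, read_length=100, error_rate=0.01)
print(errors)  # 10000000
```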
The team uses barcoded adapters as well as overlapping paired-end reads on size-selected DNA molecules to maximize accuracy. The barcoding process reduces false-positive SNP calls, while the overlapping reads reduce sequencing errors. The team used our Pippin Prep automated DNA sizing platform to collect tight DNA bands prior to paired-end sequencing on Illumina. Scientists tested the PELE-seq protocol on E. coli and Caenorhabditis remanei, finding improved specificity and sensitivity for accurately detecting rare variants.
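The idea behind overlapping paired-end reads can be sketched in a few lines: when both mates cover the same bases, any position where they disagree is suspect. The code below is a simplified illustration, not the PELE-Seq software. It assumes the two reads overlap completely and have equal length, and the function names and example sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def overlap_consensus(read1, read2):
    """Combine two fully overlapping mates into one sequence.

    read2 is given as sequenced (from the opposite strand), so it is
    reverse-complemented first. Positions where the mates disagree are
    masked with 'N' instead of being trusted.
    """
    mate = reverse_complement(read2)
    if len(mate) != len(read1):
        raise ValueError("this sketch assumes fully overlapping reads of equal length")
    return "".join(a if a == b else "N" for a, b in zip(read1, mate))


# Hypothetical fragment ACGGTTCA; read1 carries one miscalled base (T -> A).
consensus = overlap_consensus("ACGGTACA", "TGAACCGT")
print(consensus)  # ACGGTNCA
```

Tight size selection matters here because the fragment length determines how much of each molecule the two mates cover in common.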
“We have demonstrated that the PELE-Seq method of variant calling is highly specific at detecting rare SNPs found at below 1% in a population,” the scientists write. “There were zero instances of false positive SNPs called from PELE-sequenced control E. coli libraries containing rare alleles present at known frequencies, whereas standard NGS DNA-Seq libraries contained 30–50% false-positive SNPs.”
Is it really possible to detect somatic structural variants accurately from a single sequencing read? A new protocol from scientists at the Albert Einstein College of Medicine in New York and Voronezh State University in Russia was designed to do just that.
In the Nature Methods paper entitled “Quantitative detection of low-abundance somatic structural variants in normal cells by high-throughput sequencing,” lead author Wilber Quispe-Tintaya, senior author Alexander Maslov, and collaborators describe a method called Structural Variant Search (SVS).
“The key feature of SVS is its ability to definitively call [a structural variant] using a single sequencing read that spans the breakpoint, without the need for multiple supporting reads,” the scientists report. The workflow relies on preparing a chimera-free library and on a new algorithm that calls structural variants without using consensus data. The variant caller uses a split-read method for identifying potential structural variants, filters out artifacts, and then separates somatic from germline variants.
They demonstrate the workflow on a cell line known to harbor integration events from human papillomavirus. SVS called 20 integration sites; 17 had previously been reported, and two of the three novel findings were confirmed by PCR testing. “Most likely these two HPV integration sites had not been detected previously because of their low abundance, underscoring the unique capability of SVS to detect low-frequency [structural variants],” the authors note.
The team’s library prep procedure included size-selection on a PippinHT instrument, after which the samples were sequenced using the Ion Torrent Proton platform.
We’re pleased to report that 10x Genomics has released a new sample prep protocol for its Chromium platform that includes the BluePippin and PippinHT size selection platforms from Sage Science.
10x Genomics has gotten a lot of attention in recent years for its impressive ability to generate long-range information from short-read sequencing data, filling a major need in the scientific community. Scientists with Illumina pipelines can easily add the 10x instrument to generate another dimension of data that’s especially useful for alignment and assembly.
The 10x instrument delivers best results when it’s working from high molecular weight DNA (at least 50 Kb), with benefits including longer haplotype blocks and enhanced ability to call structural variants. In the new protocol, 10x recommends using the BluePippin or PippinHT platforms to remove short DNA fragments from the library prior to using the Chromium system. For lower-quality DNA samples, this 10x document guides users to remove smaller genomic DNA molecules, with protocols for building >20 Kb and >40 Kb libraries.
Scientists in China and the UK recently published an open-access optimized protocol for RAD-seq in the journal Theoretical and Applied Genetics. The method is targeted at large studies of plants and enables users to specify sequence coverage parameters.
From lead author Ning Jiang and collaborators, “A highly robust and optimized sequence-based approach for genetic polymorphism discovery and genotyping in large plant populations” offers a step-by-step protocol. “This optimized approach provides both a computational tool and a library construction protocol, which can maximize the number of genomic sequence reads that uniformly cover a plant genome and minimize the number of sequence reads representing chloroplast DNA and rRNA genes,” the scientists write.
The challenge with using existing RAD-seq protocols for plants, according to the authors, is that chloroplast and rRNA genes can account for the majority of sequence reads in an experiment if scientists don’t adjust for them, making this process inefficient for plant population genotyping.
In the new protocol, the team employed two size selection steps using the Pippin Prep. The workflow looks like this: digestion; ligating barcoded adapters; Pippin Prep sizing; more digestion; PCR amplification; and another size-selection step.
The team validated the method through analysis of six sequencing libraries “for parental lines and their segregating offspring of both diploid and tetraploid Arabidopsis and potato,” they report. They saw balanced sequence representation across the samples. “Sequence data from the optimized RAD-seq experiments shows that the undesirable chloroplast and rRNA contributed sequence reads can be controlled at 3–10 %,” they note.
For pooling, the scientists recommend a maximum of 12 samples per sequencing library to reduce the variation in number of sequence reads per plant.
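For anyone planning a larger population, the 12-sample ceiling translates into a simple pooling plan. The sketch below is one way to do it, not the authors' tool: it splits samples into the fewest libraries that respect the limit and keeps the libraries close to equal in size. The balancing step is our own assumption, and the function and sample names are hypothetical.

```python
import math


def plan_pools(sample_ids, max_per_library=12):
    """Split samples into the fewest libraries that respect the limit,
    keeping library sizes within one sample of each other."""
    if max_per_library < 1:
        raise ValueError("max_per_library must be at least 1")
    n = len(sample_ids)
    if n == 0:
        return []
    n_libraries = math.ceil(n / max_per_library)
    base, extra = divmod(n, n_libraries)
    pools, start = [], 0
    for i in range(n_libraries):
        # The first `extra` libraries take one additional sample.
        size = base + (1 if i < extra else 0)
        pools.append(sample_ids[start:start + size])
        start += size
    return pools


# Hypothetical population of 25 plants.
samples = [f"plant_{i:02d}" for i in range(1, 26)]
pools = plan_pools(samples)
print([len(p) for p in pools])  # [9, 8, 8]
```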