Scientists at the University of Oregon have published a new method to detect PCR and sequencing errors that should help other researchers track rare SNPs with greater accuracy. PELE-seq, which gets our vote for best new protocol name, can be used with ddRAD-seq, targeted amplicon sequencing, and many other genotyping methods.
The paper, “High-specificity detection of rare alleles with Paired-End Low Error Sequencing (PELE-Seq),” from lead author Jessica Preston and senior author Eric Johnson, came out in BMC Genomics. The scientists embarked on this project to reduce the error rate in NGS studies, which they peg at about 1% and say “leads to the generation of millions of sequencing errors in a single experiment.”
The team uses barcoded adapters as well as overlapping paired-end reads on size-selected DNA molecules to maximize accuracy. The barcoding process reduces false-positive SNP calls, while the overlapping reads reduce sequencing errors. The team used our Pippin Prep automated DNA sizing platform to collect tight DNA bands prior to paired-end sequencing on Illumina. Scientists tested the PELE-seq protocol on E. coli and Caenorhabditis remanei, finding improved specificity and sensitivity for accurately detecting rare variants.
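To make the overlapping-read idea concrete, here is a minimal Python sketch of how two fully overlapping mates can cross-check each other. This is our own illustration, not the authors' pipeline: the function names are hypothetical, the mates are assumed to overlap completely and to have equal length, and the back-of-the-envelope error estimate assumes independent errors spread evenly across the three possible wrong bases.

```python
# Illustrative sketch only; not the PELE-seq software.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def overlap_consensus(read1, read2):
    """Keep a base only where both mates agree; mask disagreements with N.

    Assumption: read2 is the mate as sequenced (opposite strand) and
    covers exactly the same molecule as read1.
    """
    mate = reverse_complement(read2)
    if len(mate) != len(read1):
        raise ValueError("this sketch assumes fully overlapping, equal-length mates")
    return "".join(a if a == b else "N" for a, b in zip(read1, mate))


def agreeing_error_rate(per_base_error):
    """Chance that both mates show the SAME wrong base at one position.

    Assumption: errors are independent and uniform over the 3 wrong bases.
    """
    return per_base_error ** 2 / 3


read1 = "ACGTTGCA"                        # one mate, with an error at position 5
read2 = reverse_complement("ACGTAGCA")    # other mate, read from the opposite strand
print(overlap_consensus(read1, read2))    # the disagreeing position is masked
print(agreeing_error_rate(0.01))
```

Under that simplified model, a 1% per-base error rate leaves roughly a 3-in-100,000 chance that both mates agree on the same wrong base.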
“We have demonstrated that the PELE-Seq method of variant calling is highly specific at detecting rare SNPs found at below 1% in a population,” the scientists write. “There were zero instances of false positive SNPs called from PELE-sequenced control E. coli libraries containing rare alleles present at known frequencies, whereas standard NGS DNA-Seq libraries contained 30–50% false-positive SNPs.”
Is it really possible to detect somatic structural variants accurately from a single sequencing read? A new protocol from scientists at the Albert Einstein College of Medicine in New York and Voronezh State University in Russia was designed to do just that.
In the Nature Methods paper entitled “Quantitative detection of low-abundance somatic structural variants in normal cells by high-throughput sequencing,” lead author Wilber Quispe-Tintaya, senior author Alexander Maslov, and collaborators describe a method called Structural Variant Search (SVS).
“The key feature of SVS is its ability to definitively call [a structural variant] using a single sequencing read that spans the breakpoint, without the need for multiple supporting reads,” the scientists report. The workflow relies on preparing a chimera-free library and on a new algorithm that calls structural variants without using consensus data. The variant caller uses a split-read method for identifying potential structural variants, filters out artifacts, and then separates somatic from germline variants.
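For readers new to split-read calling, the general idea can be sketched in a few lines of Python. This is a generic illustration and not the SVS algorithm: the `Segment` type, the `is_split_candidate` function, and the 50 bp gap threshold are all hypothetical, and the artifact-filtering and somatic-versus-germline steps the authors describe are not shown.

```python
# Generic split-read illustration; NOT the SVS variant caller.
from dataclasses import dataclass


@dataclass
class Segment:
    """One aligned piece of a single sequencing read (hypothetical type)."""
    chrom: str
    ref_start: int  # 0-based start on the reference
    ref_end: int    # end on the reference, exclusive


def is_split_candidate(first, second, max_gap=50):
    """Flag a read whose two consecutive pieces do not map next to each other.

    max_gap is an arbitrary illustrative threshold, not a published value.
    """
    if first.chrom != second.chrom:
        return True  # pieces land on different references, e.g. host and virus
    return abs(second.ref_start - first.ref_end) > max_gap


# A read that starts in a human chromosome and ends in a viral genome
host_part = Segment("chr8", 1000, 1060)
virus_part = Segment("HPV18", 10, 70)
print(is_split_candidate(host_part, virus_part))
```

A read that passes this kind of check is only a candidate; in practice it would still need the downstream filtering described above.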
They demonstrate the workflow on a cell line known to harbor integration events from human papillomavirus. SVS called 20 integration sites; 17 had previously been reported, and two of the three novel findings were confirmed by PCR testing. “Most likely these two HPV integration sites had not been detected previously because of their low abundance, underscoring the unique capability of SVS to detect low-frequency [structural variants],” the authors note.
The team’s library prep procedure included size-selection on a PippinHT instrument, after which the samples were sequenced using the Ion Torrent Proton platform.
We’re pleased to report that 10x Genomics has released a new sample prep protocol for its Chromium platform that includes the BluePippin and PippinHT size selection platforms from Sage Science.
10x Genomics has gotten a lot of attention in recent years for its impressive ability to generate long-range information from short-read sequencing data, filling a major need in the scientific community. Scientists with Illumina pipelines can easily add the 10x instrument to generate another dimension of data that’s especially useful for alignment and assembly.
The 10x instrument delivers the best results when it’s working from high molecular weight DNA (at least 50 Kb), with benefits including longer haplotype blocks and enhanced ability to call structural variants. In the new protocol, 10x recommends using the BluePippin or PippinHT platforms to remove short DNA fragments from the library prior to using the Chromium system. For lower-quality DNA samples, this 10x document guides users through removing smaller genomic DNA molecules, with protocols for building >20 Kb and >40 Kb libraries.
Scientists in China and the UK recently published an open-access optimized protocol for RAD-seq in the journal Theoretical and Applied Genetics. The method is targeted at large studies of plants and enables users to specify sequence coverage parameters.
The paper, “A highly robust and optimized sequence-based approach for genetic polymorphism discovery and genotyping in large plant populations,” from lead author Ning Jiang and collaborators, offers a step-by-step protocol. “This optimized approach provides both a computational tool and a library construction protocol, which can maximize the number of genomic sequence reads that uniformly cover a plant genome and minimize the number of sequence reads representing chloroplast DNA and rRNA genes,” the scientists write.
The challenge with using existing RAD-seq protocols for plants, according to the authors, is that chloroplast and rRNA genes can account for the majority of sequence reads in an experiment if scientists don’t adjust for them, making this process inefficient for plant population genotyping.
In the new protocol, the team employed two size selection steps using the Pippin Prep. The workflow looks like this: digestion; ligating barcoded adapters; Pippin Prep sizing; more digestion; PCR amplification; and another size-selection step. (For details, check out this workflow graphic.)
The team validated the method through analysis of six sequencing libraries “for parental lines and their segregating offspring of both diploid and tetraploid Arabidopsis and potato,” they report. They saw balanced sequence representation across the samples. “Sequence data from the optimized RAD-seq experiments shows that the undesirable chloroplast and rRNA contributed sequence reads can be controlled at 3–10 %,” they note.
For pooling, the scientists recommend a maximum of 12 samples per sequencing library to reduce the variation in number of sequence reads per plant.
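As a quick planning aid, the 12-sample ceiling translates into a simple calculation. The Python sketch below is our own illustration rather than anything from the paper: the function names are hypothetical, and spreading samples evenly across libraries is our assumption about how one might keep pool sizes similar.

```python
# Illustrative pooling helper; the 12-sample cap comes from the paper,
# the even-split strategy is our assumption.
import math


def libraries_needed(n_samples, max_per_library=12):
    """Smallest number of libraries that keeps every pool at or under the cap."""
    if max_per_library <= 0:
        raise ValueError("max_per_library must be positive")
    return math.ceil(n_samples / max_per_library)


def assign_pools(sample_ids, max_per_library=12):
    """Spread samples as evenly as possible across the minimum number of libraries."""
    n_libraries = libraries_needed(len(sample_ids), max_per_library)
    # Round-robin slicing keeps pool sizes within one sample of each other.
    return [sample_ids[i::n_libraries] for i in range(n_libraries)]


samples = [f"plant_{i:02d}" for i in range(1, 31)]  # 30 hypothetical plants
pools = assign_pools(samples)
print(len(pools), [len(pool) for pool in pools])
```

Thirty plants, for example, would be split into three libraries of ten rather than two full pools of 12 and a small pool of six.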
Mendelspod has turned out another terrific podcast, this one with Kari Stefansson, and we’re proud to have sponsored the thought-provoking discussion.
As most people in the field know, Stefansson earned his fame as founder of deCODE Genetics, which has spent 20 years analyzing the genetics of the Icelandic population. Now part of Amgen, the team continues to churn out publications characterizing genomic variation.
Stefansson spoke with Mendelspod host Theral Timpson about the value of studying an island population, which has a pronounced founder effect that has left many Icelanders with genetic variants that are quite rare in other populations. These variants have been associated with increased risk for Alzheimer’s disease, various forms of cancer, heart attacks, and more. There are also some protective variants, such as a heart-protective gene that has become the focus of a drug discovery program at Amgen.
We were especially interested in Stefansson’s stand on the tug-of-war between the societal value of genetic information and each person’s right to privacy. He points out that advances in medicine have come from the generosity of previous patients who shared their medical data, suggesting that keeping this information private may be “antisocial.” Later in the discussion, Stefansson says that he’s been lobbying Icelandic leaders to let him identify everyone with a particularly high-risk BRCA mutation from the national genetic database so these people can be contacted and given treatment options, but opponents argue that this violates a person’s right not to know such information.
Looking forward, Stefansson said that the human brain is “the last frontier in biology” and that we have a long way to go to understand how our brains make us who we are, how they define our species, how they trigger emotions, and more. His team is combining genetic studies with cognitive testing to better understand this organ. Early findings have demonstrated that a variant linked to schizophrenia risk is also associated with creative thinking.
For all that and much more, check out the full podcast!