More and more scientists are using their PacBio systems for transcriptome studies, generating full-length isoforms with the Iso-Seq method. The number of novel transcripts discovered and the implications for alternative splicing are a not-so-subtle reminder that we still have a lot to learn about gene expression.
Full RNA transcripts have lengths up to 10 kb, with the largest proportion typically falling in the 3-5 kb range. Because SMRT sequencing can read a transcript from beginning to end, PacBio recommends binning the transcripts into four size ranges for comprehensive isoform surveys: 0.8-2 kb, 2-3 kb, 3-5 kb, and 5-10 kb.
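To make the binning concrete, here is a minimal Python sketch that assigns transcript lengths to the four size ranges listed above. It is only an illustration, not part of any PacBio or Sage Science software: the function names (`size_bin`, `bin_counts`) are hypothetical, and the handling of lengths that fall exactly on a boundary (placed in the upper bin) is an assumption.

```python
from bisect import bisect_right

# Bin edges in bases, taken from the four size ranges quoted in the text.
BIN_EDGES = [800, 2000, 3000, 5000, 10000]
BIN_LABELS = ["0.8-2 kb", "2-3 kb", "3-5 kb", "5-10 kb"]


def size_bin(length_bp):
    """Return the size-bin label for a transcript length, or None if outside 0.8-10 kb."""
    if length_bp < BIN_EDGES[0] or length_bp > BIN_EDGES[-1]:
        return None
    # Assumption: a length equal to an internal edge goes into the upper bin.
    idx = bisect_right(BIN_EDGES, length_bp) - 1
    # A length of exactly 10 kb would index past the last bin, so clamp it.
    return BIN_LABELS[min(idx, len(BIN_LABELS) - 1)]


def bin_counts(lengths):
    """Count how many transcript lengths fall into each of the four bins."""
    counts = {label: 0 for label in BIN_LABELS}
    for length in lengths:
        label = size_bin(length)
        if label is not None:
            counts[label] += 1
    return counts
```

Calling `bin_counts` on a list of transcript lengths in bases returns a dictionary keyed by the four bin labels. Lengths outside 0.8-10 kb are simply skipped.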
PacBio provides two template preparation protocols that feature our DNA size selection instruments: a BluePippin guide and a SageELF guide. The BluePippin collects a single size fraction from each of four samples per run, while the SageELF collects 12 contiguous size fractions from a sample and can process one or two samples per run.
Here are a few details to illustrate the differences between the platforms.
BluePippin
- Also validated for use with long-read and Roche/NimbleGen SeqCap template protocols
- Simplified workflow to collect one fraction from each sample
- Size cut-offs are more accurate and reproducible
- Requires >5 µg of starting DNA
SageELF
- Final library size bins overlap more continuously, improving bioinformatics analysis
- More user flexibility for combining the size bins
- Unused fractions can be recovered and saved
- Requires 3-5 µg of starting DNA
We’re pleased that these two platforms have been helpful in the Iso-Seq workflow. A newly released 10-40 kb fractionation protocol for the SageELF should make it even more useful for long-range pipelines.
If you’re a PacBio user interested in trying out the SageELF for Iso-Seq size selection, let us know.
The Sage Science R&D team has been hard at work on our newest tool, to be released later this year. The HLS platform, which we first described at the Festival of Genomics meeting in Boston last year, is our answer to the growing need to generate high molecular weight DNA fragments directly from blood or cell suspensions for long-range sequencing.
As the sequencing community shifts its focus from short-read to long-range information, whether from single-molecule long reads or synthetic long reads, the pressure is on for sample prep processes to adjust accordingly. Sample prep pipelines that work for 200-base fragments simply can’t scale to handle 50 kb fragments. We believe that new approaches are needed to enable workflows with high molecular weight DNA, and that’s where the HLS platform comes in.
Here’s how it works: we load samples into a gel, where we perform cell lysis, enzyme processing, and contaminant removal. Thanks to electrophoresis, this all moves much faster than it would on a regular gel, and the megabase-scale DNA is large enough to be stuck in the agarose. After purification, the DNA is lightly cleaved, allowing it to be retrieved from the gel in an automated elution process.
To see how the HLS prototype performs, check out this poster describing an experiment with human cultured cells and goat whole blood. DNA fragments extracted with the HLS were often tens of kilobases, or even megabases, long.
They may make subway riders shudder, but New York City mice are the stars of a cool new paper outlining their evolutionary history during a rapid period of urbanization. Scientists from the City University of New York used ddRAD-seq to genotype 23 populations of white-footed mice, Peromyscus leucopus, in the metropolitan area.
The publication, “Urbanization shapes the demographic history of a native rodent (the white-footed mouse, Peromyscus leucopus) in New York City,” came out in the Biology Letters journal of the Royal Society (subscription required, or check out this preprint). Lead author Stephen Harris and collaborators used the genotyping results to explore population dynamics and genomic diversity in these mice. “This study is the first to examine the impact of urbanization on demographic history using patterns of genomic variation in wild populations,” the authors write.
They found that urbanization had a major role in shaping the evolution of P. leucopus, as did climate change. “We detected bottlenecks immediately after isolation of urban populations, suggesting that a small remnant population within these parks at the time of the bottleneck provided most of the urban genetic variation found today,” Harris et al. report. In addition, they found that white-footed mice colonized Long Island shortly after the retreat of a glacier that covered it some 21,000 years ago, and that those populations were then isolated after a rising sea level separated the region from the mainland.
The team analyzed nearly 15,000 SNPs from more than 190 mice. “We assigned individuals to evolutionary clusters and then inferred recent divergence times, population size changes and migration,” the authors write, noting that population divergence timing maps closely to the urbanization pattern in New York. The team used Pippin Prep to perform automated size selection, an important step in the ddRAD-seq protocol.
“Our results show that geography, geological events and human-driven habitat change have left a detectable genomic signature in NYC’s white-footed mouse populations,” the scientists conclude.
We began shipping the high-throughput version of our automated Pippin DNA size selection platform last year, and it’s a thrill to see what we believe is the first reference to it in a peer-reviewed publication.
A team of scientists from Huazhong Agricultural University in China recently published “Multi-omics maps of cotton fibre reveal epigenetic basis for staged single-cell differentiation” in Nucleic Acids Research. In the paper, lead author Maojun Wang and colleagues tracked epigenetic modifications during development of the cotton fiber.
Through extensive testing, they found that one type of methylation increased over time, while a second type of methylation decreased, and that the same changes were not seen in nearby tissue. In addition, “integrated multi-omics analyses revealed that dynamic DNA methylation played a role in the regulation of lipid biosynthesis and spatio-temporal modulation of reactive oxygen species during fibre differentiation,” the scientists report.
They used PippinHT during MNase digestion of chromatin, purifying the digested samples and then size-selecting for 100 bp – 200 bp fragments. Those were later run in MNase-seq and ChIP-seq pipelines on an Illumina HiSeq instrument.
Congrats to the authors of this publication for their very cool epigenetic findings, and also to PippinHT for making it into the scientific literature!
It’s been a few years since the Hoekstra lab at Harvard first published its double-digest RAD-seq protocol. Since then, the approach has been rapidly adopted by the community for massively parallel genotyping, particularly of non-model organisms, and has been the foundation for lots of new protocol and tool development.
ddRAD-seq was itself an iteration on the original RAD-seq (restriction-site associated DNA sequencing) method from the Cresko lab at the University of Oregon. It introduced a second digest step and relied on automated size selection with the Pippin Prep to generate useful results.
In addition to the widespread use of ddRAD-seq for organisms ranging from fish to mosquitoes, the community has continued to develop and expand RAD-based protocols. The recently published hyRAD protocol adds a hybridization capture step to make the approach useful with degraded DNA samples, such as those found in museum collections. EpiRADseq swaps in a methylation-sensitive restriction enzyme, enabling scalable and cost-effective quantification of methylation across whole genomes. And SimRAD is a novel software tool designed to improve ddRAD-seq results by accurately estimating the number of loci generated. Meanwhile, scientists have optimized ddRAD-seq methods for Ion Torrent sequencing and reviewed best practices for RAD-based approaches in general. One team compared ddRAD-seq to sequence capture and found that the RAD-based method generated more data for less money, noting that it would be particularly valuable for organisms without existing genome resources. Another paper compared sequence data to SNP data from ddRAD-seq projects for phylogenetic inference, yielding advice for making phylogenetic trees from SNP data more accurate.
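For readers curious how the number of ddRAD-seq loci can be estimated before going to the bench, here is a rough Python sketch of an in silico double digest followed by size selection. It is not SimRAD's code or algorithm, just a toy version of the general idea. The function names are hypothetical, cut positions are taken at the start of each recognition site (ignoring the real cut offset and overhangs), and only fragments flanked by one site from each enzyme are kept.

```python
def find_sites(seq, motif):
    """Return the start position of every occurrence of motif in seq."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def double_digest(seq, motif_a, motif_b, min_len, max_len):
    """Return lengths of fragments bounded by one site of each enzyme
    that also fall inside the size-selection window [min_len, max_len]."""
    # Tag each cut position with the enzyme that produced it.
    cuts = [(pos, "A") for pos in find_sites(seq, motif_a)]
    cuts += [(pos, "B") for pos in find_sites(seq, motif_b)]
    cuts.sort()

    kept = []
    # Adjacent cut sites define the fragments of a complete digest.
    for (pos1, enz1), (pos2, enz2) in zip(cuts, cuts[1:]):
        length = pos2 - pos1
        # Keep only fragments with two different ends inside the window.
        if enz1 != enz2 and min_len <= length <= max_len:
            kept.append(length)
    return kept


# Toy sequence with EcoRI (GAATTC) and MspI (CCGG) recognition sites,
# chosen here purely for illustration.
toy = "GAATTC" + "A" * 94 + "CCGG" + "T" * 46 + "GAATTC" + "A" * 200 + "GAATTC"
loci = double_digest(toy, "GAATTC", "CCGG", 80, 120)
```

On a real genome, the length of the returned list gives a ballpark figure for how many loci a given enzyme pair and size window would yield. Narrowing or widening the window shows how sensitive that number is to size selection.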
For a look at some great findings from recent ddRAD-seq studies, check out these papers: