Citations

The haplotype-resolved T2T carnation (Dianthus caryophyllus) genome reveals the correlation between genome architecture and gene expression

November 2023

Authors:
Lan Lan, Luhong Leng, Weichao, Yonglin Ren, Wayne Reeve, Xiaopeng Fu, Zhiqiang Wu, Xiaoni Zhang
Abstract:
“Carnation (Dianthus caryophyllus) is one of the most valuable commercial flowers, due to its richness of colour and form, and its excellent storage and vase life. The diverse demands of the market require faster breeding in carnations. A full understanding of carnations is therefore required to guide the direction of breeding. Hence, we assembled the haplotype-resolved gap-free carnation genome of a variety ‘Baltico’ which is the most common white standard variety worldwide. Based on the high-depth HiFi, ultra-long nanopore and Hi-C sequencing data, we assembled the telomere-to-telomere (T2T) genomes to be 564,479,117 and 568,266,215 bp, for the two haplotypes Hap1 and Hap2, respectively. This T2T genome exhibited great improvement in genome assembly and annotation results compared with the former version. The improvements were seen when different approaches to evaluation were used. Our T2T genome first informs the analysis of the telomere and centromere region, enabling us to speculate about the specific centromere characteristics that cannot be identified by high order repeats in carnations. We analyzed the allele-specific expression in three tissues and the relationship between the genome architecture and gene expression in the haplotypes. This demonstrated that the length of the genes, CDS, introns, the exon numbers and the transposable elements insertions correlate with gene expression ratios and levels. The insertions of transposable elements repress expression in gene regulatory networks in carnation. This gap-free finished T2T carnation genome provides a valuable resource to illustrate the genome characteristics and functional genomics analysis for further studies and molecular breeding.”

Sage Science Products:
The SageHLS instrument was used to size select ultra-high-molecular-weight DNA for Oxford Nanopore PromethION sequencing (Genome Center of Grandomics, Wuhan, China).

Author Affiliations:

Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Chinese Academy of Agricultural Sciences, Shenzhen, China

College of Science, Health, Engineering and Education, Murdoch University, Western Australia

Kunpeng Institute of Modern Agriculture at Foshan, Shenzhen Branch, Chinese Academy of Agricultural Sciences, Shenzhen, China

Key Laboratory of Horticultural Plant Biology, College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, China

Preprint – Horticulture Research
DOI: 10.1093/hr/uhad244


The draft genome sequence of the Japanese rhinoceros beetle Trypoxylus dichotomus septentrionalis towards an understanding of horn formation

May 2023
Authors:
Shinichi Morita, Tomoko F. Shibata, Tomoaki Nishiyama, Yuuki Kobayashi, Katsushi Yamaguchi, Kouhei Toga, Takahiro Ohde, Hiroki Gotoh, Takaaki Kojima, Jesse N. Weber, Marco Salvemini, Takahiro Bino, Mutsuki Mase, Moe Nakata, Tomoko Mori, Shogo Mori, Richard Cornette, Kazuki Sakura, Laura C. Lavine, Douglas J. Emlen, Teruyuki Niimi and Shuji Shigenobu

Abstract:
“The Japanese rhinoceros beetle Trypoxylus dichotomus is a giant beetle with distinctive exaggerated horns present on the head and prothoracic regions of the male. T. dichotomus has been used as a research model in various fields such as evolutionary developmental biology, ecology, ethology, biomimetics, and drug discovery. In this study, de novo assembly of 615 Mb, representing 80% of the genome estimated by flow cytometry, was obtained using the 10 × Chromium platform. The scaffold N50 length of the genome assembly was 8.02 Mb, with repetitive elements predicted to comprise 49.5% of the assembly. In total, 23,987 protein-coding genes were predicted in the genome. In addition, de novo assembly of the mitochondrial genome yielded a contig of 20,217 bp. We also analyzed the transcriptome by generating 16 RNA-seq libraries from a variety of tissues of both sexes and developmental stages, which allowed us to identify 13 co-expressed gene modules. We focused on the genes related to horn formation and obtained new insights into the evolution of the gene repertoire and sexual dimorphism as exemplified by the sex-specific splicing pattern of the doublesex gene. This genomic information will be an excellent resource for further functional and evolutionary analyses, including the evolutionary origin and genetic regulation of beetle horns and the molecular mechanisms underlying sexual dimorphism.”

Sage Science Products:
The SageHLS instrument was used to size select DNA between 50-80 kb for 10X Genomics Chromium linked read analysis.

Author Affiliations:
Division of Evolutionary Developmental Biology, National Institute for Basic Biology, Okazaki, Japan
Department of Basic Biology, School of Life Science, The Graduate University for Advanced Studies, SOKENDAI, Okazaki, Japan
Division of Evolutionary Biology, National Institute for Basic Biology, Okazaki, Japan
Division of Integrated Omics Research, Research Center for Experimental Modeling of Human Disease, Kanazawa University, Kanazawa, Japan
Laboratory of Evolutionary Genomics, National Institute for Basic Biology, Okazaki, Japan
Trans-Omics Facility, National Institute for Basic Biology, Okazaki, Japan
Laboratory of Sericulture and Entomoresources, Graduate School of Bioagricultural Sciences, Nagoya University, Nagoya, Japan
URA Division, Office of Research and Academia-Government-Community Collaboration, Hiroshima University, Hiroshima, Japan
Department of Applied Biosciences, Graduate School of Agriculture, Kyoto University, Kyoto, Japan
Department of Biological Science, Faculty of Science, Shizuoka University, Shizuoka, Japan
Laboratory of Molecular Biotechnology, Graduate School of Bioagricultural Sciences, Nagoya University, Nagoya, Japan
Department of Agrobiological Resources, Faculty of Agriculture, Meijo University, Nagoya, Japan
Department of Integrative Biology, University of Wisconsin-Madison, Madison, WI, USA
Department of Biology, University of Naples Federico II, Naples, Italy
Institute of Agrobiological Sciences, National Agriculture and Food Research Organization, Tsukuba, Japan
Department of Entomology, Washington State University, Pullman, WA, USA
Division of Biological Sciences, The University of Montana, Missoula, MT, USA

Citation
Nature Scientific Reports
DOI:10.1038/s41598-023-35246-w


Targeted Phasing of 2-200 Kilobase DNA Fragments with a Short-Read Sequencer and a Single-Tube Linked-Read Library Method

March 2023
Authors:
Veronika Mikhaylova, Madison Rzepka, Tetsuya Kawamura, Yu Xia, Peter L. Chang, Shiguo Zhou, Long Pham, Naisarg Modi, Likun Yao, Adrian Perez-Agustin, Sara Pagans, T. Christian Boles, Ming Lei, Yong Wang, Ivan Garcia-Bassets, and Zhoutao Chen

Abstract:
“In the human genome, heterozygous sites are genomic positions with different alleles inherited from each parent. On average, there is a heterozygous site every 1-2 kilobases (kb). Resolving whether two alleles neighboring heterozygous positions are physically linked—that is, phased—is possible with a short-read sequencer if the sequencing library captures long-range information. TELL-Seq is a library preparation method based on millions of barcoded micro-sized beads that enables instrument-free phasing of a whole human genome in a single PCR tube. TELL-Seq incorporates a unique molecular identifier (barcode) to the short reads generated from the same high-molecular-weight (HMW) DNA fragment (known as ‘linked-reads’). However, genome-scale TELL-Seq is not cost-effective for applications focusing on a single locus or a few loci. Here, we present an optimized TELL-Seq protocol that enables the cost-effective phasing of enriched loci (targets) of varying sizes, purity levels, and heterozygosity. Targeted TELL-Seq maximizes linked-read efficiency and library yield while minimizing input requirements, fragment collisions on microbeads, and sequencing burden. To validate the targeted protocol, we phased seven 180-200 kb loci enriched by CRISPR/Cas9-mediated excision coupled with pulse-field electrophoresis, four 20 kb loci enriched by CRISPR/Cas9-mediated protection from exonuclease digestion, and six 2-13 kb loci amplified by PCR. The selected targets have clinical and research relevance (BRCA1, BRCA2, MLH1, MSH2, MSH6, APC, PMS2, SCN5A-SCN10A, and PIK3CA). These analyses reveal that targeted TELL-Seq provides a reliable way of phasing allelic variants within targets (2-200 kb in length) with the low cost and high accuracy of short-read sequencing. Lynch syndrome (LS), caused by heterozygous pathogenic variants affecting one of the mismatch repair (MMR) genes (MSH2, MLH1, MSH6, PMS2), confers moderate to high risks for colorectal, endometrial, and other cancers.
We describe a four-generation, 13-branched pedigree in which multiple LS branches carry the MSH2 pathogenic variant c.2006G>T (p.Gly669Val), one branch has this and an additional novel MSH6 variant c.3936_4001+8dup (intronic), and other non-LS branches carry variants within other cancer-relevant genes (NBN, MC1R, PTPRJ). Both MSH2 c.2006G>T and MSH6 c.3936_4001+8dup caused aberrant RNA splicing in carriers, including out-of-frame exon-skipping, providing functional evidence of their pathogenicity. MSH2 and MSH6 are co-located on Chr2p21, but the two variants segregated independently (mapped in trans) within the digenic branch, with carriers of either or both variants. Thus, MSH2 c.2006G>T and MSH6 c.3936_4001+8dup independently confer LS with differing cancer risks among family members in the same branch. Carriers of both variants have near 100% risk of transmitting either one to offspring. Nevertheless, a female carrier of both variants did not transmit either to one son, due to a germline recombination within the intervening region. Genetic diagnosis, risk stratification, and counseling for cancer and inheritance were highly individualized in this family. The finding of multiple cancer-associated variants in this pedigree illustrates a need to consider offering multicancer gene panel testing, as opposed to targeted cascade testing, as additional cancer variants may be uncovered in relatives.”

Sage Science Products:
The HLS-CATCH process (SageHLS system) was used to purify the larger (180-200kb) targets.

Author Affiliations:

Universal Sequencing Technology Corp., Carlsbad, CA
Sage Science Inc., Beverly, MA
Department of Medicine, University of California, San Diego, La Jolla, CA
Department of Medical Sciences, School of Medicine, University of Girona, Girona, Spain
Universal Sequencing Technology Corp., Canton, MA

BioRxiv preprint
DOI: 10.1101/2023.03.05.531179


Innovations in double digest restriction-site associated DNA sequencing (ddRAD-Seq) method for more efficient SNP identification

February 2023

Authors:
Zenaida V. Magbanua, Chuan-Yu Hsu, Olga Pechanova, Mark Arick II, Corrinne E. Grover, Daniel G. Peterson

Abstract:
“We present an improved ddRAD-Seq protocol for identifying single nucleotide polymorphisms (SNPs). It utilizes selected restriction enzyme digestion fragments, quick acting ligases that are neutral with the restriction enzyme buffer eliminating buffer exchange steps, and adapters designed to be compatible with Illumina index primers. Library amplification and barcoding are completed in one PCR step, and magnetic beads are used to purify the genomic fragments from the ligation and library generation steps. Our protocol increases the efficiency and decreases the time to complete a ddRAD-Seq experiment. To demonstrate its utility, we compared SNPs from our protocol with those from whole genome resequencing data from Gossypium herbaceum and Gossypium arboreum. Principal component analysis demonstrated that the variability of the combined data was explained by the genotype (PC1) and methodology applied (PC2). Phylogenetic analysis showed that the SNPs from our method clustered with SNPs from the resequencing data of the corresponding genotype. Sequence alignments illustrated that for homozygous loci, more than 90% of the SNPs from the resequencing data were discovered by our method. Our analyses suggest that our ddRAD-Seq method is reliable in identifying SNPs suitable for phylogenetic and association genetic studies while reducing cost and time over known methods.”

Sage Science Products:
The SageELF was used to size select library fractions to 295 and 614 bp.

Author Affiliations:
Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, MS
Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA

Analytical Biochemistry
DOI: 10.1016/j.ab.2022.115001


Utility of long-read sequencing for All of Us

January 2023

Authors:
M. Mahmoud, Y. Huang, K. Garimella, P. A. Audano, W. Wan, N. Prasad, R. E. Handsaker, S. Hall, A. Pionzio, M. C. Schatz, M. E. Talkowski, E. E. Eichler, S. E. Levy, F. J. Sedlazeck

Abstract:
“The All of Us (AoU) initiative aims to sequence the genomes of over one million Americans from diverse ethnic backgrounds to improve personalized medical care. In a recent technical pilot, we compared the performance of traditional short-read sequencing with long-read sequencing in a small cohort of samples from the HapMap project and two AoU control samples representing eight datasets. Our analysis revealed substantial differences in the ability of these technologies to accurately sequence complex medically relevant genes, particularly in terms of gene coverage and pathogenic variant identification. We also considered the advantages and challenges of using low coverage sequencing to increase sample numbers in large cohort analysis. Our results show that HiFi reads produced the most accurate results for both small and large variants. Further, we present a cloud-based pipeline to optimize SNV, indel and SV calling at scale for long-reads analysis. These results will lead to widespread improvements across AoU.”

Sage Science Products:
The PippinHT was used to size select PacBio HiFi libraries with a target range between 15-22kb.

Author Affiliations:
Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX
Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX
Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA
The Jackson Laboratory for Genomic Medicine, Farmington, CT
Discovery Life Sciences, Huntsville, AL
Department of Genetics, Harvard Medical School, Boston, MA
Department of Computer Science, Johns Hopkins University, Baltimore, MD
Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA
Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA
Genome Sciences, University of Washington, Seattle, WA
Howard Hughes Medical Institute, University of Washington, Seattle, WA
HudsonAlpha Institute for Biotechnology, Huntsville, AL
Department of Computer Science, Rice University, Houston, TX

bioRxiv preprint
DOI: 10.1101/2023.01.23.525236
