NEXT GENERATION GENOTYPING (NGG) FOR POPULATION SCALE GENOMIC STUDIES USING RIPTIDE™ DNA LIBRARY PREPARATION
Keith Brown,3 Azeem Siddique,1,3 Gaia Suckow,1,3 Nils Homer,2 Jay Carey,2 Phillip Ordoukhanian,1,3 Steve Head,1,3 Joseph Pickrell,5 Ryan Kim5
1The Scripps Research Institute, La Jolla, CA, USA; 2Fulcrum Genomics, Somerville, MA, USA; 3iGenomX, Carlsbad, CA, USA; 4Macrogen, Rockville, MD, USA; 5Gencove, New York, NY, USA
Whole Genome Shotgun Sequencing has become the tool of choice for microbial genome analysis. Rapidly declining costs of sequencing, data analysis, data storage and database access will continue to drive adoption. Library construction has not kept pace with these advancements, with costs of preparing a next generation sequencing (NGS) library often exceeding the cost of sequencing. Popular methods of library construction for NGS include fragmentation, end-repair and adapter ligation, and transposase-mediated adapter insertion. The RIPTIDE High Throughput Rapid DNA Library Prep is distinctly different in its approach because it relies on polymerase-mediated primer extension for library preparation. The initial step of the prep, involving primer extension with barcoded random primers, is performed in a 96-well plate. Each well of the plate contains primers with a unique barcode; consequently, the library generated from each well is uniquely identifiable and can be bioinformatically traced back to the original sample after sequencing. Following this step, the primer extension products are combined into one pool and all subsequent steps, including second strand synthesis and PCR, are performed with the single pool. The library prep is fast, easily automatable and can be tuned to genomes of high and low GC content. With automation, 960 samples can be processed in a single day. The technology will aid genetic research by helping to increase sample throughput and by reducing processing steps and operating costs. Presented here is RipTide High Throughput Rapid DNA Library Prep sequencing data generated from multiple microbial genomes.
Schematic of the RipTide High-Throughput Library Prep
(A) The DNA of interest is denatured and barcoded random primers with partial Illumina P5 adapter sequences (termed “Primer A”) are used for primer extension with a DNA polymerase. Since each primer extension reaction occurs in one well of a 96-well plate and each well contains a uniquely barcoded primer, the library generated from each well is uniquely identifiable. The nucleotide mix contains a small fraction of dideoxynucleotides, which causes the primer extension reaction to self-terminate at lengths amenable to sequencing. (B) Because the dideoxynucleotides are covalently bound to biotin, the products of primer extension can be captured by streptavidin beads while other DNA fragments are washed away. (C) A second round of primer extension with a random primer with partial P7 adapter sequence (termed “Primer B”) is performed on the bead-captured DNA molecules to create the complementary strand. The newly created strand has Illumina adapter sequences at both ends of the molecule. (D) PCR is performed with full-length Illumina adapters. (E) This generates a double-stranded DNA library, which must undergo a final size selection before it can be loaded on a sequencing instrument.
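The self-terminating extension in step (A) can be illustrated with a simple model. The sketch below assumes that every nucleotide incorporation has the same, independent probability of being a dideoxynucleotide, so product lengths follow a geometric distribution with mean 1/p. The termination probability used here is a hypothetical value, not one taken from the RipTide protocol, and a real polymerase may discriminate between dNTPs and ddNTPs.

```python
import random


def expected_extension_length(p_terminate: float) -> float:
    """Mean product length (in nucleotides, terminator included).

    Assumption: each incorporation independently terminates with
    probability p_terminate, giving a geometric distribution (mean 1/p).
    """
    if not 0 < p_terminate <= 1:
        raise ValueError("p_terminate must be in (0, 1]")
    return 1.0 / p_terminate


def simulate_extension_lengths(p_terminate: float, n_molecules: int, seed: int = 0) -> list:
    """Simulate product lengths for n_molecules primer extension events."""
    rng = random.Random(seed)
    lengths = []
    for _ in range(n_molecules):
        length = 1  # the first incorporated nucleotide
        # Keep extending until a terminating nucleotide is drawn.
        while rng.random() >= p_terminate:
            length += 1
        lengths.append(length)
    return lengths


if __name__ == "__main__":
    p = 0.005  # hypothetical ddNTP incorporation probability
    simulated = simulate_extension_lengths(p, 10_000)
    print("expected mean length:", expected_extension_length(p))
    print("simulated mean length:", sum(simulated) / len(simulated))
```

Under this model, lowering the dideoxynucleotide fraction lengthens the products, which is one way to picture how the reaction can be kept within a range suitable for sequencing.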
Read count data for 96 replicates of three bacterial samples. No yield normalization of individual replicates performed.
50 ng of genomic DNA from a single organism was used as input for 96 primer extension reactions in a 96-well plate. Individually barcoded products of the primer extension reaction were combined for downstream library preparation steps. The final library was size selected and run on an Illumina NextSeq instrument for 2 x 150 nucleotide paired-end sequencing. Data was demultiplexed by index barcode (i.e., the barcode unique to each plate/organism) and by in-line barcode (i.e., sample barcode) to obtain read count data. The histograms depict sequencing read numbers for 96 replicate samples of genomic DNA from Clostridium difficile, Escherichia coli and Burkholderia cepacia, a low GC, medium GC and high GC bacterium, respectively.
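The second demultiplexing level (in-line sample barcode) can be sketched as follows. This is a minimal illustration, not the pipeline used for the poster: it assumes that the in-line barcode occupies the first bases of the read, that it has a fixed length of 8 nt, and that only exact matches are accepted. The barcode sequences and well names are hypothetical.

```python
from collections import Counter

# Hypothetical value: the source does not state the in-line barcode length.
BARCODE_LENGTH = 8


def count_reads_per_sample(reads, barcode_to_sample, barcode_length=BARCODE_LENGTH):
    """Count reads per sample from the in-line barcode at the read start.

    reads: iterable of read sequences (strings) from one index-demultiplexed plate.
    barcode_to_sample: dict mapping barcode sequence -> sample (well) name.
    Reads whose leading bases match no known barcode are counted as "unassigned".
    """
    counts = Counter()
    for read in reads:
        barcode = read[:barcode_length].upper()
        sample = barcode_to_sample.get(barcode, "unassigned")
        counts[sample] += 1
    return counts


if __name__ == "__main__":
    barcodes = {"ACGTACGT": "well_A1", "TTGCAAGC": "well_A2"}  # hypothetical barcodes
    example_reads = ["ACGTACGTGGATCCA", "TTGCAAGCCATTGAC", "NNNNNNNNACGTACG"]
    for sample, n in sorted(count_reads_per_sample(example_reads, barcodes).items()):
        print(sample, n)
```

The per-well counts returned by such a function correspond to the values plotted in the read count histograms.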
Key sequencing metrics of the RipTide libraries made from 96 replicates of C. difficile, E. coli and B. cepacia (as described in the legend of the previous figure) and additional libraries made from Staphylococcus aureus, Helicobacter pylori, Klebsiella pneumoniae, Pseudomonas putida and Micrococcus luteus are shown in these two tables. Sequencing data was demultiplexed and analyzed for alignment rates, duplication rates and coverage statistics. All data shown is the mean and standard deviation of 96 demultiplexed replicates of each organism. All of the libraries (960 in total) were sequenced on a single NextSeq 2 x 150 bp flow cell.
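The per-organism summary described above (mean and standard deviation across replicates) can be computed as in the sketch below. It assumes the sample standard deviation (n - 1 denominator), which the source does not specify. The coefficient of variation is an extra convenience for judging replicate uniformity and is not a metric reported in the tables. The example values are made up for illustration.

```python
from statistics import mean, stdev


def summarize_replicates(values):
    """Return n, mean, sample SD and coefficient of variation for one metric.

    values: per-replicate measurements of a single metric (e.g. alignment
    rate or read count) for one organism.
    """
    values = list(values)
    if len(values) < 2:
        raise ValueError("at least two replicates are needed for a standard deviation")
    m = mean(values)
    sd = stdev(values)  # sample standard deviation (assumption)
    cv = sd / m if m != 0 else float("nan")
    return {"n": len(values), "mean": m, "sd": sd, "cv": cv}


if __name__ == "__main__":
    # Illustrative alignment rates (percent); not data from the poster.
    alignment_rates = [97.1, 96.8, 97.4, 96.9]
    print(summarize_replicates(alignment_rates))
```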
Microbiome: Read alignment data for six mixed DNA samples
(A) 50 ng of a DNA standard representing a community of ten microbial organisms (from Zymo Research) was used as input in three RipTide library preps, each performed with twelve replicate samples. (B) A simulated microbiome was created by mixing equal quantities of genomic DNA from seven different bacterial species. 50 ng of this DNA mixture was used for RipTide library preparation, as in (A). Library preps in (A) and (B) were performed with (1) an A primer with low GC content in the 3’ random sequence, (2) an A primer with high GC content in the random sequence, and (3) a 1:1 combination of the two primers. In both graphs, the left-most column represents the actual species composition of the DNA standard. Adjacent bars depict the species composition of the RipTide libraries after sequencing and alignment of read data to the species in question. From left to right, the datasets constitute (i) combined data from all three sets of library preps, (ii) data from library preps using the 1:1 combination of low and high GC primers, (iii) data from library preps using the low GC primer only, (iv) data from library preps using the high GC primer only. Data for all twelve replicates of each condition is presented here. All 72 libraries were sequenced on a single Illumina MiSeq 2 x 300 bp flow cell.
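The comparison between the known composition and the composition recovered from sequencing can be sketched as below. This is an illustrative calculation under stated assumptions: composition is taken as each species' share of aligned reads, with no correction for genome size or mappability, and the deviation measure is a hypothetical choice rather than a statistic used in the poster. Species names and counts are placeholders.

```python
def species_composition(aligned_read_counts):
    """Convert per-species aligned read counts into fractions of the total."""
    total = sum(aligned_read_counts.values())
    if total == 0:
        raise ValueError("no aligned reads")
    return {species: count / total for species, count in aligned_read_counts.items()}


def max_absolute_deviation(observed, expected):
    """Largest per-species difference between observed and expected fractions.

    Species missing from either dictionary are treated as having fraction 0.
    """
    species = set(observed) | set(expected)
    return max(abs(observed.get(s, 0.0) - expected.get(s, 0.0)) for s in species)


if __name__ == "__main__":
    # Placeholder counts and expected fractions; not data from the poster.
    counts = {"species_1": 4800, "species_2": 5200}
    expected = {"species_1": 0.5, "species_2": 0.5}
    observed = species_composition(counts)
    print(observed)
    print("max deviation:", max_absolute_deviation(observed, expected))
```

Running the same calculation separately for the low GC primer, the high GC primer and the 1:1 mix gives one number per condition that summarizes how closely each library reproduces the input community.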
• Fundamentally different approach to library preparation allows for high-throughput processing of samples in a fast, convenient manner.
• NO fragmentation, NO end repair, NO ligation or associated clean-up steps are required.
• Tuneable to samples of different GC content.
• Ideal for preparation of libraries from plasmids, synthetic constructs, small genomes and microbiome samples.
• Original content of complex multi-species samples is well-preserved in final sequencing data.