Friday, October 26, 2012

Sequencing low diversity libraries on Illumina MiSeq

Next-generation sequencing is increasingly used to investigate sample diversity in amplicon libraries, be it cancer panels, viral species identification or microbial community profiling, and has significantly increased estimates of microbial diversity. As demand to sequence more samples has grown, researchers have looked to the Illumina MiSeq platform for 16S analysis. Using paired-end sequencing at 150 bases (soon to increase to 2 x 250 bases) and generating amplicons of ~300 bp, it is possible to overlap the two reads and generate long pseudo-reads. One issue with the Illumina platform is that sequencing low-diversity libraries can result in low yields and low per-base quality. Nick Loman has developed a simple work-around, spiking in a genomic, higher-diversity sample, and addresses the three main areas in which low-diversity samples can cause problems on the MiSeq.

(1) Focusing, at every cycle. The MiSeq focuses on the T channel, with a secondary focus on the C channel. A spike-in of as little as 5% PhiX provides sufficient diversity to prevent any focusing issues, irrespective of the library composition.

(2) Template building (cycles 1 through 4) and registration (every cycle). Some signal must be present in each channel for RTA to perform template generation and registration properly. Again, as little as 5% PhiX prevents these problems as long as cluster density is <700,000.

(3) Phasing and matrix estimation, cycles 1 through 12. The average color matrix is measured for cycles 1 through 4, and phasing is determined over the first 12 cycles. Low-diversity samples can cause problems with both if the intensity is not evenly distributed across all channels. A larger spike-in of PhiX may be required to address this problem.

Details of the configuration used with MiSeq Control Software 1.2.3 and RTA 1.14.23 are given, with the following disclaimer: “This is not a configuration supported by Illumina, so use it at your own risk.”
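To make the low-diversity problem concrete, here is a minimal Python sketch that tabulates base composition over the first 12 cycles of a set of reads and flags cycles in which some base is almost absent. It is only an illustration, not part of Nick Loman's work-around or of RTA: the function names and the `min_fraction` threshold are hypothetical, and the reads are toy strings rather than real MiSeq output.

```python
from collections import Counter

BASES = "ACGT"


def per_cycle_base_fractions(reads, n_cycles=12):
    """For each of the first n_cycles, return the fraction of A, C, G and T calls."""
    fractions = []
    for cycle in range(n_cycles):
        # Count the base called at this cycle in every read long enough to have one.
        counts = Counter(read[cycle] for read in reads if len(read) > cycle)
        total = sum(counts[base] for base in BASES)
        fractions.append(
            {base: (counts[base] / total if total else 0.0) for base in BASES}
        )
    return fractions


def low_diversity_cycles(fractions, min_fraction=0.01):
    """Return 1-based cycle numbers where at least one base is below min_fraction.

    min_fraction is an arbitrary illustrative threshold, not an Illumina criterion.
    """
    return [
        cycle + 1
        for cycle, frac in enumerate(fractions)
        if min(frac.values()) < min_fraction
    ]


if __name__ == "__main__":
    # Toy data: an amplicon library in which every read starts with the same bases.
    amplicon_reads = ["ACGTACGTACGT"] * 100
    flagged = low_diversity_cycles(per_cycle_base_fractions(amplicon_reads))
    print("Cycles with a near-empty channel:", flagged)
```

Running the same check on reads from a library with a PhiX spike-in would show whether every base is represented at every early cycle, which is the condition that points (1) to (3) depend on.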

Wednesday, October 17, 2012

Genome-wide ribosome profiling reveals complex translational regulation in response to oxidative stress

Oxidative stress causes multiple gene expression changes in a cell, including increased loading of ribosomes onto the transcripts of genes that can modulate gene expression. Gerashchenko et al. created an oxidative stress condition in yeast (Saccharomyces cerevisiae) by exposing cells to hydrogen peroxide. Combining ribosome profiling with standard mRNA-Seq allows changes in expression level to be evaluated by quantifying partial (ribosome-protected) transcripts. Ribosome profiling uses differential RNA isolation/RNA protection to show not only changes in the amount of gene-specific RNAs but also the location of the protected fragments along the length of the mRNA. Ribosomes were isolated, and the mRNA/ribosome complexes were treated with an RNase to remove the naked RNA between bound ribosomes. The remaining ribosome-protected RNAs were then extracted, yielding small RNA fragments that can be made into libraries and then sequenced, quantified and characterized. After extraction, the protected RNAs were tailed with poly(A) polymerase, followed by reverse transcription with primers that contained sequencing barcode adapters. The cDNA products (~92 nucleotides) were circularized using CircLigase II ssDNA Ligase and, after removal of linear DNA, PCR amplified using primers that contained Illumina TruSeq sequences. The 120-bp PCR products were quantified, applied to a HiSeq flow cell and sequenced; the reads were aligned to an S. cerevisiae reference genome, followed by transcript quantification and annotation. This allowed the number and types of genes up-regulated and down-regulated by oxidative stress to be determined. Under stress, ribosomes loaded onto certain open reading frames at locations 5' to the normal translation starts.
This ribosome occupancy included mRNAs with non-AUG translation start codons at the beginning of the ORF, and the duration of the protein elongation steps increased. Many different genes started translation at sequences upstream of the AUG start site, and this was determined to be a translation control mechanism in response to stress. Ribosome footprinting analyzes only the short RNAs that remain protected by protein after RNA isolation. These protected RNA fragments can be readily sequenced and quantified to determine how much ribosome loading occurs on a given transcript and where, and thus the effect of oxidative stress on gene transcription and protein translation.
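As a rough illustration of how footprint positions can reveal ribosome loading 5' to the annotated start, the Python sketch below computes the fraction of footprints on one transcript that lie upstream of the ORF. It is not the analysis pipeline of Gerashchenko et al.: the function name, the coordinate convention (0-based positions of footprint 5' ends along the transcript) and the toy numbers are all assumptions made for the example.

```python
def upstream_fraction(footprint_starts, orf_start, orf_end):
    """Fraction of footprints whose 5' end lies upstream of the annotated ORF.

    footprint_starts: 0-based transcript coordinates of footprint 5' ends.
    orf_start, orf_end: 0-based, half-open coordinates of the annotated ORF.
    Footprints downstream of the ORF are ignored.
    """
    upstream = sum(1 for pos in footprint_starts if pos < orf_start)
    in_orf = sum(1 for pos in footprint_starts if orf_start <= pos < orf_end)
    total = upstream + in_orf
    return upstream / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical footprint positions for one transcript under two conditions.
    control = [120, 150, 300, 410, 600, 750]
    stressed = [20, 45, 80, 150, 300, 410]
    orf_start, orf_end = 100, 1000

    print("control :", upstream_fraction(control, orf_start, orf_end))
    print("stressed:", upstream_fraction(stressed, orf_start, orf_end))
```

A higher upstream fraction in the stressed sample than in the control would be consistent with the shift toward upstream translation starts described above; a real analysis would also need to normalize for sequencing depth and mRNA abundance.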

Gerashchenko MV et al. (2012). Genome-wide ribosome profiling reveals complex translational regulation in response to oxidative stress. Proceedings of the National Academy of Sciences of the United States of America PMID: 23045643

Thursday, October 4, 2012

Sequencing Ancient DNA

Next-generation sequencing continues to address previously unanswered questions. In a recent publication in Science, Meyer et al. describe the sequencing of DNA isolated from a fragment of bone more than 50,000 years old, found in Siberia’s Denisova Cave. Ancient DNA is highly degraded and generally breaks down to single strands during purification, so it was necessary to start with single-stranded DNA (ssDNA). The researchers used a novel enzyme, CircLigase™ II ssDNA Ligase, to attach a single-stranded oligomer tag to the ssDNA fragments isolated from the bone. The tagged DNA was immobilized and enzymatically copied. An Illumina® Genome Analyzer IIx platform was used with 76-bp, double-index reads. As a result, more than 99% of the nucleotides were sequenced at least 10 times and 92% were sequenced at least 20 times. The quality of the genome information made it possible to determine that the Denisovan female had brown eyes, hair and skin and had 23 pairs of chromosomes, consistent with modern humans. It also sets the stage for dating fossils by their genomes.
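Coverage figures such as "sequenced at least 10 times" are fractions of positions at or above a depth threshold. The short Python sketch below shows that arithmetic on a toy list of per-position depths; the function name and the numbers are illustrative assumptions, not data from Meyer et al.

```python
def fraction_covered(depths, min_depth):
    """Fraction of positions whose sequencing depth is at least min_depth."""
    if not depths:
        return 0.0
    return sum(1 for depth in depths if depth >= min_depth) / len(depths)


if __name__ == "__main__":
    # Hypothetical per-position depths for a tiny region.
    depths = [5, 10, 15, 20, 25]
    for threshold in (10, 20):
        print(f">= {threshold}x: {fraction_covered(depths, threshold):.0%}")
```

In practice the per-position depths would come from an alignment file rather than a hand-written list, but the calculation is the same.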


Meyer M et al. (2012). A High-Coverage Genome Sequence from an Archaic Denisovan Individual. Science (New York, N.Y.) PMID: 22936568