Lab protocol of the Brief Communication, "epiGBS: a reference-free reduced representation bisulfite sequencing technique."
EpiGBS lab protocol
This work is licensed under a CC BY-NC 3.0 License
This protocol has been posted on Protocol Exchange, an open repository of community-contributed protocols sponsored by Nature Portfolio. These protocols are posted directly on the Protocol Exchange by authors and are made freely available to the scientific community for use and comment.
posted 15 Feb, 2016
We describe a reduced representation bisulfite sequencing method, named epiGBS, for cost-effective exploration and comparative analysis of DNA methylation and genetic variation in hundreds of samples of species de novo, without requiring a reference genome. This method uses genotyping-by-sequencing of bisulfite-converted DNA, followed by reliable de novo reference construction, mapping, variant calling and SNP/methylation variation distinction. The output can be loaded directly in IGV for visualization, and in RnBeads for analysis of differential methylation.
DNA isolation kit: Macherey-Nagel NucleoSpin Plant II
Qubit® 2.0 Fluorometric dsDNA HS Assay Kit (Q32851 Life technologies)
PstI (NEB, R0140S)
BSA (NEB, B9000S)
T4 DNA ligase (NEB, M0202M/L)
non-phosphorylated adapters (see supplementary data manuscript)
Qiaquick PCR cleanup (Qiagen, 28104)
Agencourt AMPure XP (Beckman Coulter, A63880)
5-methylcytosine dNTP Mix (Zymo Research, D1030)
DNA polymerase I (NEB, M0209S)
EZ DNA Methylation-Lightning™ Kit (Zymo Research)
KAPA HiFi HotStart Uracil+ ReadyMix (Kapa Biosystems)
Illumina PE PCR Primer (see supplementary data manuscript)
High Sensitivity DNA chip on a 2100 Bioanalyzer system (Agilent)
HiSeq v4 reagents
Qubit® 2.0 Fluorometer
Biometra Tgradient (ramping speed 5 degrees / s)
2100 Bioanalyzer system (Agilent)
Illumina HiSeq 2500 sequencer; Rapid Run Mode Paired-End sequencing; HiSeq Control Software (v2.2.38)
A: DNA extraction
B: PstI restriction digestion
C: Adapter ligation
To minimize the possibility of misidentifying samples as a result of sequencing or adapter synthesis errors, all pairwise combinations of barcodes differed by a minimum of three mutational steps. Barcode lengths were varied from 4 to 6 bp to maximize the nucleotide balance at each position in the overall set of sequencing reads (manuscript: Supplementary Fig. 1d). Samples were pooled and processed per species after ligation.
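The three-step separation between barcodes can be checked computationally. The sketch below is only an illustration: it assumes that "mutational steps" can be approximated by the Levenshtein edit distance (substitutions, insertions and deletions), which suits barcodes of unequal length, but the manuscript may use a different definition. The function names and the example barcodes are hypothetical and are not the barcodes from the supplementary data.

```python
from itertools import combinations


def edit_distance(a, b):
    """Levenshtein distance: minimum number of substitutions,
    insertions and deletions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def pairs_too_close(barcodes, min_steps=3):
    """Return (barcode_a, barcode_b, distance) for every pair
    separated by fewer than min_steps edits."""
    flagged = []
    for a, b in combinations(barcodes, 2):
        d = edit_distance(a, b)
        if d < min_steps:
            flagged.append((a, b, d))
    return flagged


# Hypothetical barcodes of 4 to 6 nt, for illustration only.
example_barcodes = ["AACT", "CCAGA", "GGTCAC", "AACG"]
for a, b, d in pairs_too_close(example_barcodes):
    print(f"{a} and {b} differ by only {d} step(s)")
```

An empty result from pairs_too_close would indicate that the set satisfies the chosen minimum distance.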
D: Cleanup and size selection
E: Nick translation
Due to the use of non-phosphorylated adapters, epiGBS libraries contain nicks between the 3’ fragment overhang and the 5’ non-phosphorylated adapter nucleotide.
Optional GBS PCR
At this stage an optional GBS PCR can be performed to check the library quality.
F: Bisulfite treatment and purification
For bisulfite treatment 20 µL of the nick-translated library was used.
G: EpiGBS PCR
Perform library amplification per species in four individual 10 µL reactions, each containing 1 µL ssDNA template, 5 µL KAPA HiFi HotStart Uracil+ ReadyMix (Kapa Biosystems) and 3 pmol of each Illumina PE PCR Primer (manuscript: Supplementary Table 1b). Temperature cycling consists of 95°C for 3 min, followed by 18 cycles of 98°C for 10 s, 65°C for 15 s and 72°C for 15 s, with a final extension step at 72°C for 5 min.
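As a planning aid, the totals for the four replicate reactions and the programmed hold time of the cycling profile can be worked out as in the sketch below. It uses only the volumes and times stated above. It assumes no pipetting overage and ignores ramping time, so the real run will take longer. The function and variable names are illustrative.

```python
def program_hold_seconds(initial_s, cycles, step_seconds, final_s):
    """Sum of programmed hold times, excluding ramping between steps."""
    return initial_s + cycles * sum(step_seconds) + final_s


N_REACTIONS = 4  # four individual 10 µL reactions per species

# 95°C 3 min; 18 x (98°C 10 s, 65°C 15 s, 72°C 15 s); 72°C 5 min
hold_s = program_hold_seconds(3 * 60, 18, (10, 15, 15), 5 * 60)

# Per-species totals, with no overage added (an assumption).
totals = {
    "template_uL": 1 * N_REACTIONS,
    "readymix_uL": 5 * N_REACTIONS,
    "each_primer_pmol": 3 * N_REACTIONS,
    "final_volume_uL": 10 * N_REACTIONS,
}

print(hold_s / 60)  # 20.0 minutes of programmed hold time
print(totals)
```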
Pool the replicate PCR products and quantify using a Qubit® dsDNA HS Assay Kit (Life Technologies).
Assess the quality of the libraries by analyzing 1 µL on a High Sensitivity DNA chip on a 2100 Bioanalyzer system (Agilent). Libraries are considered suitable for sequencing if the majority of DNA fragments are between 150 and 400 bp and no adapter dimers are found. Typically, an epiGBS PCR of 18 cycles on a non-pooled plant sample yields 3-12 ng/µL of PCR product.
When the ‘per species’ pooled libraries pass quality control, they can be further pooled according to concentration and the number of samples in each species pool, so that each individual sample is expected to yield an equal number of clusters on the Illumina flowcell.
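One way to work out such a pooling scheme is sketched below. It assumes that all species libraries have similar fragment size distributions, so that mass (ng) can stand in for molarity, and that each sample should contribute the same mass to the final pool. The function name, the target mass per sample and the example values are hypothetical.

```python
def pooling_volumes(pools, ng_per_sample):
    """Volume (µL) to take from each species library.

    pools maps a library name to (concentration in ng/µL,
    number of samples in that library). Each library contributes
    n_samples * ng_per_sample nanograms to the final pool.
    """
    volumes = {}
    for name, (conc_ng_per_ul, n_samples) in pools.items():
        if conc_ng_per_ul <= 0:
            raise ValueError(f"non-positive concentration for {name}")
        volumes[name] = n_samples * ng_per_sample / conc_ng_per_ul
    return volumes


# Hypothetical Qubit readings and sample numbers, for illustration only.
example_pools = {
    "species_1": (6.0, 24),  # 6 ng/µL, 24 samples
    "species_2": (3.0, 12),  # 3 ng/µL, 12 samples
}
print(pooling_volumes(example_pools, ng_per_sample=0.5))
```

If fragment size distributions differ markedly between libraries, converting to molarity with the Bioanalyzer average fragment length first would be the more careful choice.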
Optionally, perform a ‘nano run’ on the Illumina MiSeq to quantify the expected relative read count yield per sample. Based on the read counts obtained from this run, pool the individual nick-translated digestion-ligations such that an equal number of reads is expected per individual.
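The repooling calculation can be sketched as follows. It assumes that the nano run was performed on a pool in which every sample was added at the same volume, and that the read count of a sample scales linearly with the volume pooled. Under those assumptions the new volume of each sample is inversely proportional to its nano-run read count. The function name, read counts and total volume are hypothetical.

```python
def repooling_volumes(read_counts, total_volume_ul):
    """Volumes (µL) inversely proportional to nano-run read counts,
    scaled so that they sum to total_volume_ul."""
    if any(n <= 0 for n in read_counts.values()):
        raise ValueError("every sample needs a positive read count")
    weights = {sample: 1.0 / n for sample, n in read_counts.items()}
    scale = total_volume_ul / sum(weights.values())
    return {sample: w * scale for sample, w in weights.items()}


# Hypothetical nano-run read counts, for illustration only.
nano_reads = {"sample_1": 1000, "sample_2": 2000, "sample_3": 4000}
for sample, vol in repooling_volumes(nano_reads, total_volume_ul=14.0).items():
    print(f"{sample}: {vol:.2f} µL")
```

Samples with no reads in the nano run cannot be rescaled this way and are rejected with an error, since they need to be checked individually.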
Finally, perform Rapid Run Mode Paired-End sequencing on an Illumina HiSeq 2500 sequencer using the HiSeq v4 reagents and the latest version of the HiSeq Control Software (v2.2.38), which optimizes the sequencing of low-diversity libraries (http://res.illumina.com/documents/products/technotes/technote-hiseq-low-diversity.pdf). As the first five cycles of a sequencing run are used to calculate the color matrix, our barcode design achieves almost perfect balance of the first 5 nucleotides when equal numbers of sequences are obtained per forward read or “A” barcode. The reverse read or “B” barcodes do not have this requirement, hence only barcodes of four nucleotides were used.
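The nucleotide balance over the first five cycles can be inspected with a short script such as the sketch below. It takes the first bases of the expected forward reads (barcode followed by whatever sequence comes next in the read) and reports the fraction of each base per cycle, assuming an equal number of reads per barcode. The function name and the example read starts are hypothetical and are not the barcodes from the manuscript.

```python
from collections import Counter


def base_composition(read_starts, n_cycles=5):
    """Fraction of A, C, G and T at each of the first n_cycles positions.

    read_starts holds one sequence per barcode, each at least
    n_cycles long, which corresponds to equal read numbers per barcode.
    """
    if any(len(seq) < n_cycles for seq in read_starts):
        raise ValueError("every sequence must cover all cycles")
    composition = []
    for pos in range(n_cycles):
        counts = Counter(seq[pos] for seq in read_starts)
        total = sum(counts.values())
        composition.append({base: counts.get(base, 0) / total
                            for base in "ACGT"})
    return composition


# Hypothetical read starts, for illustration only.
example_starts = ["ACGTA", "CGTAC", "GTACG", "TACGT"]
for cycle, fractions in enumerate(base_composition(example_starts), start=1):
    print(cycle, fractions)
```

A perfectly balanced design gives 0.25 for every base in every cycle.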
Methods Csp6I Laboratory work
Construct the Csp6I epiGBS libraries in the same way as the PstI epiGBS libraries, with the following modifications. The restriction digestion reaction contained 1x FD buffer and 4 µL (40 units) of Csp6I (ThermoFisher Scientific, FD0214). The ligation reaction contained 2400 pg of both A and B adapters (both adjusted for the Csp6I sticky end). Whereas the PstI protocol used fully methylated adapters (both strands I and II methylated), the Csp6I protocol used hemi-methylated adapters: the adapter strands that are resynthesized by nick translation (incorporating 5mC dNTPs) were not methylated, because all cytosines in these strands are replaced by 5mC during nick translation (manuscript: see Supplementary Fig. 10). Final library amplification for Csp6I yielded 4-8 ng/µL of product for an epiGBS PCR of 18 cycles on a library containing only Arabidopsis sample A29.
Standard: 3 days excluding DNA isolation
Using optional MiSeq nano sequencing before final pooling: 4 days excluding DNA isolation
Complete digestion of the gDNA is essential for this protocol. Digestion can be inhibited by poor-quality DNA and by ethanol residues in the gDNA sample.
The best way to approach this protocol is to start early in the week so that the whole protocol can be performed in a 3-day sequence:
day 1: digestion;
day 2: ligation;
day 3: pooling/purification, size selection, nick repair, BS conversion and purification, epiGBS PCR.
The overnight digestion / ligation might not be necessary for all species.
When pooling the adapter-ligated reactions, only a subsample is pooled (the amount depends on the number of samples). If a perfect read distribution is wanted, an optional MiSeq nano run can be performed, with subsequent repooling according to individual read counts.
PCR will give sufficient product after 14 to 18 PCR cycles, depending on the number of samples pooled and the species used.
See also figure 1a [step 8] of manuscript: Lab protocol for typical epiGBS library