Preparation of reagents
a. The TSO oligo is reconstituted in “THE RNA Storage Solution” at a concentration of 1200 µM. The manufacturer's information sheet usually gives the reconstitution volume for a 100 µM oligo solution. To obtain a 1200 µM solution instead, use one twelfth (1/12) of the volume recommended on the information sheet. Then dilute 1 µl of the 1200 µM stock in 99 µl of RNA storage solution (100X dilution; final concentration: 12 µM) and store in aliquots of 5.6 µl per tube at -80 °C. The TSO ribonucleotides are prone to degradation, and their loss will lead to a considerable reduction or absence of cDNA yield.
b. The Poly-T primer is reconstituted in nuclease-free H2O at a concentration of 1200 µM. As for the TSO, the information sheet usually gives the reconstitution volume for a 100 µM solution, so use one twelfth (1/12) of the recommended volume to obtain 1200 µM. Then dilute 1 µl of the 1200 µM stock in 99 µl of nuclease-free H2O (100X dilution; final concentration: 12 µM) and store in aliquots of 7 µl per tube. The aliquots can be stored at -80 °C.
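The reconstitution and dilution arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the sheet volume of 240 µl is a hypothetical value, not one taken from any manufacturer's sheet.

```python
def reconstitution_volume_ul(sheet_volume_ul, sheet_conc_um=100.0, target_conc_um=1200.0):
    """Solvent volume that gives target_conc_um, starting from the volume
    the information sheet lists for sheet_conc_um (same amount of oligo)."""
    return sheet_volume_ul * sheet_conc_um / target_conc_um


def diluted_conc_um(stock_conc_um, stock_volume_ul, diluent_volume_ul):
    """Concentration after mixing a stock volume with a diluent volume."""
    return stock_conc_um * stock_volume_ul / (stock_volume_ul + diluent_volume_ul)


# Hypothetical example: the sheet recommends 240 µl for a 100 µM solution.
sheet_volume_ul = 240.0
stock_volume_ul = reconstitution_volume_ul(sheet_volume_ul)  # 1/12 of the sheet volume
working_conc_um = diluted_conc_um(1200.0, 1.0, 99.0)         # 1 µl stock + 99 µl diluent

print(f"Reconstitute in {stock_volume_ul:.1f} µl to obtain 1200 µM")
print(f"Working solution: {working_conc_um:.1f} µM")
```

With these example numbers the oligo would be reconstituted in 20 µl, and the 100X dilution gives the 12 µM working solution used in the master mixes.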
RNA quantification
Total RNA can be quantified using the “Qubit RNA HS Assay Kit” according to manufacturer instructions.
Assess DNA contamination in the RNA extraction
DNA contamination can be measured using the Qubit dsDNA HS Reagent.
Removal of DNA contamination from total RNA
Use the DNA-free DNA Removal Kit according to manufacturer instructions to remove DNA from the RNA samples.
Assess the profile of the extracted RNA
Total RNA profile is determined using the Agilent RNA Screentape following manufacturer instructions except that the samples are not heated at 72 °C.
Spike-In RNA
ERCC (ERCC RNA Spike-In Mix 1) is added during the cDNA synthesis step. Aim for 5 % of the total sequenced reads to be assigned to ERCCs, assuming that the poly(A) fraction of total RNA is 5 %. The amount of spiked RNA (mass(spiked RNA)) to be added to the reaction mix can be calculated as follows:
mass(spiked RNA) = (fraction(spiked reads) × fraction(target RNA) × mass(RNA input)) / Total_RNA_extracted
where:
mass(spiked RNA): mass (ng) of spike-in RNA (SIRVs or ERCC) to be added to the sample.
fraction(spiked reads): desired fraction of sequenced spike-in RNA reads relative to the total amount of sequenced reads.
fraction(target RNA): fraction of the total RNA used in the sample that is going to be synthesized into cDNA molecules.
mass(RNA input): mass (ng) of RNA input per sample.
Then the volume (µl) of spike-in RNA to be used is calculated as follows:
volume(spike-in RNA) = mass(spiked RNA) / concentration(spike-in RNA)
where:
concentration(spike-in RNA): concentration (ng/µl) of the spike-in RNA solution.
volume(spike-in RNA): volume (µl) of the spike-in RNA solution to be added to the sample.
The value of mass(RNA input) is 300 ng (300 ng of total RNA is used in the cDNA synthesis reactions).
For the ERCC RNA Spike-In Mix 1, mass(spiked RNA) = 0.45 ng.
The concentration of the stock solution is:
· The “ERCC RNA Spike-In Mix 1” tube contains 10 µl of ERCC RNAs at a concentration of 103.515 fmol/µl or 30.3 ng/µl.
Prepare the appropriate dilution of each Spike-In Mix needed. In the diluted solution, the “mass spiked RNA” for either the “ERCC RNA Spike-In Mix 1” or the “Spike-in RNA Variant (SIRVs) Control set 3 kit” should correspond, if possible, to 0.1 µl of the diluted solution. This requires the following dilution:
For the “ERCC RNA Spike-In Mix 1”, dilute the stock solution 6.72-fold: add 1 µl of the “ERCC RNA Spike-In Mix 1” stock solution to 5.72 µl of “THE RNA Storage Solution” (new concentration = 4.5 ng/µl).
Then take volume(spike-in RNA) = (0.45 ng)/(4.5 ng/µl) = 0.1 µl of the diluted solution.
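The dilution and pipetting volume for the ERCC mix can be checked with a short Python sketch. The stock concentration (30.3 ng/µl), the spike-in mass (0.45 ng) and the 1 µl + 5.72 µl dilution are taken from the text above; the function names are illustrative only.

```python
STOCK_CONC_NG_PER_UL = 30.3  # ERCC RNA Spike-In Mix 1 stock, from the text
MASS_SPIKED_NG = 0.45        # mass of spike-in RNA per reaction, from the text


def diluted_concentration(stock_conc, stock_volume_ul, diluent_volume_ul):
    """Concentration (ng/µl) after adding diluent to a volume of stock."""
    return stock_conc * stock_volume_ul / (stock_volume_ul + diluent_volume_ul)


def spike_in_volume_ul(mass_ng, conc_ng_per_ul):
    """Volume (µl) of spike-in solution that contains mass_ng of RNA."""
    return mass_ng / conc_ng_per_ul


# 1 µl of stock in 5.72 µl of storage solution (6.72-fold dilution).
diluted_conc = diluted_concentration(STOCK_CONC_NG_PER_UL, 1.0, 5.72)
volume_ul = spike_in_volume_ul(MASS_SPIKED_NG, diluted_conc)

print(f"Diluted ERCC concentration: {diluted_conc:.2f} ng/µl")
print(f"Volume to add per reaction: {volume_ul:.3f} µl")
```

The computed values round to the 4.5 ng/µl and 0.1 µl quoted above.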
cDNA Library generation and sequencing on MinION
Generally, follow the ONT “1D Strand switching cDNA by ligation (SQK-LSK108)” protocol, but with a custom cDNA synthesis protocol (described below) and with the end-repair and d(A)-tailing steps performed separately. An overview of the protocol is as follows:
1. cDNA synthesis and amplification
2. End-repair of cDNA molecules
3. dA-tail of cDNA molecules
4. Adapter ligation
5. Sequencing
6. Base-calling
cDNA synthesis
Our cDNA synthesis protocol is a customized version of the Smart-seq protocol [1]. The protocol is based on the terminal deoxynucleotidyl transferase activity of the wild-type MMLV (Moloney murine leukemia virus) reverse transcriptase [2].
Preparation of Master Mixes
1. Thaw and vortex all reagents and keep master mixes on ice until use.
2. Label three 1.5 ml Eppendorf tubes: “pre-RT”, “RT” and “PCR”.
3. Always use fresh TSO primer as it is prone to degradation.
4. Prepare the “pre-RT mix” according to Table 1 below.
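Table 1: pre-RT mix
     Component                                               Volume (µl/sample, total RNA input)
1    ERCC RNA Spike-In Mix 1 (diluted solution)              x
2    RNase inhibitor (stock: 40 U/µl; 125 µl = 5000 U)       0.05
3    Poly-T primer (stock: 12 µM)                            0.7
4    Superscript IV first-strand buffer (5×)                 0.4
5    Nuclease-free water                                     0.19
6    dNTP Mix (stock: 10 mM each)                            0.56
     Total                                                   2
x is the volume of diluted spike-in solution calculated in the “Spike-In RNA” section (0.1 µl), which brings the total to 2 µl.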
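When several samples are processed together, the per-sample volumes of the pre-RT mix can be scaled into a master mix. The following Python sketch assumes an ERCC volume of 0.1 µl (from the Spike-In RNA section) and a 10 % pipetting overage; the overage, the sample count and the function name are assumptions made for this example, not part of the protocol.

```python
# Per-sample volumes (µl) of the pre-RT mix; ERCC volume assumed to be 0.1 µl.
PRE_RT_MIX_UL = {
    "ERCC RNA Spike-In Mix 1 (diluted)": 0.1,
    "RNase inhibitor (40 U/µl)": 0.05,
    "Poly-T primer (12 µM)": 0.7,
    "Superscript IV first-strand buffer (5x)": 0.4,
    "Nuclease-free water": 0.19,
    "dNTP Mix (10 mM each)": 0.56,
}


def scale_master_mix(per_sample_ul, n_samples, overage=0.10):
    """Scale per-sample volumes to n_samples plus a fractional overage."""
    factor = n_samples * (1.0 + overage)
    return {name: round(volume * factor, 3) for name, volume in per_sample_ul.items()}


# Hypothetical run: 7 samples plus 1 negative control.
scaled = scale_master_mix(PRE_RT_MIX_UL, n_samples=8)
for name, volume in scaled.items():
    print(f"{name}: {volume} µl")
print(f"Total: {sum(scaled.values()):.2f} µl")
```

The same function can be applied to the RT and PCR mixes by swapping in their per-sample volumes.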
5. Pipette 2 µl of pre-RT mix into a PCR tube and add 1 µl of sample (300 ng of total RNA). Include a negative control (1 µl of water/RNA buffer).
6. Incubate the samples in a thermocycler set according to Table 2 below.
Table 2: pre-RT incubation
Temperature   Time     Purpose
72°C          3 min    Unfolding of RNA secondary structures, Poly-T primer binding
4°C           10 min   Poly-T primer binds
25°C          1 min    Poly-T primer binds more specifically
4°C           Hold
7. Prepare the “RT mix” according to Table 3 below.
Table 3: RT mix
     Component                                                  Volume (µl/sample)
1    Nuclease-free H2O                                          0.85
2    Superscript IV first-strand buffer (5×)                    0.8
3    DTT (stock: 100 mM)                                        0.175
4    TSO (stock: 12 µM)                                         0.7
5    RNase inhibitor (stock: 40 U/µl)                           0.175
6    SuperScript IV reverse transcriptase (stock: 200 U/µl)     0.35
7    Betaine (stock: 5 M)                                       0.7
8    MgCl2 (stock: 100 mM)                                      0.25
     Total                                                      4
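The final concentration of each RT-mix additive in the reaction follows from C_final = C_stock × V_added / V_total. The Python sketch below assumes a total reaction volume of 7 µl (1 µl RNA + 2 µl pre-RT mix + 4 µl RT mix) and counts only what the RT mix itself contributes; anything supplied by the pre-RT mix or the first-strand buffer is not included. The names are illustrative.

```python
REACTION_VOLUME_UL = 7.0  # assumed: 1 µl RNA + 2 µl pre-RT mix + 4 µl RT mix

# name: (stock concentration, unit, volume added in µl per sample)
RT_MIX = {
    "DTT": (100.0, "mM", 0.175),
    "TSO": (12.0, "µM", 0.7),
    "RNase inhibitor": (40.0, "U/µl", 0.175),
    "SuperScript IV": (200.0, "U/µl", 0.35),
    "Betaine": (5.0, "M", 0.7),
    "MgCl2": (100.0, "mM", 0.25),
}


def final_concentration(stock_conc, volume_ul, total_ul=REACTION_VOLUME_UL):
    """Dilution rule: C_final = C_stock * V_added / V_total."""
    return stock_conc * volume_ul / total_ul


final = {name: final_concentration(conc, vol) for name, (conc, _unit, vol) in RT_MIX.items()}
for name, (_conc, unit, _vol) in RT_MIX.items():
    print(f"{name}: {final[name]:.3g} {unit}")
```

Under these assumptions the RT mix contributes, for example, 1.2 µM TSO and 0.5 M betaine to the 7 µl reaction.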
8. Following pre-RT incubation, add 4 µl of RT mix to each sample, mix and briefly spin down.
9. Incubate the samples in a thermocycler set according to Table 4 below.
Table 4: SSIV RT protocol
Temperature   Time     Cycles   Purpose
50°C          10 min   1        RT and template switching
55°C          30 sec   10       Unfolding of RNA secondary structures
50°C          30 sec            Completion/continuation of RT
60°C          30 sec   5        Unfolding of RNA secondary structures
55°C          30 sec            Completion/continuation of RT
50°C          30 sec   1        Finish template switching
65°C          30 sec   5        Unfolding of RNA secondary structures
60°C          30 sec            Completion/continuation of RT
50°C          30 sec   1        Finish template switching
70°C          30 sec   5        Unfolding of RNA secondary structures
65°C          30 sec            Completion/continuation of RT
50°C          30 sec   1        Finish template switching
75°C          30 sec   5        Unfolding of RNA secondary structures
70°C          30 sec            Completion/continuation of RT
50°C          1 min    1        Final finish template switching
80°C          10 min   1        Enzyme inactivation
4°C           Hold     1
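To plan bench time, the RT program in Table 4 can be written as a small data structure and its programmed duration summed. The Python sketch below assumes that each cycle count applies to the row carrying it together with the unnumbered row directly beneath it (a two-step cycle); ramping time and the final 4°C hold are ignored. The variable and function names are illustrative.

```python
# Each block is (number_of_cycles, [(temperature_C, seconds), ...]).
# Assumption: a row without a cycle count belongs to the block of the row above it.
SSIV_RT_PROGRAM = [
    (1,  [(50, 600)]),
    (10, [(55, 30), (50, 30)]),
    (5,  [(60, 30), (55, 30)]),
    (1,  [(50, 30)]),
    (5,  [(65, 30), (60, 30)]),
    (1,  [(50, 30)]),
    (5,  [(70, 30), (65, 30)]),
    (1,  [(50, 30)]),
    (5,  [(75, 30), (70, 30)]),
    (1,  [(50, 60)]),
    (1,  [(80, 600)]),
]


def program_seconds(program):
    """Sum of programmed step times over all cycles (no ramping, no final hold)."""
    return sum(cycles * sum(seconds for _temp, seconds in steps)
               for cycles, steps in program)


total_s = program_seconds(SSIV_RT_PROGRAM)
print(f"Programmed time: {total_s} s ({total_s / 60:.1f} min)")
```

Under this reading the program amounts to 3150 s (52.5 min) of programmed step time; the actual run takes longer because of temperature ramping.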
10. Prepare the PCR master mix according to Table 5 below.
Table 5: PCR master mix
     Component                                                  Volume (µl per 7 µl of RT reaction)
1    PCR-Grade Water                                            47.6
2    10X Advantage 2 PCR Buffer (Advantage 2 PCR Kit)           7
3    50X dNTP Mix (Advantage 2 PCR Kit)                         2.8
4    PCR primer (stock: 12 µM)                                  2.8
5    50X Advantage 2 Polymerase Mix (Advantage 2 PCR Kit)       2.8
     Total                                                      63
11. Following RT incubation, add 63 µl of PCR mix to each sample, mix and briefly spin down.
12. Incubate the samples in a thermocycler set according to Table 6 below.
Table 6: PCR protocol
Temperature   Time     Cycles
95°C          1 min    1
95°C          20 sec   5
58°C          4 min
68°C          6 min
95°C          20 sec   11 or 12 (aim for ~1-2 µg of cDNA per 70 µl of PCR amplification reaction)
64°C          30 sec
68°C          6 min
72°C          10 min   1
4°C           Hold     1
13. The amplified product is subsequently cleaned with Agencourt AMPure XP beads as described below.
Agencourt AMPure XP cleanup of cDNA amplification products
a. Allow AMPure XP beads to equilibrate to room temperature for at least 30 minutes.
b. Vortex the beads until evenly mixed, then add 0.9X sample volume of Agencourt AMPure XP beads to the sample in the same tube as used for PCR.
c. Pipette the entire volume up and down to mix thoroughly. Place the sample tubes on a roller mixer for 5 - 8 minutes to let the DNA bind to the beads. Briefly spin the samples to collect the liquid from the side of the tube.
d. Place the sample tubes on the magnetic separation device for ~2 minutes until the liquid appears completely clear, and there are no beads left in the supernatant.
e. While the samples are on the magnetic separation device, pipette out the supernatants. Keep the samples on the magnetic separation device. Add 200 μl of freshly made 80% ethanol to each sample without disturbing the beads. Wait for 30 seconds and carefully pipette out the supernatant containing contaminants.
f. DNA will remain bound to the beads during the washing process. Repeat the ethanol wash (step e) once more. Briefly spin the samples to collect the liquid from the side of the tube.
g. Place the samples on the magnetic device for 30 seconds, then remove all the remaining ethanol with a pipette.
h. Place the samples at room temperature until the pellet appears dry (~ 5 minutes). You may see a tiny crack in the pellet when it is dry.
i. Once the beads are dry, add 51 μl of TE buffer to cover the bead pellet.
j. Remove the samples from the magnetic separation device and mix thoroughly to resuspend the beads. Incubate the sample with rotation at room temperature for 5 – 8 minutes.
k. Put the tubes on the magnet and after ~2 minutes recover the supernatant which should contain the cleaned amplified cDNA. Determine the quantity of the cDNA and profile using Qubit HS DNA Assay Kit and Agilent D5000 Tapestation, respectively, following manufacturer instructions.
End-repair of DNA
End repair of 1 µg of amplified cDNA is carried out using NEBNext End Repair Module (New England Biolabs, E6050S) following manufacturer instructions. This is followed by 0.9X AMPure XP beads cleanup (described above).
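The two small calculations that recur around the cleanup steps (the bead volume for a 0.9X cleanup and the volume of eluate that holds 1 µg of cDNA) can be sketched in Python as follows. The Qubit reading of 30 ng/µl is a hypothetical value used only for illustration, and the function names are not from any kit documentation.

```python
def bead_volume_ul(sample_volume_ul, ratio=0.9):
    """Volume of AMPure XP beads for a cleanup at the given bead:sample ratio."""
    return ratio * sample_volume_ul


def input_volume_ul(target_mass_ng, conc_ng_per_ul):
    """Volume of cDNA solution that contains target_mass_ng."""
    return target_mass_ng / conc_ng_per_ul


# 70 µl PCR reaction (7 µl RT reaction + 63 µl PCR mix), 0.9X cleanup.
beads_ul = bead_volume_ul(70.0)

# Hypothetical Qubit reading of the eluted cDNA: 30 ng/µl.
end_repair_input_ul = input_volume_ul(1000.0, 30.0)

print(f"Beads for 70 µl PCR reaction: {beads_ul:.1f} µl")
print(f"Volume containing 1 µg of cDNA: {end_repair_input_ul:.1f} µl")
```

If the computed input volume exceeds what the downstream reaction can accommodate, the input mass has to be reduced accordingly.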
dA-tailing reaction
d(A)-tailing of the recovered end-repaired cDNA is carried out using NEBNext dA-Tailing Module (New England Biolabs, E6053S) following manufacturer instructions. This is followed by 0.9X AMPure XP beads cleanup (described above).
Adapter ligation
Ligation of ONT sequencing adapters onto the recovered d(A)-tailed cDNA (up to 1 µg) is carried out following the ONT SQK-LSK108 protocol. However, the incubation time can be increased from 10 minutes to 1 - 4 hours at room temperature.
ONT MinION sequencing kit
The ONT SQK-LSK108 protocol is followed for the sequencing part.
Basecalling
Basecalling can be done off-line using Albacore from ONT.
Data analysis example commandline arguments
Basecalling
Albacore (ONT, version 2.0.2)
read_fast5_basecaller.py -r --flowcell SQK-LSK108 --kit SQK-LSK108 --input $input_dir --save_path $save_path --worker_threads 23 -o fastq
MinIONQC [3] (version 1.0)
Rscript ~/MinionQC.R -p 23 -i $(pwd)/files -o $(pwd)/results
Pauvre (version 0.1.2, https://github.com/conchoecia/pauvre)
pauvre marginplot --no-transparent --fastq ../Bo_E_1H_C010_10_pass.fastq > pauvre.out 2>&1
Porechop (version 0.2.3, https://github.com/rrwick/Porechop)
~/porechop --format fasta -t 47 -i $read5.fasta -o $read5.choped.fasta > porechop.stdout 2>&1
Cutadapt [4] poly(A) trimming from read ends (version 1.15)
~/.local/bin/cutadapt --info-file=trim_info -f fasta -a "A{100}" -o $read2.cutadapt.fasta $read2.fasta
GMAP [5] (version 2018-03-25)
GMAP for alignment QC
~/gmap -t 23 -D $dirc -f samse -d $ref $read1 > $outsam.sam
GMAP for transcriptome assembly
~/gmap -t 23 -D $dirc --cross-species --max-intronlength-ends=10000 -n 1 -z sense_force -f samse -d $ref $read1 > $outsam1.sam 2> gmap.stdout
Minimap2 [6] (version 2.9 (r720))
~/minimap2 -ax splice -t 23 $ref $reads1 > $outsam1.sam
Samtools [7] (version 1.3.2)
AlignQC [8] (version 1.2)
~/alignqc analyze $outsam1.sort.bam --specific_tempdir $dirc/tmp1 -r $ref -a $annotation -o alignqc.xhtml --output_folder $dirc/alignQC.ouput_b4_correction > alignqc.stdout
Canu [9] (version 1.7)
canu useGrid=false -correct gnuplotImageFormat=png corOutCoverage=10000 corMhapSensitivity=high corMinCoverage=0 correctedErrorRate=0.16 overlapper=minimap ovsMethod=sequential minReadLength=200 minOverlapLength=100 genomeSize=1500000000 -p Bo_E_all_pass_edited -d Bo_E_all_pass_edited -nanopore-raw Bo_E_all_pass_edited.fasta
LoRDEC [10] (v0.8, using GATB v1.4.1)
~/lordec-correct -2 $illumina_reads -T 47 -p -k 19 -s 3 -i $nanopore.fasta -o "$nanopore"_lordec_corrected.fasta
GFOLD [11] (v1.1.4)
gfold diff -norm NO -s1 Bo.E.2H -s2 Bo.E.1H -suf .abs_cnt3 -o Bo.E.2HvsBo.E.1H.abs.diff > Bo.E.2HvsBo.E.1H.abs.diff.stdout
cDNA_Cupcake (version 5.3, https://github.com/Magdoll/cDNA_Cupcake/wiki)
~/collapse_isoforms_by_sam.py --input $read1 -s $outsam.sorted.sam --dun-merge-5-shorter -o $pref
~/filter_by_count.py $pref.collapsed --min_count=2 >filter_by_count.stdout
~/filter_away_subset.py $pref.collapsed >filter_away_subset.stdout
~/filter_away_subset.py $pref.collapsed.min_fl_2
cDNA_Cupcake for assembly evaluation using 5-hour timepoint
~/collapse_isoforms_by_sam.py -c 0.95 -i 0.95 --input $read1 -s $sortedsam --dun-merge-5-shorter -o $pref
TAMA (version tc0.0, https://github.com/GenomeRIK/tama)
~/tama_collapse.py -d merge_dup -s $sortedsam -f $ref -p $pref -x no_cap -c 95 -i 95
TAPIS [12] (version 1.2.1)
alignPacBio.py -p 22 -v -K 10000 -o tapis_output $indexesDir $indexName $reference $reads
run_tapis.py -p -t 30 -o run_tapis_output $annotation tapis_output/$bamfile
SQANTI [13] (version 1.2)
sqanti_qc.py -z -t 47 -fl $fl_abundance -c $sj_covIllumina -e $isoExpression -x $gmapindex -o $output -d qc_output $isoforms.fa $gtf $ref
sqanti_filter.py -d filter_output -i "$isoforms"_corrected.fasta "$output"_classification.txt