The full procedure is divided into four distinct protocols, concerning (A) the collection of the bee samples; (B) the processing and management of the samples; (C) the pathogen and parasite assays; and (D) the genetic identification of the bee species. Many different types of data are collected throughout the procedure, some for use in data analysis and others as intermediate stages in the conversion and normalization of the raw data into processed data suitable for statistical analyses. A metafile describing in detail each of these primary and derived parameters is given as a Supplementary Table at the end of the procedure.
A. SAMPLE COLLECTION PROTOCOL
The field sampling protocol below is based on honey bees (Apis mellifera) as a driver of pathogen distribution in wild bees, but honey bees can be replaced by any other biological driver, e.g. a particular species of bumblebee. The speed at which the bees are collected, recorded by time, can be used as a proxy for the absolute density of bees in the area. The order in which different bee species are collected (up to their maximum) can be used as a proxy for the relative density of the different bee species. Bee foraging behavior is affected by weather conditions43 and species of flower44-46, and furthermore differs between bee species44-49. Since bee pathogens can be transmitted through floral visitation networks11-15, and are also directly affected by temperature50, weather conditions and the floral character of the landscape are important metadata to collect.
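As a rough illustration of the two density proxies described above, the following Python sketch computes a collection rate from the recorded collection times and the share of each species in the collection order. The function names, the "HH:MM" time format and the choice of "share of captures" as the relative-density measure are assumptions made for this example; the protocol itself does not prescribe a particular formula.

```python
from collections import Counter
from datetime import datetime


def collection_rate(times):
    """Captures per minute between the first and the last recorded capture.

    `times` is a list of "HH:MM" strings from a single sampling day
    (hypothetical format; use whatever was recorded in the field).
    """
    stamps = sorted(datetime.strptime(t, "%H:%M") for t in times)
    minutes = (stamps[-1] - stamps[0]).total_seconds() / 60
    if minutes <= 0:
        raise ValueError("need at least two distinct collection times")
    # The first capture starts the clock, so count the captures after it.
    return (len(stamps) - 1) / minutes


def species_shares(species_in_order):
    """Fraction of captures belonging to each species.

    One simple way to turn the collection order into a relative-density
    proxy; it assumes bees were taken strictly in order of encounter.
    """
    counts = Counter(species_in_order)
    total = len(species_in_order)
    return {species: n / total for species, n in counts.items()}


rate = collection_rate(["10:00", "10:02", "10:06", "10:10"])
shares = species_shares(
    ["Bombus terrestris", "Andrena sp.", "Bombus terrestris", "Osmia bicornis"]
)
```

Because collection of each species stops at its maximum, shares computed this way are only comparable between sites if the same caps were applied everywhere.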
1. Sample collection:
a) Collect 30 individual honey bees (Apis mellifera) and 30 individual wild bees, in the order in which they are encountered
b) Collect 15 additional specimens of the commonest wild bee species
c) Collect all samples on the same day during high flight activity of bees (above 15°C and less than 80% cloud cover) in spring/summer (April-August in the Northern Hemisphere).
d) Collect all bees within the same 100 m x 100 m flower-rich area from flowers
e) If a particular bee species is at a high density on one patch of flowers, collect a maximum of five individuals from that patch then move to collect from other patches within your single 100 m x 100 m area
f) Place each insect individually into a tube, labelled in the order in which the bees were collected. Place the tube on ice.
g) Store tubes the same evening in a -80 °C freezer for downstream molecular analysis
2. Data collection:
For each bee, record the following:
a) Time of collection
b) Species of bee (field ID, to be confirmed afterwards by microscopy or barcoding)
c) Sex of the bee (can be determined afterwards)
d) The plant species whose flower the bee was on (can be determined afterwards)
e) Temperature in the open; temperature in the shade
f) Cloud cover
g) Longitude of the location; latitude of the location
h) Surrounding landscape type (can be determined afterwards)
i) General floral density and diversity within the 100 m x 100 m area
B. SAMPLE PROCESSING PROTOCOL
The sample processing is designed to serve the following two aims:
Aim 1: To determine the prevalence of known (honey)bee pathogens across bee species
Aim 2: To identify and characterize as yet unknown, novel pathogens in wild bees
The strategy for both aspects is to prepare a primary homogenate in a neutral buffer, extract nucleic acids from a small amount of extract for the “current-pathogen” analyses (aim 1) and retain the rest for “new-pathogen” prospecting analyses (aim 2), which may involve additional steps prior to nucleic acid extraction.
There is strong emphasis in the sample collection protocol on minimizing cross-contamination between insects, in order to avoid misclassifying the pathogen status of individual bees. Aside from avoiding sampling artefacts, the larger purpose is to distinguish between microbial agents that are infectious to the bee tissues, and are therefore a potential health threat; those that are part of the bee microbiome, both internal and external; and those that are passively associated with the bee but not infectious to bee tissues. We have therefore also included several simple processing and assay strategies that maximize our ability to distinguish between infectious and non-infectious agents. One way to do this is to separate the body parts of the bees:
Abdomen, where nearly all internal bee pathogens replicate, in the bee tissues, and shed their replicative propagules (spores/oocysts/virus particles etc.) in the gut lumen for voiding into the environment, with the faeces. The abdomen also contains passively acquired, non-infectious agents, both internally in the gut and externally on the exoskeleton.
Head, where many of the viral pathogens actively replicate, and occasionally shed particles into the salivary and hypopharyngeal glands, especially in honey bees.
Thorax, containing few internal (or external) microbes.
Wings & Legs, containing pollen baskets. Pathogens can be shared between bees through flower-visitor networks, especially on pollen.
Since the abdomen contains by far the highest concentrations of pathogens, there is little loss of detection sensitivity from excluding the head, thorax, legs and wings, which can be retained separately for other studies. For example, the DNA/RNA from legs, wings and thoraxes contains very little contaminating microbial nucleic acid and is therefore optimally suited for possible host bee genetic analyses. In order to allow for the widest possible range of additional, future analyses on the material, it is best to prepare the primary homogenate in a neutral, aqueous buffer and only extract RNA and DNA from an aliquot of primary homogenate, with the remainder stored at -80 °C. There is no loss in detection sensitivity of the pathogens (or host mRNAs) from such a neutral primary extract as long as the extract is either frozen or added to the nucleic acid extraction buffers within 5 minutes. There are also several options at the assay level to distinguish active and passive infections, vide infra. Such measures are not entirely foolproof, but greatly improve the chances of finding truly infecting pathogens.
The protocols described below are loosely based on the COLOSS BeeBook chapters “Standard methods for virus research in Apis mellifera”18, “Standard methods for research on Apis mellifera gut symbionts”37 and “Standard methods for molecular research in Apis mellifera”17.
3. Bee size estimation:
a) Remove from the freezer the number of bees needed for a single extraction run
b) Record the weight of each bee on a fine scale (in mg)
c) Using a fine caliper, record the inter-tegular distance (ITD) of each bee, i.e. the distance across
the thorax between the insertion points of the wings (to the nearest 0.1 mm)
4. Dissection:
a) Perform the dissections on a frozen (carbo-ice) dissection plate, if possible
b) Separate Head, Thorax, Wings/legs from the Abdomen using a sterile scalpel/razor blade and
place each in a separate marked container
c) The scalpel can be sterilised between bees by dipping in ethanol and burning off the ethanol
with a flame
d) Store the H, T, W and A sub-samples at -80 °C until further use
5. Buffer preparation:
a) Sterile TBS buffer: 50 mM TRIS.HCl pH7.4, 150 mM NaCl
b) Lysozyme Lysis buffer: 20 mM TRIS.HCl pH8.0, 2 mM EDTA, 1.2% Triton X100, 20 mg/mL
Lysozyme
c) TBS/RNA250/pJET buffer: On the day of use, add to TBS buffer (see above) RNA250 to a final
concentration of 10 ng/mL and pJET1.2 to a final concentration of 1 ng/mL, based on the
concentrations given in the product sheets, as passive RNA and DNA reference standards, for
posterior normalization of the data for methodological differences between individual samples in
homogenization and nucleic acid extraction efficiency
d) The amount of TBS/RNA250/pJET buffer needed for an individual bee abdomen is:
Bumble bee sized abdomen: 800 µL TBS/RNA250/pJET
Honey bee sized abdomen: 500 µL TBS/RNA250/pJET
Small sweat bee sized abdomen: 200 µL TBS/RNA250/pJET
e) Orchard and mason bees would be similar in size to honey bees; others more like sweat bees,
so use your judgement and record which actual volume you used for each bee
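The buffer arithmetic for a batch of abdomens can be sketched as follows. The per-abdomen volumes and the final concentrations (10 ng/mL RNA250, 1 ng/mL pJET1.2) come from the steps above; the stock concentrations and the 10% pipetting excess are placeholders assumed for this example, and the real stock concentrations must be taken from the product sheets.

```python
# Per-abdomen buffer volumes in µL, as listed in the buffer preparation step.
VOLUME_UL = {"bumble": 800, "honey": 500, "sweat": 200}

RNA250_FINAL_NG_PER_ML = 10  # final concentration stated in the protocol
PJET_FINAL_NG_PER_ML = 1     # final concentration stated in the protocol


def batch_buffer(counts, rna250_stock_ng_per_ul, pjet_stock_ng_per_ul, excess=0.10):
    """Buffer volume and spike-in amounts for one batch of abdomens.

    counts: mapping of size class ("bumble", "honey", "sweat") to number of bees.
    The stock concentrations (ng/µL) and the `excess` fraction are assumptions
    of this sketch, not values given in the protocol.
    """
    total_ul = sum(VOLUME_UL[size] * n for size, n in counts.items()) * (1 + excess)
    total_ml = total_ul / 1000
    rna250_ng = RNA250_FINAL_NG_PER_ML * total_ml
    pjet_ng = PJET_FINAL_NG_PER_ML * total_ml
    return {
        "tbs_ul": total_ul,
        "rna250_ng": rna250_ng,
        "pjet_ng": pjet_ng,
        "rna250_stock_ul": rna250_ng / rna250_stock_ng_per_ul,
        "pjet_stock_ul": pjet_ng / pjet_stock_ng_per_ul,
    }


# Example: 12 honey bee sized abdomens, hypothetical 50 ng/µL stocks, no excess.
plan = batch_buffer({"honey": 12}, rna250_stock_ng_per_ul=50,
                    pjet_stock_ng_per_ul=50, excess=0.0)
```

The stock volumes come out very small at these concentrations, so in practice an intermediate dilution of each stock is likely to be needed for accurate pipetting.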
6. Homogenization:
a) CRITICAL: work rapidly and at 4 °C until the nucleic acids are extracted and frozen
b) Homogenize each abdomen in the TBS/RNA250/pJET buffer:
If a bead-mill (e.g. the Qiagen TissueLyser II or equivalent) is available, add three 3 mm steel
beads and one 5 mm steel bead to the sample and buffer in an appropriate tube for bead-
milling, and shake for 2 minutes at 30 Hz followed by 2 minutes at 20 Hz
If no bead-mill is available, use disposable micro pestles to grind the abdomen by hand in the
TBS/RNA250/pJET buffer
c) Centrifuge briefly to pellet the exoskeleton and transfer the supernatant homogenate to a clean
2 mL screwcap storage tube and store immediately at -80°C
d) Repeat until enough bees have been processed for a single extraction run
7. RNA & DNA extraction:
a) The RNA extraction follows the Qiagen Plant RNeasy protocol
b) The DNA extraction follows the Qiagen Blood and Tissue DNA kit protocol for Gram-positive
bacteria, which includes Paenibacillus (AFB), Melissococcus (EFB), the lactic acid bacteria and
many other bee gut bacteria
c) The main feature of the DNA protocol is a long lysozyme/proteinase-K incubation prior to DNA
purification, to digest the bacterial cell walls and the spores, oocysts and propagules of
bacteria, microsporidia, and various other bee eukaryotic pathogens
d) RNA and DNA extractions can be performed in parallel, in batches of 12
e) The DNA samples will be incubating for about 1 hour while you purify the RNA samples, after
which you can purify the DNA samples
f) For the RNA extraction, prepare 12 microcentrifuge tubes with 350 µL RLT buffer (from the
Qiagen Plant RNeasy purification kit) to which 1% beta-mercaptoethanol has been added on
the day of use.
g) For the DNA extraction, prepare 12 microcentrifuge tubes with 180 µL Lysozyme Lysis buffer (see
above; do NOT use the buffer provided in the kit!)
h) Thaw enough primary homogenates for a single run of RNA and DNA extractions (usually 12) and
store on ice (4 °C)
i) CRITICAL: Work as quickly as possible while the homogenates are thawed, and return them
immediately to frozen storage (< -20 °C) when no longer needed.
j) Briefly mix each homogenate and add 100 µL to a tube for RNA extraction (RLT buffer) and 100
µL to a tube for DNA extraction (Lysozyme Lysis buffer)
k) CRITICAL: Freeze the remaining homogenate immediately after adding the two 100 µL aliquots to
the RNA and DNA extraction buffers
l) Proceed to the next homogenate, until 100 µL of every homogenate in the extraction run has been
aliquoted to both an RNA and a DNA extraction tube
8. DNA purification:
a) Incubate the DNA extractions for 30 minutes at 37°C
b) Add 25 μL Proteinase-K solution (supplied in the Blood and Tissue kit) and mix
c) Add 200 μL buffer AL (excluding ethanol; supplied in the Blood and Tissue kit) and mix
d) Incubate for 30 minutes at 56°C
e) Add 200 μL ethanol and mix thoroughly
f) Transfer all onto the DNeasy Mini spin column nested in a 2 ml collection tube
g) Follow the DNeasy protocol, either manually or with a robot
h) Elute the DNA in either 100 μL (bumblebee), 70 μL (honeybee) or 50 μL (sweat bee) AE buffer
(supplied in the Blood and Tissue kit)
i) Determine the approximate yield and purity of the DNA, using a NanoDrop or similar instrument
j) Adjust the DNA concentrations of all samples to 20 ng/µL by adding AE buffer (supplied in the
Blood and Tissue kit)
k) Store the DNA at -80°C until further use
9. RNA purification:
a) Meanwhile, for the RNA extractions, follow the Plant RNeasy protocol, either manually or with a
robot, while the DNA extractions are incubating
b) Elute the RNA in either 50 or 30 μL ultra-pure sterile water, depending on the size of the bee
c) Determine the approximate yield and purity of the RNA using a NanoDrop or similar instrument
d) Adjust the RNA concentrations of all samples to 100 ng/µL by adding sterile water (supplied in the
RNA extraction kit)
e) Store the RNA at -80 °C until further use
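The concentration adjustments in the DNA and RNA purification steps (to 20 ng/µL and 100 ng/µL respectively) are simple dilutions. A minimal sketch of the calculation is given below; the function name and the decision to return None for samples already below the target are choices made for this example, since the protocol does not say how to treat such samples.

```python
def diluent_to_add(conc_ng_per_ul, volume_ul, target_ng_per_ul):
    """Volume of AE buffer or sterile water (µL) needed to reach the target.

    conc_ng_per_ul: measured concentration (e.g. from a NanoDrop reading).
    volume_ul: volume remaining in the tube after the measurement.
    Returns None when the sample is already below the target, because
    dilution cannot raise a concentration.
    """
    if conc_ng_per_ul < target_ng_per_ul:
        return None
    # C1 * V1 = C2 * (V1 + V_add)  =>  V_add = V1 * (C1 / C2 - 1)
    return volume_ul * (conc_ng_per_ul / target_ng_per_ul - 1)


dna_add = diluent_to_add(60, 70, target_ng_per_ul=20)    # DNA target: 20 ng/µL
rna_add = diluent_to_add(250, 50, target_ng_per_ul=100)  # RNA target: 100 ng/µL
```

Samples for which the function returns None would need to be flagged and handled separately, for example by recording their actual concentration for later correction.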
10. cDNA synthesis:
a) Prepare first strand cDNA with random hexamer primers from 1 µg RNA (10 µL) in a 20 µL volume
using a standard first strand cDNA kit containing the M-MLV reverse transcriptase and an
RNAse inhibitor, following the manufacturer’s recommendations
b) After the final incubation, dilute the cDNA 10-fold in sterile water and store at -80°C until further
use.
C. PATHOGEN ASSAYING PROTOCOL
The presence and abundance of a range of bee microorganisms, whether detrimental or beneficial, is determined by quantitative PCR of the cDNA and DNA templates using broad-range primers (i.e. those encompassing several strains or species within a complex). The reason for this is in part efficiency (fewer assays to run) and in part to avoid false negative results due to assay insufficiencies (Type-II errors). Type-I errors (false positive results) are mostly due to trace contamination and are easily avoided by restricting the number of amplification cycles to 35, rather than 40 cycles17,42,51. For accurate quantification we recommend buying synthetic external quantification standards (e.g. ThermoFisher), based on the product sequences (Supplementary Information 3), rather than home-made standards from either purified plasmid clones of the fragment, or purified PCR product, primarily to guarantee uniformity of standards and quantification between different labs, as well as better absolute quantification with synthetically produced, and accurately quantified, standards. A key element of these quantitative assays is the use of the passive external reference nucleic acids, RNA250 and pJET1.2. These were added in exact known quantities at the start of the homogenization and nucleic acid extraction protocol and the amounts of these left in each cDNA and DNA template can be measured exactly through qPCR, similar to how the pathogen amounts are measured. The ratio of the amount of RNA250 (or pJET1.2) measured by qPCR in a standard template volume to the known amount originally added prior to extraction is therefore a simple, one-step conversion factor for all the individual, sample-specific methodological errors and losses incurred from homogenization through qPCR, for accurate absolute quantification of the amounts of each target in the original bee.
It will reduce, if not eliminate, random methodological noise from the dataset and improve the chances of detecting true biological differences between the samples. Another key feature is that the qPCR assays are all designed to work with the same thermocycling profile, to enable different assays to be run in the same thermocycling run, and produce similar medium-sized PCR products, to facilitate distinguishing true product from illegitimate secondary products through either Melting Curve analysis or agarose gel electrophoresis.
11. qPCR amplification:
a) Add 2 µL diluted DNA or cDNA template to 18 µL qPCR reaction mixture (e.g. BioRad EvaGreen)
containing 0.2 µM each of forward and reverse primers for the assay being run (Supplementary
Table 1)
b) Include in each run a 7-step ten-fold dilution series of a (preferably synthetic) positive control of
known concentration, as well as a template-free negative control (e.g. ddH2O)
c) Amplify the target with the manufacturer’s recommended thermocycling profile for a ~500 bp
fragment, a fluorescent dye-based detection chemistry (e.g. SYBR-green, EvaGreen), for
primers with a melting temperature of 58 °C, and for no more than 35 cycles, e.g. 95 °C:30sec +
35x[95 °C:15sec – 58 °C:20sec – 72 °C:30sec - read] + 72 °C:60sec
d) Follow the amplification stage with a Melting Curve (MC) analysis profile for confirming product
identity, e.g. reading the fluorescence at 0.5 °C intervals from 55 °C to 95 °C
e) Confirm the absence of amplification in the template-free negative control
f) Discard all positive results whose MC profile and peak do not match to within 1.0 °C of the MC
profiles and peaks for the positive controls
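The melting curve check in the last step of the qPCR amplification can be automated once the peak temperatures have been exported from the qPCR software. The sketch below keeps only those amplifications whose melt peak lies within 1.0 °C of the mean peak of the positive controls. Comparing against the mean of the controls, and the input layout (a mapping of sample name to Cq and peak temperature), are assumptions of this example rather than part of the protocol.

```python
from statistics import mean


def filter_positives(results, control_peaks_c, tolerance_c=1.0):
    """Keep amplifications whose melt peak matches the positive controls.

    results: mapping of sample name to (Cq, melt peak in °C); use None for
             wells without amplification or without a detectable peak.
    control_peaks_c: melt peak temperatures of the positive control wells.
    Returns a mapping of accepted sample names to their Cq values.
    """
    reference = mean(control_peaks_c)
    accepted = {}
    for sample, (cq, peak) in results.items():
        if cq is None or peak is None:
            continue  # no amplification: scored as negative
        if abs(peak - reference) <= tolerance_c:
            accepted[sample] = cq
    return accepted


wells = {
    "bee_01": (24.1, 82.5),   # peak matches the controls
    "bee_02": (31.0, 78.0),   # off-target product, e.g. primer dimer
    "bee_03": (None, None),   # no amplification
}
accepted = filter_positives(wells, control_peaks_c=[82.0, 82.5, 82.5])
```

The template-free negative control can be checked in the same way: it should never appear among the accepted wells.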
12. Absolute quantification and passive reference normalization:
a) Obtain for each sample-assay combination the SQ (Starting Quantity) value, estimated by plotting
the Cq values against the known amounts (preferably as ‘copy number’) of the synthetic
external reference standards
b) Calculate the total number of copies of RNA250 and pJET1.2 molecules added to each abdomen
prior to extraction, from the volume of TBS-RNA250-pJET buffer added and the molecular
weights of RNA250 (641991 g/mol) and pJET1.2 (1837503 g/mol)
c) For each sample, calculate the RNA250out/RNA250in ratio by dividing the sample’s individual SQ
value for its RNA250 amplification (RNA250out) by the calculated amount of RNA250 added to
the abdomen prior to extraction (RNA250in)
d) Divide each sample’s SQ values for the different RNA pathogens (cDNA amplifications) by the
sample’s RNA250out/RNA250in ratio, to obtain the estimated total amount of pathogen RNA in
the original abdomen
e) Follow the same procedure for the pJETout/pJETin ratios and the DNA pathogen amplifications
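The quantification and normalization steps above can be sketched in Python as follows. The molecular weights are those given in step 12b and the reference concentrations are those from the buffer preparation step; the function names and the plain least-squares fit of Cq against log10(copies) are choices made for this example, and most qPCR software reports SQ values directly.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole
MW_RNA250 = 641991        # g/mol, from step 12b
MW_PJET = 1837503         # g/mol, from step 12b


def fit_standard_curve(copies, cqs):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(cqs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cqs))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def starting_quantity(cq, slope, intercept):
    """SQ (copies in the reaction) read off the standard curve."""
    return 10 ** ((cq - intercept) / slope)


def copies_added(buffer_ul, conc_ng_per_ml, mol_weight):
    """Copies of a reference standard added to one abdomen (step 12b)."""
    grams = conc_ng_per_ml * 1e-9 * (buffer_ul / 1000)
    return grams / mol_weight * AVOGADRO


def normalise(target_sq, reference_sq_out, reference_copies_in):
    """Estimated copies of the target in the original abdomen (steps 12c-d)."""
    recovery = reference_sq_out / reference_copies_in  # the out/in ratio
    return target_sq / recovery


# Example for a honey bee sized abdomen (500 µL buffer, 10 ng/mL RNA250).
rna250_in = copies_added(500, 10, MW_RNA250)
# Hypothetical SQ values for one sample: 2.0e3 for a virus, 5.0e6 for RNA250.
virus_total = normalise(2.0e3, 5.0e6, rna250_in)
```

For DNA targets the same functions apply with MW_PJET and the pJET1.2 concentration of 1 ng/mL. The slope of the fitted curve also gives a quick check on assay efficiency, since a slope near -3.32 corresponds to a doubling of product per cycle.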
D. BARCODE SPECIES IDENTIFICATION PROTOCOL
Some bees may be difficult to identify in the field, either because they are cryptic (i.e. have morphological features that overlap with those of other bee species), dirty or damaged. Occasionally a species may lack a morphological identification key. In these cases, it is usually possible to identify the specimen through DNA barcode analysis, i.e. by sequencing one of several well-established barcoding genes in the bee nuclear or mitochondrial genomes. Most animals, particularly insects, are barcoded using a ca. 650 bp fragment of the mitochondrial Cytochrome Oxidase I (cox1) gene, which is sufficiently variable to uniquely distinguish even closely related species25,26. A critical requirement is that the DNA template is sufficiently pure to only amplify the cox1 region of the bee, and not of any contaminating DNA from eukaryotic parasites or plants. This effectively rules out using the purified DNA from bee abdomens for barcoding analysis, since this DNA consists of the genomes of all organisms residing in the bee gut, as well as some of the bee. The simplest approach is to amplify the cox1 fragment directly from a snippet of leg, wing or thorax tissue27. Each bee cell contains many mitochondria, each of which contains numerous copies of the mitochondrial genome52,53, so that even the tiniest trace amount of tissue contains thousands of copies of the mitochondrial genome, more than enough for PCR amplification. In fact, there is greater danger that too much template is added to the PCR reaction, which can inhibit the PCR and the production of sufficient PCR product for sequencing51.
13. Cox1 barcode amplification:
a) Aliquot 10 μL PCR master mix containing 0.2 mM dNTP and 0.4 µM each of primers LCO-1490 (GGTCAACAAATCATAAAGATATTGG) and HCO-2198 (TAAACTTCAGGGTGACCAAAAAATCA)
b) Add with sterile forceps a snippet of a bee wing or a leg tarsus, or poke a disposable 10 μL pipette tip in the bee thorax and stir in the buffer
c) Enough DNA will transfer for amplification
d) Amplify the cox1 gene with the manufacturer’s recommended thermocycling profile for a ~650 bp fragment, for primers with a melting temperature of 51 °C, and for no more than 35 cycles, e.g. 95 °C:30sec + 35x[95 °C:15sec – 51 °C:15sec – 72 °C:30sec] + 72 °C:60sec
e) Run 1 μL PCR product on an agarose gel to check the product size and absence of secondary products
f) Submit the PCR product for sequencing by any of a number of commercial sequencing companies
g) Most bees can be barcoded with these primers, but some species are difficult to amplify at the cox1 barcode with these primers54
h) If the cox1 barcode of a particular bee does not amplify with these primers54, then try an alternative forward primer, LCO-1790 (GCTTTCCCACGAATAAAATAATA), which can be used26 together with the HCO-2198 primer (TAAACTTCAGGGTGACCAAAAAATCA)
i) Amplify the shorter cox1 product using the manufacturer’s recommended PCR protocol for a ~400 bp fragment, for primers with a melting temperature of 54 °C, and for no more than 35 cycles, e.g. 95 °C:30sec + 35x[95 °C:15sec – 54 °C:15sec – 72 °C:30sec] + 72 °C:60sec
14. Cox1 barcode sequencing & analysis:
a) OPTIONAL: To ascertain whether the amplification has been successful or not, run 1 μL of each PCR reaction (or a representative selection of PCR products) on a 0.8% agarose gel in a suitable TRIS/EDTA buffer system, including a nucleic acid binding dye, as recommended by a standard molecular laboratory manual55, and visualize the bands alongside a molecular weight marker.
b) If the cox1 barcode amplifications have been successful, submit the reactions to a commercial sequencing service together with the corresponding LCO and HCO amplification primers for sample purification and Sanger sequencing of both strands of each DNA template
c) Check the quality of the sequence produced (usually included in the sequencing results)
d) Combine the forward and reverse sequences into a single consensus sequence for each sample, using any of a number of public or proprietary sequence analysis programmes (e.g. SeqScanner, 4peaks, Lasergene, Geneious)
e) Submit your consensus sequence to the BOLD database (www.boldsystems.org) for species identification of your sample
f) If you have a shorter sequence, search for a possible match in the GenBank database (www.ncbi.nlm.nih.gov) using the BLASTn programme
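To illustrate what combining the forward and reverse reads involves, the toy Python sketch below reverse-complements the reverse read and joins it to the forward read at their longest exact overlap. This is a deliberately simplified illustration and not a replacement for the programmes named above: it ignores base quality scores, tolerates no mismatches, and the function names and the 10-base minimum overlap are assumptions of the example.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def merge_reads(forward, reverse, min_overlap=10):
    """Join a forward read and a reverse read at their longest exact overlap.

    The reverse read is given as sequenced (i.e. on the opposite strand).
    Returns the merged sequence, or None if no exact overlap of at least
    `min_overlap` bases is found.
    """
    fwd = forward.upper()
    rev_rc = reverse_complement(reverse)
    longest = min(len(fwd), len(rev_rc))
    for size in range(longest, min_overlap - 1, -1):
        # Does the end of the forward read match the start of the
        # reverse-complemented reverse read?
        if fwd[-size:] == rev_rc[:size]:
            return fwd + rev_rc[size:]
    return None


# Made-up 39-base fragment, split into two reads overlapping by 10 bases.
fragment = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
forward_read = fragment[:25]
reverse_read = reverse_complement(fragment[15:])
consensus = merge_reads(forward_read, reverse_read)
```

Real Sanger reads contain low-quality ends and occasional miscalls, so the trace files should still be inspected, and dedicated software used, before a consensus is submitted for identification.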