Plants have emerged as a viable alternative to microbial fermentation and mammalian cell culture for the industrial production of biopharmaceuticals. They offer scalability and low production costs and are free of human pathogens and toxins (Fiedler et al., 1997; Evangelista et al., 1998; Hood et al., 2002; Ma et al., 2003; Stoger et al., 2005; Verma et al., 2008; Orzáez et al., 2009; Woodard et al., 2009; Dreesen et al., 2010). In addition, the plant enzymatic machinery can carry out the biosynthesis, folding and assembly of complex proteins and promote proper post-translational modifications, resulting in high-quality recombinant products (Ma et al., 2003).
The genetic transformation of a broad range of plant species and the efficient production of more than 100 different recombinant proteins in plant systems have confirmed the potential of transgenic plants as efficient bioreactors for several relevant biomolecules, such as antigens, vaccines, antibodies and growth factors (Perrin et al., 2000; Ma et al., 2003; Stoger et al., 2005).
As a potential low-cost platform for recombinant protein biosynthesis, soybeans [Glycine max (L.) Merrill] might constitute one of the least expensive systems for the large-scale production of biopharmaceuticals. Their high biomass capacity, short life cycle, low allogamy frequency and photoperiod sensitivity are intrinsic characteristics that favour this use (Abud et al., 2003; 2007; Cunha et al., 2010b). Under greenhouse conditions, a photoperiod of 19 h of light can increase soybean seed production 10-fold compared with seed production under field conditions, as a consequence of extended vegetative growth and delayed flowering (Cunha et al., 2010b).
Soybean seeds are a rich source of protein, which can reach up to 40% of the dry weight of these organs (Cantoral et al., 1995; Moravec et al., 2007; Boothe et al., 2010), and, unlike leaves, they can be easily transported and stored for extended periods without significant protein degradation and do not require special storage conditions (Leite et al., 2000; Stoger et al., 2005; Moravec et al., 2007; Lau & Sun, 2009; Cunha et al., 2010a).
Because seeds generally express a wide range of genes related to abundant endogenous storage proteins, it is advisable to use a seed-specific promoter with high specificity and strong transcriptional activity to maximise transgene expression and make the production of recombinant proteins economically scalable (Stoger et al., 2000, 2005). Monocot promoters, such as the commonly used rice glutelin GluA-2 (Gt-1), rice globulin, barley D hordein, and maize zein promoters, are usually employed to drive the expression of genes related to storage proteins that are localised in the endosperm, the main storage tissue of monocot seeds. For this reason, expression cassettes using monocot promoters are specifically optimised for gene expression in monocot seeds, with expression levels reaching up to 15% of the total seed protein in transgenic seeds (Takagi et al., 2005; Yang et al., 2006, 2007).
Among the dicot promoters, legume promoters, such as the seed-specific promoters from the Phaseolus vulgaris phaseolin and arcelin-5 genes, soybean lectin, glycinin, the β-conglycinin α' subunit, pea legumin (legA) and the bean unknown seed protein (USP), have emerged as suitable choices for recombinant protein production in dicot seeds (Boothe et al., 2010). In a study that examined the expression of a functional, active single-chain antibody in Arabidopsis seeds, both Phaseolus promoters gave impressive results: the seeds accumulated up to 36 times more immunoglobulin than transgenic Arabidopsis seeds transformed with the CaMV 35S promoter (De Jaeger et al., 2002).
Following mRNA translation, the accumulation of newly synthesised proteins in seeds is particularly sensitive to subcellular targeting. For certain proteins, the choice of target compartment can have a major effect on the final yield of the recombinant product, because appropriate targeting avoids unintended proteolysis and allows accumulation levels suitable for economic production (Ma et al., 2003; Stoger et al., 2005; Streatfield, 2007; Semenyuk et al., 2010; Boothe et al., 2010). For example, targeting a protein to components of the secretory pathway can lead to a considerable increase in the final protein yield, which can be as much as 10-fold higher than the accumulation in the cytosol alone (Conrad & Fiedler, 1998; Avesani et al., 2003).
The most widely used targeting approaches are either to retain newly synthesised proteins in the endoplasmic reticulum (ER) by adding N- or C-terminal signal peptides to the nascent proteins, or to direct them to other subcellular compartments, such as the mitochondria, the chloroplasts, and the many components of the secretory pathway. Among the latter, the protein storage vacuoles (PSVs) are of particular interest because they are the main sites of protein accumulation in seeds (Stoger et al., 2000; Jiang & Sun, 2002; Stoger et al., 2005; Vitale & Pedrazzini, 2005; Boothe et al., 2010).
Plant vacuoles are the intracellular endpoints of the plant secretory pathway, the final destination of proteins passing through the ER and travelling through the Golgi complex (Jolliffe et al., 2005). They can be divided into two main categories: (i) lytic vacuoles (LVs), which are acidic compartments that are rich in hydrolases and can be regarded as the equivalent of mammalian lysosomes, and (ii) PSVs, ER-derived cisternae that are found in storage organs, mainly seeds, and are the main specialised storage sites for the large amounts of protein that will later be used as a source of aminated compounds during seed germination (Yoo & Chrispeels, 1980; Vitale & Hinz, 2005; Jolliffe et al., 2005). Cotyledonary PSVs constitute an excellent subcellular target for the long-term storage of recombinant proteins, as their lumen is characterised by a non-acidic environment with low concentrations of aminopeptidases, thus minimising protein degradation (Zheng et al., 1992; Muntz, 1998; Jolliffe et al., 2005; Takaiwa et al., 2007; Cunha et al., 2010c).
PSVs also contain large amounts of the major seed storage globulins (7S and 11S), as well as toxic proteins, such as lectins, protease inhibitors and ribosome-inactivating proteins, which probably evolved to protect the seeds from predators (Herman & Larkins, 1999; Vitale & Hinz, 2005).
The mechanism by which recombinant proteins are transported to the PSVs has been less well studied than that of transport to the LVs (Nishizawa et al., 2003). However, the addition of an N-terminal signal peptide without any additional signal directs the protein of interest to the ER, from which it is delivered along the secretory pathway to the PSVs and the apoplast (Vitale & Pedrazzini, 2005).
Another model for protein targeting to plant vacuoles is based on recent advances in understanding the role of a restricted group of sequences carried by cargo proteins, named vacuolar sorting signals (VSSs), that can mediate protein targeting to the PSVs (Jolliffe et al., 2005). The VSSs of barley lectin, common bean phaseolin, and the soybean β-conglycinin α' subunit have already been identified as potential mediators of recombinant protein targeting to the PSVs (Bednarek & Raikhel, 1991; Frigerio et al., 1998; Nishizawa et al., 2003; Robinson et al., 2005). Although VSSs are sophisticated targeting tools, a few studies have reported that the high flow of seed storage proteins to the PSVs can interfere with correct VSS-mediated targeting, resulting in the unintended accumulation of proteins in non-targeted organelles (Lau & Sun, 2009).
The β-conglycinins (7S) and the glycinins (11S) are the major storage proteins of the soybean and together can account for up to 70% of the total seed protein (Chen et al., 1986; 1988; Nishizawa et al., 2003). β-Conglycinin is a multimeric protein that consists of three subunits: α' (alpha prime), α and β (Derbyshire et al., 1976; Chen et al., 1986; Doyle et al., 1986). The α and α' subunits are synthesised as pre-proproteins and the β subunit as a pre-protein (Nishizawa et al., 2003). The expression of these subunit genes is spatially and temporally regulated, coinciding with the development of the seed and resulting in high accumulation levels of the related proteins during seed maturation.
In this work, we describe a highly efficient, reproducible system for recombinant protein expression in seeds that targets the PSVs. Using the regulatory sequences of the α' subunit of soybean β-conglycinin, several molecules with different molecular weights and structures were stably accumulated in the PSVs of transgenic soybean seeds: the human growth hormone (hGH) (Cunha et al., 2010b), the human coagulation factor IX (hFIX) (Cunha et al., 2010c), two single-chain variable fragments (scFVDIR83D4 and anti-CD18), and a potent microbicide against HIV, cyanovirin-N (CVN). We successfully utilised two soybean seed-specific expression vectors, containing the following:
i) the promoter and signal peptide from the α' subunit of β-conglycinin (with step-by-step directions for how to construct it) (Fig. 1A);
ii) the promoter from the soybean α' subunit of β-conglycinin, along with the signal peptide of the monocot protein α-coixin from Coix lacryma-jobi L. (responsible for directing the polypeptides to the PSVs of Coix lacryma-jobi L. seeds) (Ottoboni et al., 1993) (Fig. 1B).