The metabolite content in plasma samples of COVID-19 patient and control cohorts is shown in the Supplementary Data. The comparison of mean metabolite content in plasma samples of COVID-19 patients and the controls showed the statistically significant differences (Benjamini-Yekutieli test, P-value < 0.05) for 103 out of 140 metabolites studied (Supplementary Tab. S1).
Table 3. Analysis of KEGG metabolic pathway overrepresentation
Metabolite Set
|
Total
|
Hits
|
Expect
|
p-value
|
q-value, Holm P
|
q-value, FDR
|
Aminoacyl-tRNA biosynthesis
|
48
|
14
|
2.93
|
3.33E-7
|
2.8E-5
|
2.8E-5
|
Glycine, serine and threonine metabolism
|
33
|
10
|
2.02
|
1.29E-5
|
0.00107
|
5.41E-4
|
Arginine biosynthesis
|
14
|
6
|
0.855
|
8.95E-5
|
0.00734
|
0.00251
|
Valine, leucine and isoleucine biosynthesis
|
8
|
4
|
0.489
|
7.58E-4
|
0.0614
|
0.0159
|
Arginine and proline metabolism
|
38
|
8
|
2.32
|
0.00153
|
0.123
|
0.0258
|
Histidine metabolism
|
16
|
5
|
0.978
|
0.00195
|
0.154
|
0.0273
|
Pantothenate and CoA biosynthesis
|
19
|
5
|
1.16
|
0.00449
|
0.35
|
0.0538
|
Alanine, aspartate and glutamate metabolism
|
28
|
6
|
1.71
|
0.00558
|
0.429
|
0.0586
|
Phenylalanine, tyrosine and tryptophan biosynthesis
|
4
|
2
|
0.244
|
0.0204
|
1.0
|
0.191
|
Purine metabolism
|
65
|
8
|
3.97
|
0.0403
|
1.0
|
0.29
|
Pentose phosphate pathway
|
22
|
4
|
1.34
|
0.0409
|
1.0
|
0.29
|
Glyoxylate and dicarboxylate metabolism
|
32
|
5
|
1.96
|
0.0414
|
1.0
|
0.29
|
KEGG metabolic pathway overrepresentation analysis (Tab. 3) performed with Metaboanalyst websystem [24] enabled us to identify the pathways enriched with metabolites from the list of 103 significant ones (Supplementary Tab. S1). The numbers of significantly overrepresented pathways (q-value < 0.05) are 3 and 6, respectively, when Holm or FDR corrections for multiple comparisons are accounted for. All KEGG processes from this list were linked to the amino acid metabolism or the aminoacyl-tRNA biosynthesis.
We estimated the statistical significance of associations between the virus-host signaling pathways and KEGG metabolic processes (Tab. 4). Three overrepresented metabolic pathways, for which HOLM P-value is < 0.05 (Tab. 3), are analyzed here.
Table 4. Statistical significance of associations between virus-host signaling pathways and KEGG metabolic processes
Virus-host pathway template (Pi)
|
Aminoacyl-tRNA biosynthesis
(K1=40 proteins)
|
Glycine, serine and threonine metabolism (K2=34 proteins)
|
Arginine biosynthesis
(K3=18 proteins)
|
ki1
|
P-value
|
ki2
|
P-value
|
ki3
|
P-value
|
P1
|
3
|
0.092
|
0
|
-
|
0
|
-
|
P2
|
19
|
0.00011
|
11
|
0.26
|
10
|
0.0075
|
P3
|
3
|
0.23
|
0
|
-
|
2
|
0.99
|
P4
|
0
|
-
|
1
|
0.69
|
3
|
0.009
|
P5
|
5
|
0.981
|
10
|
0.049
|
10
|
0.004
|
P6
|
4
|
0.712
|
3
|
0.78
|
5
|
0.024
|
P7
|
19
|
0.0089
|
10
|
0.54
|
11
|
0.021
|
Note: Kj is the number of proteins in KEGG metabolic pathway j, kij is the number of proteins, which represent the targets of virus-host pathway template Pi (Tab. 2).
As one can see from (Tab. 4), five out of seven templates are significant for arginine biosynthesis. The regulation of a particular KEGG pathway by viral proteins can occur via different signaling pathways and can involve different types of interactions. Conversely, only template P5 is significant for glycine, serine and threonine metabolism pathway. The signaling pathways described by templates P2 and P7 can significantly contribute to the regulation of aminoacyl-tRNA biosynthesis. A typical feature of these pathways is that the last link in the chain of interactions is represented by protein-protein interaction involving KEGG enzyme. Such pathways affect the activity and stability of proteins via protein-protein interactions, but not the expression of KEGG pathway enzymes.
A detailed analysis of signaling pathways associated with each of KEGG metabolic processes is provided below.
Aminoacyl-tRNA biosynthesis
The first column in the table of KEGG metabolic pathway overrepresentation analysis guided by the identified set of metabolites, which significantly differ between COVID-19 and control groups, is occupied by aminoacyl-tRNA biosynthesis (Tab. 3). Overall, 48 metabolites are involved in the process, 14 of which are on the list of significantly differing metabolites (Supplementary Tab. S2).
Of them, ten metabolites are significantly increased in plasma samples of COVID-19 patients (L-Aspartic acid, L-Serine, L-Glutamic acid, etc.), while L-Asparagine, L-Tyrosine, L-Methionine и L-Leucine/L-Isoleucine contents are reduced there.
The reconstruction of signaling pathways potentially involved in the regulation of aminoacyl-tRNA biosynthesis by viral proteins was performed for the proteins localized to the mitochondria or the cytoplasm, separately (Supplementary Tab. S3). Of seven types of pathways, those described by P2 and P7 templates (Tab. 4) are significant. As previously described, these templates share a common feature. In them, the effect on the last target in the pathway is exerted by protein-protein interactions. P2 pathways are shorter, since they include only a single intermediate. These pathways involve protein-protein interactions only.
Mitochondrial network comprising all signaling pathways identified with the use of P2 template includes 28 proteins and 31 interactions (Fig. 1A). The network contains 7 viral proteins and 9 enzymes of aminoacyl-tRNA biosynthesis. The cytoplasmic P2 pathways include 32 proteins, nine of which are viral proteins, while ten are the aminoacyl-tRNA biosynthesis enzymes (Fig. 1B). Merge of cytoplasmic and mitochondrial P2 networks results in 53 protein members (11 viral proteins and 19 aminoacyl-tRNA biosynthesis enzymes) and increases the number of interactions to 57.
The merged P7 network of mitochondrial and cytoplasmic signaling pathways is shown in Fig. 2. P7 pathways include 3 intermediates (Tab. 2). The first intermediate is a protein interacting with viral protein. The expression of a gene, which represents the second intermediate, is regulated by intermediate 1. The third intermediate is the protein product of intermediate gene 2. At the end of pathway, intermediate 3 is involved in protein-protein interactions with a target protein of the considered KEGG metabolic process. As seen in Fig. 2, the P7 pathway network is greater than the P2 network. It includes 111 proteins (15 viral proteins and 19 aminoacyl-tRNA biosynthesis enzymes), as well as 23 genes. The network also involves 110 interactions, with 60 of them being protein-protein interactions, 27 – the regulation of expression, and 23 – the expression itself (production of protein gene products).
Noteworthy, the cytoplasmic and mitochondrial pathways of aminoacyl-tRNA biosynthesis regulation by viral proteins are different. For example, Fig. 2A shows that SYLC (cytoplasmic leucine-tRNA ligase) can be indirectly affected by 2 viral proteins, orf8 and nsp8. According to the recent publication [13], orf8 interacts with Endoplasmic reticulum resident protein 44 (EPR44), which, in turn, can interact with SYLC (BioGrid Id 922540). Functional effects of these interactions require further investigation. ERp44 supervises the correct assembly of multimeric proteins linked by disulfide bonds in the endoplasmic reticulum and their secretion [25].
Exosome complex component RRP4 (EXOS2) can serve as an intermediate between Nsp8 and SYLC, as the information about EXOS2 and SYLC interaction is contained in BioGrid database (Id 2457178). In the cytoplasm, the RNA exosome complex is known to be involved in general mRNA turnover due to its specific degradation of inherently unstable mRNAs containing AU-rich elements in 3' untranslated regions and in RNA surveillance pathways, due to prevention of aberrant mRNA translation [26].
According to Fig. 1B, the mitochondrial Leucine-tRNA ligase (SYLM) can be affected by viral proteins E and M. M protein can act onto SYLM via two pathways. One is mediated by MPPB (Mitochondrial-processing peptidase subunit beta), the other one by FAKD5 (FAST kinase domain-containing protein 5). MPPB cleaves the mitochondrial sequence off the newly imported precursor proteins [27]. BRD2 (Bromodomain-containing protein 2) turnes out to be an intermediate between E and SYLM. The data on MPPB, FAKD5 and BRD2 interactions with SYLM is contained in BioGrid (Id 2857559, Id 28550757 and Id 2538529, respectively).
Of note, all the discussed signaling pathways are only hypothetic and based on the integration of knowledge obtained from different experiments, thus, they require further corroboration.
Glycine, serine and threonine metabolism
Glycine, serine and threonine metabolism is the second top process of overrepresented KEGG processes (Tab. 3). Of 33 metabolites participating in this process, ten are significantly different between plasma samples of COVID-19 patients and the controls (Supplementary Tab. S4). Nine of the latter are increased in content, while glyceric acid is decreased (logFC=-0.67).
Reconstruction of signaling pathways describing the potential regulation of glycine, serine and threonine metabolism is performed with the use of P5 template, which proves significant for this metabolic process (Tab. 4). P5 template describes the pathway of potential metabolic gene expression regulation by viral protein and two human intermediate genes/proteins. Of 40 genes of KEGG process of glycine, serine and threonine metabolism, ten are the potential targets of viral proteins (Supplementary Tab. S5). In Fig. 4, nine viral proteins participate in gene expression regulation of this metabolic process. Particularly, orf8 protein can influence the expression regulation of ALDH7A1 coding for AL7A1 protein. This enzyme (EC 1.2.1.8) participates in betaine biosynthesis (Fig. 3). The potential signaling pathway initiated by orf8 is represented by the following chain of interactions in gene network. Orf8 interacts with SMOC1 [13], SMOC1 can activate BMP2 gene expression [28], while BMP2 inhibits ALDH7A1expression [29]. As another example of signaling pathway from gene network shown in Fig. 4, we can consider the expression regulation of PHGDH gene coding for SERA (D-3-phosphoglycerate dehydrogenase) by viral protein nsp5. This enzyme (EC 1.1.1.95) catalyzes the reversible oxidation of 3-phospho-D-glycerate to 3-phosphonooxypyruvate, the first step of the phosphorylated L-serine biosynthesis pathway. According to the recent publication [13], nsp5 interacts with histone deacetylase 2 (HDAC2). HDAC2 can, in turn, inhibit the expression of tumor suppressor p53 [30-32]. The last link in nsp5-SERA signaling pathway is represented by p53 and PHGDH interaction. The former is known to suppress PHGDH (SERA) expression and inhibit serine biosynthesis [33].
Arginine biosynthesis
Arginine biosynthesis is the third top process among overrepresented KEGG processes (Tab. 3).
Of 23 metabolites participating in this process, six are significantly different between the plasma samples of COVID-19 patients and the controls (Supplementary Tab. S6). Five of them are increased in content, while ornithine is decreased (logFC=-1.98).
The potential regulation of arginine biosynthesis by viral proteins differs significantly from the two KEGG metabolic processes considered above (Supplementary Tab. S7). Of 22 genes involved in arginine biosynthesis, fourteen represent the potential targets of viral proteins (Fig. 5). Five types of signaling pathways potentially involved in the regulation prove statistically significant including pathways of P2, P4, P5, P6 and P7 types (Tab. 4). The viral proteins can potentially regulate the expression of enzymes of this metabolic process or the expression of human proteins, which interact with these enzymes, or they can control the activity or stability of these enzymes via the mentioned types of pathways.
Gene network of expression regulation is shown in Fig. 6. It includes 74 genes and 132 proteins, nine of which are arginine biosynthesis enzymes, while 15 represent the viral proteins. The network also includes 194 links between proteins and genes, which describe expression regulation and 49 links defining the protein-protein interactions between viral and human counterparts.
Fig. 7 shows an example of signaling pathways included in gene network in Fig. 6, which can be involved in arginase 2 expression regulation. ARG2 (EC:3.5.3.1) participates in L-ornithine and urea synthesis from L-arginine and may play a role in the regulation of extra-urea cycle of arginine metabolism (Fig. 5). As seen in Fig. 7, viral proteins E, nsp5, orf8, and orf3a can regulate ARG2 expression. In particular, nsp5 can form complexes with histone deacetylase 2 (HDAC2) protein (Gordon et al, 2020). In turn, HDAC2 can suppress arginase 2 expression [34]. nsp5 binding to HDAC2 can potentially affect HDAC2 function. However, further experimental studies and computer simulations are needed to clarify the functional role of these protein-protein interactions. In particular, interesting results can be obtained by the computer-assisted structural modeling of protein complexes.
An analysis of pathways related to the regulation of activity/stability of arginine biosynthesis enzymes reveals 6 viral proteins (E, nsp5, nsp14, orf3a, orf8 и orf9c) potentially involved in these pathways (Fig. 8). Six enzymes are associated with this type of regulation. NOS3 protein takes the central place in the network in this figure. Its activity can be potentially regulated by 4 viral proteins. Additionally, some of the latter can be involved in multiple ways, for example, orf8 and E participate in 3 and 2 pathways, respectively. Thus, one of the pathways involving orf8 is realized via its interaction with disintegrin and metalloproteinase domain-containing protein 9 (ADAM9), while another includes the interaction with protein-lysine 6-oxidase (LYOX). ADAM9 is known to increase vascular endothelial growth factor A (VEGFA) expression in lung cancer metastasis [35]. In the gene network, this interaction between ADAM9 and VEGFA is presented as a potential one. According to one publication [36], LYOX positively regulates VEGFA expression. In turn, VEGFA can upregulate NOS3 function by phosphorylation of a specific serine residue [37].
Gene network presented in Fig. 9 describes the potential effects of viral proteins on protein-protein interactions of arginine biosynthesis enzymes. This network isas reconstructed with the use of P2 and P7 templates. It includes 78 objects (15 genes and 63 proteins) and 94 interactions. Twelve viral proteins and twelve arginine biosynthesis enzymes participate in the network. P2 pathways describe the potential interactions of viral proteins with arginine biosynthesis enzymes, which are mediated by a single intermediate protein. For example, citron Rho-interacting kinase (CTRO) can mediate the interaction between nsp13 and aminoacylase-1 (ACY1), which is a part of arginine biosynthesis pathway. Protein-protein interactionы between nsp13 and CTRO were described previously [13], while CTRO interaction with ACY1 is included in BioGrid database (Id 2538152). One can expect that the first interaction (nsp13/CTRO) can have an adverse effect on the second one (CTRO/ACY1). However, the effect of such interaction on ACY1 function in arginine biosynthesis should be further studied. Interestingly, 4 proteins of arginine biosynthesis (NOS1, NOS3, DHE4 и ACY1) represent the potential nsp13 targets, which can be affected by this viral protein via P2 type pathways. The pathway linking viral protein E to NOS3 provides an example of P7 type pathways. It includes the following chain of potential interactions: E protein interacts with Bromodomain-containing protein 4 (BRD4) with the formation of a protein complex [13], BRD4 regulates Endothelin receptor type B (EDNRB) gene expression [38], while, according to HPRD database [39], EDNRB interacts with NOS3 (HPRD Id 01224, HPRD Id 01224).