Cell culture and protein collection:
TAILS is based on the quantitative comparison of N-terminal peptides from protease-treated and control samples. The following protocol, developed for studies of cell-conditioned medium proteins (secretome), can be easily adapted to cell lysates or samples derived from other sources. The introduction of the protease of interest, or its inhibition or silencing, can be done at the cellular level prior to proteome collection or in vitro after the proteome has been harvested. The latter requires collection under conditions that maintain the native structure of the constituent proteins. The minimal recommended protein amount is 100 μg for each sample (i.e. 100 μg for control and 100 μg for protease-treated), which can generally be achieved by collecting serum-free conditioned medium from at least 6 cell culture flasks (175 cm2, T175) at around 70-80% confluence.
Grow cells in appropriate media up to 70% confluence.
Decant media and wash cells extensively (at least 3 times) with PBS to remove serum proteins.
Add serum-free media (i.e. the same medium used for growing the cells but without the addition of serum), usually 20 mL per T175 flask.
Grow cells overnight to synchronize the cells.
Decant media and wash cells at least 3 times with PBS.
Add fresh serum-free, phenol-free media. By using a lower amount of medium than for normal cell culture the secreted proteins will be more concentrated. The time of addition is set as the starting time.
Grow cells for the required time, usually 24 h, depending on the requirements of the experiment and the tolerance of the cells to serum-free conditions.
Note: After 24 h, serum starvation might occur. If cells are grown for shorter times, a larger number of flasks will be required to accrue sufficient quantities of protein in the medium.
Collect conditioned media in 50 mL tubes (i.e. 2 flasks per 50 mL tube).
Centrifuge conditioned media at 2,200g at 4 °C for 5 min to remove any cells.
Add protease inhibitors such as PMSF (1 mM final), EDTA (1 mM final), E64 according to the experimental question being addressed. For a complete list of protease inhibitors for all classes of proteases see reference 27. It is important to minimize any background proteolysis that inevitably occurs in all proteome samples after collection.
Note: Excess and reversible protease inhibitors will be removed in the following steps by dialysis, however when the protease of interest is added in vitro after secretome collection, inhibitors of that protease should be avoided.
- Filter supernatant using Millipore "Steriflip":http://www.millipore.com/catalogue/module/c3238 or equivalent.
Pause: At this point it is possible to freeze the samples in liquid nitrogen and store at -80 °C.
- Apply the protein samples to protein concentration devices such as Millipore "Amicon-Ultra 15":http://www.millipore.com/catalogue/module/c7715 concentrators. Concentrate conditioned medium proteins at 4 °C following the manufacturer's instructions to ==~== 1 mL volume.
Note: To minimize the time for this and the following steps it is recommended to use several concentrators for treating each sample (i.e. one concentrator per 40 mL of collected conditioned medium proteins).
- Add 14 mL of desired buffer to each concentrator. We recommend using 100 mM HEPES pH 7.0.
Note: TAILS is based on the labeling of peptide primary amines and thus, other molecules with primary amines will interfere with the labeling step resulting in incomplete labeling of peptides. Thus primary amine containing buffers such as ammonium bicarbonate or Tris must not be used. The purpose of the following buffer-exchange steps is to deplete the sample of free amino acids and other compounds with primary amines. If the protease of interest is added in vitro after secretome collection, the buffer of choice, pH and other additives should allow the optimal activity of the studied protease. It is also recommended to exclude any detergents as these can interfere with MS later.
Concentrate sample again to a volume of ==~== 1 mL or less.
Repeat steps 13 and 14 at least 3 times.
Optional: if several concentrators were used for each sample, pool all concentrates and concentrate them to final volume of ==~== 1 mL. Carefully recover as much protein as possible from each concentrator by gentle pipetting of the sample over the membrane before removal.
Measure protein concentration using your method of choice e.g. BCA or Bradford Assay.
Bring protein concentration to ==~== 1 mg/mL using buffer of choice (see section 13 above).
Keep a small aliquot of each sample (control and protease-treated) for quality control purposes designated "before labeling".
Cell culture: 1-7 days (depending on cells grown and required amounts).
Media collection: 1 hour.
Media concentration and buffer exchange: 3-6 hours (depending on sample volume and protein concentration).
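The adjustment to about 1 mg/mL described above is a simple dilution. The following is a minimal sketch in Python, assuming the measured concentration is at or above the target; the function name and the example values are illustrative and not part of the protocol.

```python
def buffer_to_add_ul(sample_ul, conc_mg_per_ml, target_mg_per_ml=1.0):
    """Return the buffer volume (uL) that dilutes a sample to the target concentration."""
    if conc_mg_per_ml < target_mg_per_ml:
        raise ValueError("Sample is below the target concentration; concentrate it further.")
    # Conservation of protein mass: C1 * V1 = C2 * V2
    final_volume_ul = sample_ul * conc_mg_per_ml / target_mg_per_ml
    return final_volume_ul - sample_ul


# Hypothetical example: 1,000 uL of concentrate measured at 1.6 mg/mL.
extra_ul = buffer_to_add_ul(1000, 1.6)
print(f"Add {extra_ul:.0f} uL of buffer")
```

The same function applies to the control and the protease-treated sample; each is adjusted separately using its own measured concentration.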
Optional: Test protease cleavage of collected proteome
The following steps are required only if the proteome is exposed to the protease of interest in vitro.
Divide the proteome in two equal aliquots. Optional: prior to this, add a known substrate of the test protease that will serve as a positive control and allow for validation of the proteolytic activity and sensitivity of the TAILS procedure. If possible select a known substrate of different species than the test proteome (e.g. murine if the proteome is human), in particular a substrate protein that has a different tryptic peptide spanning the cleavage site to avoid ambiguous identification if the source secretome also has the protein. Typically 0.5-1 μg of known substrate can be added to 200 μg proteome.
Add activated protease to the sample and an equivalent amount of buffer to the control sample. Typical protease to proteome ratios are 1:1000-1:50 (w/w) with 1:100 (w/w) being a useful ratio to be used for the first time. This will likely ensure that cleaved neo-N-terminal peptides can be identified. With experience the ratios of the protease to proteome can be reduced.
Incubate for 1-24 h at a temperature suitable for the protease under investigation.
Optional: heat inactivate the protease and control samples.
Time taken - up to 24 hours depending on the selected assay conditions.
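The protease amounts implied by the w/w ratios above can be tabulated before the experiment. This is a small sketch, assuming a hypothetical 100 μg proteome aliquot; the function name is illustrative.

```python
def protease_ug(proteome_ug, ratio_denominator):
    """Mass of protease (ug) for a 1:ratio_denominator (w/w) protease-to-proteome ratio."""
    if ratio_denominator <= 0:
        raise ValueError("ratio denominator must be positive")
    return proteome_ug / ratio_denominator


# Range quoted in the protocol: 1:1000 to 1:50, with 1:100 as a first choice.
aliquot_ug = 100  # hypothetical aliquot size
for denominator in (1000, 100, 50):
    amount = protease_ug(aliquot_ug, denominator)
    print(f"1:{denominator} (w/w): {amount:.2f} ug protease per {aliquot_ug} ug proteome")
```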
Isotopic labeling of samples
TAILS is based on isotopic labeling of the primary amines at protein N-termini and lysine side chains. Therefore any primary amine reactive isotopic reagent can be used. We chose to label the proteins by dimethylation using 12C1H2-formaldehyde (light) and 13C2H2-formaldehyde (heavy), with sodium cyanoborohydride (NaBH3CN, ALD reagent) as the catalyst28. This labeling approach is very fast, robust and efficient and utilizes relatively cheap reagents ( ==~== $1/labeling reaction). The labeling procedure must be carried out separately for the control and protease-treated sample.
- Denature protein samples by adding 8 M GuHCl to a final concentration of 4 M GuHCl.
Note: The labeling reaction can be carried out efficiently at lower denaturant concentrations but will require more time. Do not use urea, which will modify amino acid residues in the sample and so reduce peptide identifications.
Check pH by pipetting 1 μL of sample onto a pH strip. Hamilton microcapillary tubes can also be used for 100 nL volumes.
Adjust pH to 7.0 by addition of small volumes of 1 N HCl or 1 N NaOH.
Reduce cysteine residues by adding 1.0 M DTT to a final concentration of 5 mM.
Incubate sample at 65 °C for 1 hour.
Cool samples to room temperature.
Caution: Cooling down sample temperature prior to addition of iodoacetamide (IAA) is essential for preventing lysine modification by IAA29.
Alkylate cysteines by adding 0.5 M IAA to final concentration of 10 mM.
Incubate sample at 25 °C in the dark for 30 minutes.
Quench excess IAA by adding 1.0 M DTT to a final concentration of 30 mM DTT.
Incubate for 30 minutes to achieve full quenching of IAA.
Caution: Formaldehyde and sodium cyanoborohydride (ALD reagent) are extremely toxic and carcinogenic reagents. Therefore extra care should be taken while handling these. In addition, during the reductive dimethylation reaction lethal hydrogen cyanide gas is emitted. Therefore, all of the following steps should be performed in a fume hood.
- Prepare 2.0 M working stocks of 13C2H2-formaldehyde (heavy) and 12C1H2-formaldehyde (light) in water.
Note: The concentrations of the light and heavy formaldehyde stock solutions as supplied by the manufacturer are different: 37% or 12.3 M, and 20% or 6.6 M, respectively.
- Add light formaldehyde to one sample (control) and heavy formaldehyde to the protease sample to a final concentration of 40 mM light/heavy formaldehyde.
Note: If the experiment is repeated several times, labeling swaps are recommended for validation. By convention, heavy labels are used for the protease sample.
Add ALD reagent to each sample to a final concentration of 20 mM.
Vortex samples and adjust pH to 6-7 if required by adding 1.0 N HCl or 1.0 N NaOH (check pH as in step 2).
Incubate for at least 4 hours at 37 °C, but overnight is recommended.
Quench excess formaldehyde by adding 1.0 M ammonium bicarbonate to each sample up to a final concentration of 100 mM.
Vortex samples and check pH (as in step 2). If required adjust pH to 6-7 by adding small volumes of 1.0 N HCl or 1.0 N NaOH.
Incubate for at least 4 hours at 37 °C.
Keep a small aliquot (1-5%) of each sample for labeling validation (for troubleshooting purposes); label samples as "heavy" and "light".
Suggestion: A fast labeling test can be performed by analyzing the non-labeled (start), light-labeled and heavy-labeled samples by MALDI-TOF MS. Successful labeling is characterized by a complete shift by ?6 m/z of the observed peaks in the labeled samples compared to the nonlabeled samples.
Time taken - 9 to 24 hours.
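Most steps in this section add a concentrated stock to reach a stated final concentration. The sketch below shows one way to work out the volumes, assuming a hypothetical 200 μL denatured sample and that each reagent is absent before it is added; the function names are illustrative. It also covers dilution of the supplied formaldehyde solutions to 2.0 M working stocks.

```python
def stock_to_add_ul(sample_ul, stock_mM, final_mM):
    """Volume of stock (uL) that brings a reagent to final_mM in the mixed sample.

    Solves stock_mM * v = final_mM * (sample_ul + v), so the added volume is
    counted in the final volume. Assumes the reagent is not yet present.
    """
    if not 0 < final_mM < stock_mM:
        raise ValueError("final concentration must be positive and below the stock")
    return sample_ul * final_mM / (stock_mM - final_mM)


def working_stock(supplied_M, target_M, final_ul):
    """Return (stock_ul, water_ul) for diluting a supplied solution to target_M."""
    stock_ul = final_ul * target_M / supplied_M
    return stock_ul, final_ul - stock_ul


# Hypothetical 200 uL sample; stock and final concentrations are those in the protocol.
volume_ul = 200.0
for reagent, stock_mM, final_mM in [
    ("DTT (reduction)", 1000, 5),
    ("IAA (alkylation)", 500, 10),
    ("formaldehyde", 2000, 40),
]:
    add_ul = stock_to_add_ul(volume_ul, stock_mM, final_mM)
    print(f"{reagent}: add {add_ul:.2f} uL to {volume_ul:.1f} uL")
    volume_ul += add_ul  # later additions start from the larger volume

# 2.0 M working stocks from the supplied 12.3 M (light) and 6.6 M (heavy) solutions.
for label, supplied_M in [("light", 12.3), ("heavy", 6.6)]:
    stock_ul, water_ul = working_stock(supplied_M, 2.0, 100)
    print(f"{label}: {stock_ul:.1f} uL supplied solution + {water_ul:.1f} uL water")
```

Each later addition slightly dilutes the reagents already present; for the small volumes involved here the effect is minor, but it is worth keeping in mind for larger additions.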
Proteolytic digestion of labeled samples
The labeled sample must be digested with a protease to prepare it for proteomics analysis. As described below, and as in many other proteomics procedures, it is highly recommended to use trypsin for this purpose, although GluC or chymotrypsin can also be used. Following primary amine dimethylation, trypsin cannot cleave at the blocked lysines and so will cleave with ArgC specificity (cleaving C-terminal to arginine residues only). This generates longer peptides, which significantly improves the likelihood of identifying neo-N-terminal peptides that have otherwise been shortened by proteolysis26. If GluC or chymotrypsin is used this advantage is lost.
Labeling reagents clean-up
Combine quenched heavy and light labeled samples in a 15 mL tube.
Keep small aliquot for quality control and label it "labeled samples before precipitation".
Add 8 sample volumes of ice-cold acetone and 1 sample volume of methanol to the labeled proteins.
Caution: acetone and methanol should be stored in chemical resistant containers (i.e. glass bottles). The use of plastic-ware for storage can result in the extraction of contaminating plastic polymers into the solvent that will affect MS results.
Aliquot 1.2 mL sample into 1.5 mL microfuge tubes (unless a centrifuge capable of 15,000g for 15 mL tubes is available, in which case continue using the 15 mL tubes).
Precipitate labeled proteins for at least 4 hours at -80 °C, but overnight is recommended.
Centrifuge the samples at 14,000g at 4 °C for 10 min and carefully discard the supernatant.
Add 1 mL of ice-cold methanol to each tube (or 5 mL if 15 mL tube is used).
Note: Washing the acetone pellet with methanol prevents unwanted acetylation of the N-termini of tryptic peptides in case of any NaCNBH3 carry-over.
Centrifuge the samples at 14,000g at 4 °C for 10 min and carefully discard the supernatant.
Repeat steps 7 and 8.
Air-dry the sample.
Note: Do not overdry the sample as it will be difficult to dissolve.
Resuspend samples in 8 M GuHCl. For 1.5 mL tubes it is recommended to start with 20 μL and increase the volume if required. Use the minimal volume required to completely resuspend the sample.
Add 9 volumes of 50 mM HEPES pH 8.0 to each tube (i.e. 180 μL if 20 μL of 8 M GuHCl was used) and combine resuspensions from all tubes. Following this the final concentration of GuHCl should not exceed 0.75 M, which is suitable for trypsin digestion. If another protease is used, the final GuHCl should be adjusted accordingly.
Keep a small aliquot (1% of total or 1 μL if testing by MALDI-TOF) for quality control and label "labeled samples after precipitation".
Time taken - 6 to 24 hours.
Check pH and if required adjust to pH 8.0 by adding small volumes of 1.0 N HCl or 1.0 N NaOH.
Add mass spectrometry grade trypsin to a final ratio of 1:50 protease/protein (i.e. 4 μg trypsin per 200 μg sample) and gently pipette up and down to mix sample.
Incubate overnight (18 h) at 37 °C.
Optional: add additional trypsin as in step 2 and incubate for an additional 4 hours at 37 °C to ensure complete digestion.
Keep a small aliquot for quality control and label it "labeled samples after digestion".
Optional: It is highly recommended to plan ahead and prepare sufficient starting material to allow MS analysis of the sample prior to polymer negative selection. If the sample amounts permit such analysis, an aliquot should be stored for this purpose at this point.
Time taken - 18 to 24 hours.
Quality control for labeling and digestion
To verify the successful completion of the above steps, run a 10% SDS-PAGE gel followed by silver staining of the aliquots that were stored in the previous steps: labeled samples before precipitation, labeled samples after precipitation, and labeled samples after digestion. Ensure similar protein bands and intensities appear before and after precipitation and that there is disappearance of all bands higher than 10 kDa after proteolytic digestion. Mismatching protein bands before and after precipitation indicates sample losses that will reduce the quality of the MS analysis. Protein bands after tryptic digestion may indicate incomplete digestion (e.g. due to a bad protease batch) and require repetition of the digestion step.
Time taken - 4 hours.
Negative selection of blocked peptides using HPG-ALD polymer and MS/MS readout
This step enriches the naturally blocked as well as the dimethylated and labeled N-terminome peptides by negative selection. In the previous steps, the original free protein N-termini and the protease-generated neo-N-termini were dimethylated, so together with naturally blocked N-termini (e.g. by acetylation and cyclization) of proteins, they all possess blocked N-termini. Trypsin digestion generated internal peptides with free N-termini. The HPG-ALD polymers developed for TAILS contain many aldehyde functional groups that readily react and bind the free N-terminal internal tryptic and C-terminal peptides when mixed with the digested sample in the presence of sodium cyanoborohydride. In contrast, the naturally blocked and isotopically-labeled mature N-terminal and neo-N-terminal peptides (and dimethylated lysines) are unreactive and will remain unbound for recovery by ultrafiltration.
"HPG-ALD":http://www.flintbox.ca/technology.asp?page=3081 polymer is usually supplied at a concentration of ==~== 35 mg/mL. Although the polymer was dialyzed extensively, it is recommended to dialyze it again prior to use.
Dialyze 0.5 mL of HPG-ALDII polymer against 4 L of water overnight at room temperature with agitation.
Split HPG-ALDII stock into 20 µL aliquots in microfuge tubes.
Flow argon gas on top of the liquid for 1 minute per tube.
Caution: do not use strong gas flow as it will cause the solution to splash out of the tube.
- Close the microfuge tubes and freeze the polymer solution in liquid nitrogen. Store the polymer at -80 °C. These aliquots are ready to be used for experiments.
Note: If the polymer solution is frozen other than by liquid nitrogen, a gel-like, opaque solution will be formed upon thawing that will require about one hour to form a clear, usable solution.
Polymer negative selection:
HPG-ALDII polymer has a binding capacity of 2.5 mg of peptide per mg of polymer. However, there are different versions of the HPG-ALD polymer with different binding capacity26. Therefore, if a different version of HPG-ALD is used, the amount of polymer for the capture should be modified accordingly. Removal of the internal tryptic peptides results in a 90% decrease of the total peptide content, thus a maximum of ==~== 20 µg of peptides can be recovered in the N-terminal enriched sample. The peptide content is in fact lower (around 10 µg) due to sample loss through the different steps.
Add dialyzed HPG-ALDII to the trypsinized sample. We recommend capturing 100 µg of peptides with 200 µg of HPG-ALDII, representing a 5-fold excess of polymer. Therefore, if the polymer solution concentration is 35 mg/mL, 10 μL of polymer stock should be added per 100 μg of tryptic digest.
Add ALD reagent to final concentration of 20 mM.
Check pH and if required adjust to pH 6-7 range by adding small volumes of 1.0 N HCl or 1.0 N NaOH.
Incubate overnight at 37 °C.
Add 1.0 M ammonium bicarbonate to 100 mM final concentration.
Note: This step is used for blocking the excess functional aldehyde groups of the polymer which improves yield and reduces non-specific binding of peptides to the polymer.
Check pH and if required adjust to pH 6-7 by adding small volumes of 1.0 N HCl or 1.0 N NaOH.
Incubate at 37 °C for 30 minutes.
Time taken - 6 to 12 hours.
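The polymer amount can be estimated from the binding capacity and the desired excess stated above. This is a rough sketch, assuming the HPG-ALDII capacity of 2.5 mg peptide per mg polymer, a 5-fold excess and a 35 mg/mL stock; the function and parameter names are illustrative, and the values must be changed for other polymer versions.

```python
def polymer_needed(peptide_ug, capacity_ug_per_ug=2.5, fold_excess=5.0, stock_mg_per_ml=35.0):
    """Return (polymer_ug, stock_ul) for capturing peptide_ug of tryptic digest."""
    # Polymer mass that would just bind all peptides, scaled by the desired excess.
    polymer_ug = peptide_ug / capacity_ug_per_ug * fold_excess
    # mg/mL is numerically equal to ug/uL.
    stock_ul = polymer_ug / stock_mg_per_ml
    return polymer_ug, stock_ul


polymer_ug, stock_ul = polymer_needed(100)
print(f"{polymer_ug:.0f} ug polymer, {stock_ul:.1f} uL of stock per 100 ug digest")
```

The result should be read as a lower estimate: the working volume of 10 μL of stock per 100 μg of digest recommended in the protocol is more generous than the value calculated here.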
Recovery of unbound blocked and labeled peptides:
Pretreat a 10-kDa molecular cutoff Microcon spin-filter (Millipore) with 400 µL of water as per the manufacturer's instructions. Do not allow the membrane to dry out before sample addition.
Load the tryptic digest/polymer reaction mixture.
Filter by centrifugation at 14,000 rpm for 15 min.
Monitor the sample volume above the filter and centrifuge until only a few µL remain on the filter.
Collect the filtrate, which contains the enriched N-terminal peptides. The internal tryptic peptides that are covalently bound to the polymer are retained by the filter.
Wash the filter by adding 200 µL of 100 mM ammonium bicarbonate buffer and centrifuge again.
Collect the filtrate and combine it with the filtrate of Step 5.
Time taken - up to 3 hours.
Desalting of blocked and labeled peptide solution:
This step is performed utilizing a C18 reverse-phase solid phase extraction cartridge. We used Waters Sep-Pak light, which accommodates a relatively small volume with a high binding capacity thus providing convenient sample concentration at this step.
Acidify the pooled filtrates (obtained at step 7 above) to pH 3 by adding formic acid and dilute to 3 mL with 0.1% formic acid in water.
Condition a Sep-Pak light C18 cartridge by injecting 5 mL of 80% acetonitrile, 20% water, 0.5% formic acid with a syringe.
Discard the flow-through.
Caution: Do not dry the cartridge by introducing air at the end of the injection. Always keep the cartridge wet.
Rinse the Sep-Pak light C18 cartridge with 5 mL of water with 0.1% formic acid and discard the flow-through.
Apply the sample to the cartridge at a maximum of 1 mL/min and collect the flow-through. Note: Measure the flow with a timer using the syringe volume marks.
Reapply the sample to the cartridge to improve peptide binding and recovery.
Wash the Sep-Pak light C18 cartridge twice with 5 mL 0.1% formic acid in water and discard the flow-through.
Elute peptides with 1.5 mL of 80% acetonitrile, 20% water, 0.5% formic acid at a maximum of 1 mL/min. Collect the eluate into a microfuge tube.
Evaporate the eluate organic solvent under vacuum (using a speedvac).
Caution: Do not dry completely.
- Resuspend the peptides in 20 µL of 3% acetonitrile, 97% water, 0.1% formic acid. Store the samples at -80 °C until mass spectrometry analysis.
Time taken - 3 hours (strongly depends on speedvac speed).
Identification of N-terminal Peptides by Liquid Chromatography-Tandem Mass Spectrometry
TAILS-enriched N-terminal peptides have been analyzed on a quadrupole time-of-flight QSTAR (ABI) and an LTQ-Orbitrap (ThermoFisher) mass spectrometer, but can be analyzed on any tandem mass spectrometer. An LTQ-Orbitrap mass spectrometer is preferred because of its fast duty cycle time and high mass accuracy. Using the Orbitrap, TAILS data coverage has proven excellent without sample prefractionation due to the massive sample simplification achieved following removal of the internal tryptic peptides. However, higher coverage and potentially better quantification accuracy can be obtained using a 2D peptide separation system following TAILS, such as strong cation exchange chromatography to generate 10 fractions, with each then being loaded separately on the mass spectrometer. A description of the LC-MS/MS setup is not within the scope of this protocol, so here we outline only briefly the conditions used for the LTQ-Orbitrap TAILS analysis26. These steps can be easily adapted to other mass spectrometers.
- Load peptides onto a C18 reverse-phase (3 µm ReproSil Pur C18 beads) capillary column (15 cm, 75 µm inner diameter fused silica emitter with an 8 µm diameter opening) with a nanoflow HPLC in-line with the mass spectrometer as described.
Optional: If required, prior to loading the column, desalt peptide sample using STop And Go Extraction (STAGE) tips30.
Elute the peptides from the reverse-phase column with a gradient composed of Buffer A (0.5% acetic acid) and Buffer B (0.5% acetic acid and 80% acetonitrile) and inject it directly into the mass spectrometer by ion-spray ionization. The gradient is formed with 6 to 30% Buffer B in 60 min, then from 30 to 80% Buffer B in 10 min and held at 80% of Buffer B for 5 min.
Acquire MS1 scans between 350 and 1,500 m/z at a resolution of 60,000 and select the five most intense ions for fragmentation. Repeat this cycle for the period of the gradient.
Time taken - 2-3 hours.
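When adapting the gradient to another LC system it can help to write it down as breakpoints. The sketch below encodes the gradient described above, assuming it starts at 6% Buffer B at time zero and that each segment is linear; the names are illustrative.

```python
# (time in min, % Buffer B) breakpoints taken from the gradient described above.
GRADIENT = [(0, 6), (60, 30), (70, 80), (75, 80)]


def percent_b(t_min, gradient=GRADIENT):
    """Linearly interpolate % Buffer B at t_min; clamp outside the programmed range."""
    if t_min <= gradient[0][0]:
        return float(gradient[0][1])
    for (t0, b0), (t1, b1) in zip(gradient, gradient[1:]):
        if t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    return float(gradient[-1][1])


def percent_acetonitrile(t_min):
    """Buffer B contains 80% acetonitrile and Buffer A none, so ACN = 0.8 * %B."""
    return 0.8 * percent_b(t_min)


for t in (0, 30, 60, 65, 75):
    print(f"{t:>3} min: {percent_b(t):.1f}% B, {percent_acetonitrile(t):.1f}% acetonitrile")
```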
Data Analysis of the TAILS Tandem Mass Spectrometry Spectra
Unlike most proteomics procedures, the successful outcome of TAILS negative selection is the generation and isolation of peptide "single hits" where a protein identification is based on a single peptide (i.e. only the original or neo-N-terminal peptide of each protein)32. To address this issue we set robust statistical and bioinformatics criteria. We collect 3 different biological samples for analysis that are treated and analyzed independently. High quality and accurate MS data are acquired using a Fourier-transform mass spectrometer. For the identification of protease generated neo-N-terminal peptides, database searches are performed utilizing 2 search engines; we use Mascot and X! Tandem. The search parameters include N-terminal and lysine dimethylation modifications. Data analysis depends on LC-MS/MS vendor-specific data formats and on the choice of, and access to, specific analysis software. We chose to analyze the data using the open source Trans-Proteomic Pipeline (TPP) software from the Institute for Systems Biology in Seattle31, which allows input from different mass spectrometers and can incorporate MS/MS search results from different search engines including free and open-source engines such as X! Tandem and OMSSA as well as the commonly used commercial programs Sequest and Mascot. For peptide identification we use an orthogonal validation strategy whereby the N-terminus must be dimethylated and the peptide must be within 5 p.p.m. of the expected m/z. Next, the peptides must pass PeptideProphet in the TPP with a false discovery rate of less than 1%.
After high confidence peptide and protein identification, the protease substrates are identified by a process we term hierarchical substrate winnowing26. The relative quantification (protease-treated vs control) of each peptide found in the database searches is analyzed using the XPRESS quantification tool included in the TPP. In this process high confidence cleavage sites must have a peptide abundance ratio (heavy/light, protease/control) >3. Such peptides are compiled for each of the three biological samples from the technical replicates and further validated through the TPP to generate a list of substrate candidates identified by single peptides in multiple biological samples. Only high ratio peptides that meet at least one of the following criteria are then selected as high confidence potential protease-generated peptides: identification in more than one biological replicate sample, in multiple charge states, with and without oxidized methionine, or with and without arginine or glutamine deamidation. The rationale is that a peptide must be identified twice: either the same peptide in different samples or in different states in the same sample. It should be noted that a less stringent validation could be used when a protease with known and narrow cleavage specificity is being examined. The double identified, high confidence, high ratio neo-N-terminal peptides are then further examined to select the most biologically relevant candidate substrates for the tested protease. For proteases belonging to a family, candidate substrates can be winnowed from the candidates that are known to be cleaved by other proteases in the family, or are protein family members of a substrate cleaved by the protease.
The identification of naturally blocked protein N-terminal peptides is performed by simply altering the database search parameters (i.e. including peptide N-terminal acetylation or cyclization instead of dimethylation) and by utilizing protein database annotations.
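The "identified twice" rule can be expressed as a small filter over a peptide table. The following is a sketch only, not the TPP implementation: it assumes each identification has already passed the confidence filters and is available as a record with the hypothetical fields `sequence`, `replicate`, `charge`, `modified` and `ratio`; the sequence labels are placeholders.

```python
from collections import defaultdict


def winnow(observations, min_ratio=3.0):
    """Return sequences whose high-ratio identifications occur in >= 2 distinct contexts.

    A context is the combination of biological replicate, charge state and
    modification state, so two observations that differ in any of these count
    as two independent identifications of the same peptide.
    """
    contexts = defaultdict(set)
    for obs in observations:
        if obs["ratio"] > min_ratio:
            contexts[obs["sequence"]].add((obs["replicate"], obs["charge"], obs["modified"]))
    return sorted(seq for seq, seen in contexts.items() if len(seen) >= 2)


# Made-up records for illustration only.
observations = [
    {"sequence": "SEQ_A", "replicate": 1, "charge": 2, "modified": False, "ratio": 8.1},
    {"sequence": "SEQ_A", "replicate": 2, "charge": 2, "modified": False, "ratio": 6.4},
    {"sequence": "SEQ_B", "replicate": 1, "charge": 2, "modified": False, "ratio": 5.0},
    {"sequence": "SEQ_C", "replicate": 1, "charge": 2, "modified": False, "ratio": 1.1},
    {"sequence": "SEQ_C", "replicate": 1, "charge": 3, "modified": False, "ratio": 0.9},
    {"sequence": "SEQ_D", "replicate": 3, "charge": 2, "modified": False, "ratio": 4.2},
    {"sequence": "SEQ_D", "replicate": 3, "charge": 3, "modified": False, "ratio": 3.9},
]
print(winnow(observations))  # ['SEQ_A', 'SEQ_D']
```

Here SEQ_A qualifies through two biological replicates and SEQ_D through two charge states, while SEQ_B is a single observation and SEQ_C does not reach the ratio cutoff.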
We provide below examples based on data originating from a Thermo LTQ-OrbitrapTM instrument (".RAW" file format) analyzed by Mascot33 and X! Tandem34 for uninterpreted database searches. The steps are described briefly and detailed information regarding TPP usage can be found on the "TPP wiki":http://tools.proteomecenter.org/wiki/index.php?title=Main_Page, "tutorial":http://tools.proteomecenter.org/wiki/index.php?title=TPP_Tutorial, "TPP users discussion list":http://groups.google.com/group/spctools-discuss?pli=1, and in the links provided below.
Time taken - hours to days.
Make a directory for each analysis (Mascot protease cleavage, X! Tandem protease cleavage, Mascot N-terminal and so on).
Convert LTQ-Orbitrap RAW data to mzXML format in profile mode (not centroid) using "ReAdW":http://tools.proteomecenter.org/wiki/index.php?title=Software:ReAdW tool in the TPP.
Note: it is possible to convert to mzML format, which is the new standard MS format set by HUPO, but the usage of this format for TAILS data analysis has not been tested yet.
- Place 2 copies of the mzXML file in each directory and name them differently for heavy and light, if required (see the Mascot database search section below for more details).
Note: The TPP requires that the mzXML and all other files generated during the analysis of a data set are kept in the same directory. If possible, symbolic links/shortcuts can be used to avoid storage of multiple copies of the mzXML files.
- For Mascot searches convert the mzXML files to Mascot generic format (.mgf extension) using MzXML2Search tool in the TPP (use defaults).
Mascot database search analysis:
TPP quantitative analysis of "Mascot":http://tools.proteomecenter.org/wiki/index.php?title=TPP:Mascot_and_the_TPP search data for dimethylation requires running 2 separate searches, one for only heavy labeled peptides and one for only light labeled peptides. We recommend the use of decoy sequences in the searched database in order to improve peptide assignment validation in later steps of the analysis35. For more information about decoy sequences, their generation and implementation in the database see "Mascot help":http://www.matrixscience.com/help/decoy_help.html.
Run a Mascot search for only light dimethylated peptides using the mgf file as input against an appropriate database using the following search parameters: Semi-ArgC cleavage specificity; up to 3 missed cleavages; precursor ion mass tolerance 10 ppm; fragment mass tolerance 0.8 Da; fixed modifications: cysteine carbamidomethylation (+57.021464), peptide N-terminal and lysine dimethylation (+28.031300); variable modification: methionine oxidation (+15.994915); scoring scheme ESI-TRAP.
Repeat the search for heavy dimethylated peptides by changing the settings of fixed modification of peptide N-terminal and lysine residue to heavy dimethylation (+34.063117).
Import the search result files (.dat extension) from the Mascot server to the analysis directory and name them according to the input file name (i.e. data1.dat for data1.mgf and so on).
Convert the .dat files to pepXML file using "Mascot2XML":http://tools.proteomecenter.org/wiki/index.php?title=Software:Mascot2XML tool. Do it for both the light and heavy labeled searches.
Merge the heavy and light search results, analyze and validate peptide MS/MS identifications and quantification using the "XInteract and XPRESS":http://tools.proteomecenter.org/wiki/index.php?title=TPP_Tutorial#Peptide_Level_Analysis, "PeptideProphet":http://tools.proteomecenter.org/wiki/index.php?title=TPP_Tutorial#Peptide_Level_Analysis tools of the TPP (respectively).
The output of this step is an interact pepXML file that includes all the peptides and their related information (database search score, PeptideProphet score, relative abundance etc.).
Note: All of these steps can be executed in a single step through the TPP "Petunia":http://tools.proteomecenter.org/wiki/index.php?title=TPP:Using_Petunia interface (GUI).
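The modification masses entered in the searches can be checked against monoisotopic isotope masses. The sketch below assumes the light label adds C2H4 per amine and the heavy label adds 13C2 2H4 per amine (an assumption that reproduces the +34.063117 value used above); the dictionary and function names are illustrative.

```python
# Monoisotopic masses (Da) of the isotopes involved.
MASS = {
    "12C": 12.0,
    "13C": 13.00335484,
    "1H": 1.00782503,
    "2H": 2.01410178,
    "14N": 14.00307401,
    "16O": 15.99491462,
}

# Net elemental change per modified site (compositions are assumptions, see lead-in).
MODIFICATIONS = {
    "dimethyl_light": {"12C": 2, "1H": 4},
    "dimethyl_heavy": {"13C": 2, "2H": 4},
    "carbamidomethyl": {"12C": 2, "1H": 3, "14N": 1, "16O": 1},
    "oxidation": {"16O": 1},
    "acetyl": {"12C": 2, "1H": 2, "16O": 1},
}


def delta_mass(name):
    """Monoisotopic mass shift (Da) of a named modification."""
    return sum(MASS[isotope] * count for isotope, count in MODIFICATIONS[name].items())


def pair_spacing_mz(n_labeled_amines, charge):
    """m/z spacing between the heavy and light forms of the same peptide."""
    per_site = delta_mass("dimethyl_heavy") - delta_mass("dimethyl_light")
    return n_labeled_amines * per_site / charge


for name in MODIFICATIONS:
    print(f"{name}: +{delta_mass(name):.6f}")
# A peptide with a labeled N-terminus and one labeled lysine, observed at charge 2+:
print(f"heavy/light spacing: {pair_spacing_mz(2, 2):.4f} m/z")
```

The spacing function is useful when inspecting extracted ion chromatograms manually, since the expected distance between the light and heavy precursor depends on both the number of labeled amines and the charge state.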
X! Tandem database search analysis:
Quantitative analysis of "X! Tandem":http://tools.proteomecenter.org/wiki/index.php?title=TPP_Tandem_search searches for dimethylated peptides with the TPP does not require use of separate searches for heavy and light labeled peptides, but for simplification we will use the same search approach. In order to combine X! Tandem search results with Mascot results it is important to use the same database for both searches.
- Perform an X! Tandem database search for light-labeled peptides directly from the "Petunia":http://tools.proteomecenter.org/wiki/index.php?title=TPP:Using_Petunia interface of the TPP using the "k-score":http://tools.proteomecenter.org/wiki/index.php?title=TPP:X!Tandem_and_the_TPP option (X! Tandem searches are done with mzXML as input). Use the same parameters as for the Mascot light search: Semi-ArgC cleavage specificity; up to 3 missed cleavages; precursor ion mass tolerance 10 ppm; fragment mass tolerance 0.8 Da; fixed modifications: cysteine carbamidomethylation (+57.021464), peptide N-terminal and lysine dimethylation (+28.031300); variable modification: methionine oxidation (+15.994915).
The X! Tandem search will generate a .xml file with the search results data in the same directory as the mzXML file used for the search.
Note: Running X! Tandem through the TPP is done by using an "input file":http://www.thegpm.org/TANDEM/api/index.html with the required parameters for the search.
Repeat the search for heavy dimethylated peptides by changing the settings of fixed modification of peptide N-terminal and lysine residue to heavy dimethylation (+34.063117).
Convert the .xml files to pepXML using "Tandem2XML":http://tools.proteomecenter.org/wiki/index.php?title=Software:Tandem2XML tool for both the light and heavy labeled searches.
Merge the heavy and light search results, analyze and validate peptide MS/MS identifications and quantification using the "XInteract":http://tools.proteomecenter.org/wiki/index.php?title=TPP_Tutorial#Peptide_Level_Analysis, "PeptideProphet and XPRESS":http://tools.proteomecenter.org/wiki/index.php?title=TPP_Tutorial#Peptide_Level_Analysis tools of the TPP (respectively). The output of this step is an interact pepXML file that includes all the peptides and their related information (database search score, PeptideProphet score, relative abundance and so on).
Note: All of these steps can be executed in a single step through the TPP "Petunia interface":http://tools.proteomecenter.org/wiki/index.php?title=TPP:Using_Petunia (GUI).
Data validation and selection of potential protease generated hits:
Combine the pepXML interact files of Mascot and X! Tandem using the "iProphet":http://tools.proteomecenter.org/wiki/index.php?title=TPP_Demo2009#8._Further_peptide-level_validation_iProphet tool of the TPP. This will generate a pepXML file with a combined list of identified and quantified peptides.
Open the resulting iProphet pepXML file (using pepXML viewer).
Determine PeptideProbability score that corresponds to a false discovery rate of 1% by using the "calculate stats" option under "other options" tab.
Select peptides with a PeptideProbability score above the score found in step 2, corresponding to a false discovery rate of 1%.
Manually verify the quantification data (extracted ion chromatograms from "XPRESS":http://tools.proteomecenter.org/wiki/index.php?title=TPP_Tutorial#XPRESS_Results) of peptides with undefined ratios and heavy and light singletons and correct if required.
Export the final list of peptides to a Microsoft Excel sheet using the "export spreadsheet" option under "other options" tab.
Select peptides with high (>3) and low ratios (<0.33), which are considered as potential substrates of the protease of interest.
Perform these steps separately for all biological repeats analyzed by TAILS.
Compare the high and low peptide lists obtained in steps 7 and 8 and select only the peptides appearing in two biological samples or that were identified by two different tandem mass spectra (different charge states, different labels, oxidized and non-oxidized methionine).
Optional: as extra validation select only high and low peptides with precursor mass error <5 ppm (experimental vs. theoretical).
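The ratio cutoffs and the optional precursor mass error check from the steps above can be applied to the exported spreadsheet with a few lines of code. This is a minimal sketch with illustrative function names and made-up m/z values; the label "unchanged" for peptides between the two cutoffs is our own naming.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed precursor mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def classify_ratio(ratio, high=3.0, low=0.33):
    """Label a heavy/light ratio using the cutoffs given in the protocol."""
    if ratio > high:
        return "high"
    if ratio < low:
        return "low"
    return "unchanged"


def is_candidate(observed_mz, theoretical_mz, ratio, max_ppm=5.0):
    """True for high or low ratio peptides whose precursor error is below max_ppm."""
    within_tolerance = abs(ppm_error(observed_mz, theoretical_mz)) < max_ppm
    return classify_ratio(ratio) != "unchanged" and within_tolerance


# Made-up example: theoretical m/z 650.0000, observed 650.0020, ratio 4.5.
print(f"{ppm_error(650.0020, 650.0000):.2f} ppm")
print(classify_ratio(4.5), is_candidate(650.0020, 650.0000, 4.5))
```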
Analysis of natural N-termini of proteins:
Analysis of natural N-terminal peptides of proteins using TAILS requires only changing the database search parameters. For analysis of acetylated peptides the search parameters listed above should be changed by replacing the fixed modification on peptide N-termini from dimethylation (+28.031300 or +34.063117) to acetylation (+42.010565). The rest of the parameters should remain the same.
It was suggested that for general N-terminome mapping aiming at enrichment and identification of proteins' natural N-terminal peptides, the validation of "single hits" could also rely on the actual position of the identified peptide within the protein sequence13,15. Obtaining peptide positional information is dependent on the quality of the database annotation. To assist in obtaining the positional information of identified peptides we recommend the use of a documented Perl script developed in our lab that can be found at "www.clip.ubc.ca/resources/index.html":http://www.clip.ubc.ca/resources/index.html, under CLIP-PICS36.