Manuelito Documentation


Abstract

Manuelito allows a rapid interpretation of complex posttranslational modification patterns based on the peptide mass fingerprints of proteins. It can be applied to proteins that contain natural PTMs as well as modifications introduced after isolation of the polypeptides.

The software package comprises a standalone Java 2 application that runs on any Java-enabled platform and a Java web archive to be run on a web server.

Posttranslational modifications (PTMs) are important modifiers of the physiological functions and behaviours of proteins, such as localization, intermolecular interactions, and enzymatic activity. A prime example of the modulation of enzyme activity is the phosphorylation of enzymes of metabolic pathways. After translation of the target protein, highly specific enzymes covalently link activated molecules to specific amino acid residues. In the case of phosphorylation, ATP is the activated phosphate donor and kinases are the enzymes that attach the phosphates to the proteins. Amino acids with free hydroxyl groups, i.e. serines, threonines, and tyrosines, are the usual targets for phosphorylation, with serines being phosphorylated most frequently. The covalent attachment of a chemical group is not a dead end: there are also specific enzymes that can remove the modifications (e.g. phosphatases, which remove phosphates). Such a removal will, of course, reset the physiological behaviour of the protein.

To summarize, the PTM framework comprises target proteins, activated substrates that provide the chemical group to be added, specific enzymes that transfer specific groups to specific residues of specific proteins, and specific enzymes that can remove these groups again. PTMs provide a means to reversibly change the behaviour of proteins in a highly selective manner, and cells exploit this feature extensively.

Many proteins contain more than one residue that is a potential target for a PTM. In fact, many proteins appear to be multiply modified. Multiple modifications need not simply potentiate the effect of a single modification: the simultaneous placement of different modifications might provide an additional level of coding for a particular physiological function. Examples are nuclear factors, which are potential acceptors of, e.g., phosphate, acetyl, and methyl groups. In the case of histone proteins, a clear coding potential of these modifications has been postulated: it is believed that combinations of certain modifications can be translated into a certain gene expression level.

Specific modifications of proteins are usually detected by protein- and modification-specific antibodies. This approach relies on raising antibodies against known modifications at known target sites. It has two major drawbacks: first, antibodies can only be raised against known targets, and second, the binding of an antibody can be compromised by an adjacent modification. An unbiased detection therefore requires a different technology.

PTMs add mass to proteins and peptides. Since the masses of the modifications and of the native peptides are known, MALDI-MS can indicate which PTMs a peptide carries. However, positional information usually cannot be obtained, i.e. when there are more target sites than attached modifications, it cannot be determined which residue carries the modification.
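The mass arithmetic described here can be sketched in a few lines of Python. This is only an illustration and not Manuelito's code: the function names and the example peptide are made up, the masses are standard monoisotopic values rounded to five decimals, and neutral masses are used for simplicity (MALDI peaks are typically observed as singly protonated ions).

```python
# Monoisotopic residue masses (Da) for the amino acids used in this example.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "T": 101.04768,
    "K": 128.09496, "R": 156.10111,
}
WATER = 18.01056  # added once per peptide (free N- and C-terminus)

# Monoisotopic mass shifts (Da) of three common PTMs.
PTM_SHIFT = {"phospho": 79.96633, "acetyl": 42.01057, "methyl": 14.01565}

def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER

def modified_mass(sequence, ptm_counts):
    """Mass of the peptide carrying the given number of each PTM.

    Only the counts enter the calculation, not the positions.
    """
    shift = sum(PTM_SHIFT[name] * n for name, n in ptm_counts.items())
    return peptide_mass(sequence) + shift

base = peptide_mass("KSTGGK")                          # made-up example peptide
one_acetyl = modified_mass("KSTGGK", {"acetyl": 1})
print(round(one_acetyl - base, 5))
```

The peptide contains two lysines, but a single acetyl group gives the same mass whichever lysine carries it. That is why the mass alone reveals the type and number of modifications but not their position.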

MALDI-MS thus cannot uncover positional information, but it can give first hints about the type and number of modifications present. In mathematical terms, it decodes combinations with repetition, not permutations.
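To make the distinction concrete, the following sketch counts both quantities for a hypothetical peptide with three target residues, each of which can be unmodified, acetylated, or methylated. The states and the number of sites are assumptions chosen for illustration.

```python
from itertools import combinations_with_replacement, product
from math import comb

# Hypothetical per-residue states and number of target sites.
states = ["none", "acetyl", "methyl"]
n_sites = 3

# What a mass can distinguish: how many residues are in each state.
compositions = list(combinations_with_replacement(states, n_sites))

# What it cannot distinguish: which residue is in which state.
arrangements = list(product(states, repeat=n_sites))

print(len(compositions), len(arrangements))  # 10 27

# Closed form for combinations with repetition: C(k + n - 1, n).
assert len(compositions) == comb(len(states) + n_sites - 1, n_sites)
```

Here a mass can separate at most 10 modification compositions, whereas 27 positional arrangements exist. In practice, compositions with nearly identical total mass may not be resolvable either.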

For positional information, MS/MS comes into play. By fragmenting the modified peptide, mass shifts of single amino acids can be visualized. This leads to the identification of the modification's target residue. Unfortunately, MS/MS procedures are tedious to perform and precise information is not always obtained.

The biological question behind the analysis of modification patterns is whether specific modification states correlate with specific physiological states. A strategy to efficiently obtain a comprehensive picture in various situations requires rapid and systematic analyses of different specimens. A screening procedure that limits the workload on bottleneck applications is therefore desirable. Even though MALDI-MS has the aforementioned limitations, it can serve a useful function even in decrypting complex PTM patterns: it can be used to screen for changes in masses that could be indicative of changes in modification states. The peptides giving rise to these changes can then be sequenced by MS/MS, the bottleneck application, and the precise position of the modifications can be mapped.

Such a strategy requires that the spectrum peaks (the masses) are properly assigned to peptides and their modification states. Interpreting spectra is a job for computers, and it is up to the software to make the proper assignments.

Manuelito's task is to consider all modification states of peptides when trying to match peaks. In addition, Manuelito can handle modification events that selectively modify residues that are already modified.

The rationale behind this feature is a particular way of processing the peptides for analysis. Several procedures require the peptides to be chemically derivatized. This derivatization is basically a modification reaction that is applied to the peptide before MS. Like the naturally occurring modifications, the synthetic ones added by the researcher have a residue specificity and a specific mass. One reason for such a treatment is, for example, a quantification experiment in which two different samples are analyzed at the same time. This procedure is based on two different processing procedures for the two samples, i.e. two different chemical derivatization reactions. Another example is the analysis of histone proteins. These proteins are highly charged and therefore difficult to 'shoot' efficiently in the MS instrument. With chemical compounds that specifically target highly charged residues (i.e. lysines), the peptide's charge can be neutralized.

Two important features characterize chemical modification reactions: a) the reactions can be considered obligatory, i.e. they happen at almost 100% efficiency on each target residue; b) the attachment of a chemical group might, however, be excluded by an already existing natural modification.
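The two features can be expressed as a simple per-residue rule. The sketch below is an assumption-laden illustration, not Manuelito's rule set: the reagent is assumed to target lysine, and the set of natural modifications that block it is invented for the example.

```python
# Assumption for illustration: the chemical reagent targets lysine (K), and
# these natural states of a lysine exclude it. Not Manuelito's actual rules.
CHEMICAL_TARGETS = "K"
BLOCKED_BY = {"acetyl", "trimethyl"}

def derivatize(residues):
    """Decide for each residue whether it carries the chemical group.

    residues: list of (amino_acid, natural_modification_or_None) tuples.
    Returns a list of booleans, one per residue.
    """
    tagged = []
    for amino_acid, natural in residues:
        is_target = amino_acid in CHEMICAL_TARGETS
        is_blocked = natural in BLOCKED_BY
        # Feature a): every eligible target reacts.
        # Feature b): a blocking natural modification prevents the reaction.
        tagged.append(is_target and not is_blocked)
    return tagged

peptide = [("K", None), ("S", "phospho"), ("K", "acetyl"), ("K", "methyl")]
print(derivatize(peptide))  # [True, False, False, True]
```

Because the reaction is treated as obligatory, the chemical modification adds no further branching: once the natural modification state of a peptide is fixed, its chemically modified form follows deterministically.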

Manuelito is the only software that can consider these modification-specific exclusivities and is therefore the only software that can be used to completely interpret MALDI-MS spectra of peptides that have been subjected to chemicals that exhibit such behaviour.

In summary, Manuelito will generate protease cleavage fragments of a given set of proteins. For each peptide generated it will iterate through all possible modification states to try to match it to one of the peaks that have been passed to the application.
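The first step, generating cleavage fragments, might look roughly like the sketch below. It assumes a trypsin-like rule (cleavage C-terminal to K or R, with no further exceptions) and a hypothetical `max_missed` parameter; the proteases and options actually offered by Manuelito may differ.

```python
def digest(protein, cleave_after="KR", max_missed=1):
    """Return (peptide, missed_cleavages) pairs for a protein sequence.

    Assumes the protease cuts after every residue listed in cleave_after.
    """
    # Cut positions: start of the protein, after each cleavage residue
    # (except a C-terminal one, which would yield an empty peptide), and the end.
    cuts = [0]
    cuts += [i + 1 for i, aa in enumerate(protein[:-1]) if aa in cleave_after]
    cuts.append(len(protein))

    peptides = []
    for i in range(len(cuts) - 1):
        for missed in range(max_missed + 1):
            j = i + 1 + missed
            if j >= len(cuts):
                break
            peptides.append((protein[cuts[i]:cuts[j]], missed))
    return peptides

# Made-up sequence for illustration.
for peptide, missed in digest("AKGRSTK"):
    print(peptide, missed)
```

For the made-up sequence `AKGRSTK`, this yields `AK`, `GR`, and `STK` without missed cleavages, plus `AKGR` and `GRSTK` with one missed cleavage each.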

  1. One or more proteins are the source of the peptides analyzed by MALDI-MS.

  2. These proteins were the target of various modification events (outside the lab, i.e. in nature). These modifications are considered variable.

  3. Optionally, the proteins are subjected, now or after the next step, to one further modification cycle. This in-lab modification is referred to as chemical modification. Chemical modifications are considered to happen at 100% efficiency on unmodified target residues and, according to defined rules, on naturally modified ones as well.

  4. Proteins are subjected to protease cleavage, which is assumed not to be affected by any modification (mainly because rules for such effects are lacking).

  5. The cleavage fragments are analyzed by mass spectrometry and give rise to peaks in the MS spectrum.

  6. The MS operator screens the peaks for obvious contaminants (keratins etc.).

  7. All masses are considered to be monoisotopic masses.

  8. Manuelito matches the peak masses to theoretical peptides. As Manuelito considers virtually all modification scenarios per peptide, the efficiency, i.e. speed, of operation depends on the number of variable ('natural') modifications the user selects and on the number of target residues within the cleavage fragments. The peptides should therefore be fairly small, i.e. the user should limit the mass range of the peak masses supplied. We suggest a range from 400 to 2500 Da. Larger peptides are almost impossible to analyze further anyway (by MS/MS, for example).

  9. Based on the number and kind of modifications that are present on a matched peptide, as well as the number of missed cleavages that generated the peptide, Manuelito will assign a quite arbitrary penalty score to the match.
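Steps 8 and 9 can be sketched as follows. Everything specific in this example is an assumption made for illustration: the two variable modifications, the mass tolerance, the use of neutral masses, and in particular the penalty formula (one point per modification and per missed cleavage), which is not Manuelito's actual score.

```python
from collections import Counter
from itertools import product

RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203,
                "T": 101.04768, "K": 128.09496, "R": 156.10111}
WATER = 18.01056
# Hypothetical selection of variable modifications: name -> (targets, shift).
VARIABLE_MODS = {"acetyl": ("K", 42.01057), "phospho": ("ST", 79.96633)}

def modification_states(sequence):
    """Return the distinct modification compositions (positions ignored)."""
    site_options = []
    for aa in sequence:
        mods = [m for m, (targets, _) in VARIABLE_MODS.items() if aa in targets]
        if mods:
            site_options.append([None] + mods)
    seen = set()
    for assignment in product(*site_options):
        counts = Counter(m for m in assignment if m)
        seen.add(tuple(sorted(counts.items())))
    return [dict(state) for state in sorted(seen)]

def match(peaks, peptides, tolerance=0.05):
    """Match peak masses against all modification states of each peptide.

    peptides: list of (sequence, missed_cleavages) pairs.
    Returns (penalty, peak, sequence, state) tuples, lowest penalty first.
    """
    hits = []
    for sequence, missed in peptides:
        base = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
        for state in modification_states(sequence):
            mass = base + sum(VARIABLE_MODS[m][1] * n for m, n in state.items())
            for peak in peaks:
                if abs(peak - mass) <= tolerance:
                    # Assumed penalty: modifications plus missed cleavages.
                    penalty = sum(state.values()) + missed
                    hits.append((penalty, peak, sequence, state))
    return sorted(hits, key=lambda hit: (hit[0], hit[1], hit[2]))

hits = match([245.14, 526.23], [("GK", 0), ("GKSR", 1)])
for hit in hits:
    print(hit)
```

The number of states grows quickly with the number of target residues and selected modifications, which is the reason for limiting the mass range as suggested in step 8.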

Note in advance: You can always store the current settings and the peptides in the result table in a Manuelito project file that can be re-opened at a later time (File->Save Project as...).

You will be presented with one main window. There won't be many more. On the left side of the window you will find 3 tabs. The main parameter input is done in the 'Settings' tab. Here the parameters for both the search and the list output can be set.

Proteins can be entered either manually or loaded as single or multiple proteins from a FASTA formatted file. The names of the proteins will appear in the list and can be selectively removed by selecting them and pressing the delete button.
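For readers unfamiliar with the FASTA format: each protein starts with a header line beginning with `>`, followed by one or more lines of sequence. A minimal reader, written here as an illustration and not taken from Manuelito, could look like this (the protein names and sequences are made up):

```python
def read_fasta(lines):
    """Parse FASTA-formatted lines into a {name: sequence} dictionary."""
    proteins = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the protein name.
            parts = line[1:].split()
            name = parts[0] if parts else f"protein_{len(proteins) + 1}"
            proteins[name] = ""
        elif name is not None:
            # Sequence lines may be wrapped; join them.
            proteins[name] += line
    return proteins

example = """>protein_A made-up example
AKGRSTK
GGSK
>protein_B
STKAAR
"""
proteins = read_fasta(example.splitlines())
print(proteins)
```

A file with several `>` headers corresponds to the multiple-protein case mentioned above.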

The proteases and modifications that can be chosen are very limited at present. This is due to our specific application. In case you want to add more options, have a look at the developer's guide or send me an email (tobias.straub_at_lmu.de).

A search can be performed after providing a list of peak masses. The easiest way to get the masses into Manuelito is to copy them from your preferred MS application (or MS Excel) and paste them into Manuelito (using either the paste button or a right-click in the list panel). Note: Manuelito interprets localized number formats, so make sure that the application you copy from (e.g. Excel) uses the same number format as your operating system.
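The reason the number format matters is that the same digits can mean different masses under different conventions. The sketch below is a simplified illustration, not Manuelito's parser; `parse_mass` and its `decimal_sep` parameter are hypothetical.

```python
def parse_mass(text, decimal_sep="."):
    """Parse one mass, given the decimal separator of the source format."""
    group_sep = "," if decimal_sep == "." else "."
    cleaned = text.strip().replace(group_sep, "").replace(decimal_sep, ".")
    return float(cleaned)

def parse_peak_list(pasted, decimal_sep="."):
    """Parse a whitespace- or newline-separated block of pasted masses."""
    return [parse_mass(token, decimal_sep) for token in pasted.split()]

pasted = "1045,53\n2211,10"          # copied from a decimal-comma application

right = parse_peak_list(pasted, decimal_sep=",")
wrong = parse_peak_list(pasted, decimal_sep=".")
print(right)  # [1045.53, 2211.1]
print(wrong)  # [104553.0, 221110.0]
```

With mismatched settings the masses are silently misread rather than rejected, so the search simply finds no sensible matches.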

After the search button is pressed, a progress bar indicates the progress of the search. Matching peptides appear in the table on the right. The results can be sorted by clicking on the respective column header (each click reverses the current sort order). Furthermore, the results can be saved to a tab-delimited file (File->Export Results...) that can be opened with Excel. You can also print the result table using File->Print Results...

In order to overlap peptides from two different processing procedures, single results have to be placed into an overlap set. This is done by calling Overlap->Edit Overlap.... You can then enter the current result peptides into one of the two slots available.

As soon as two result sets have been placed into the two overlap slots, the overlap can be displayed (Overlap->Show Overlap...). The overlapping peptides can then be saved to a tab-delimited list.
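Conceptually, the overlap is an intersection of two result sets. The sketch below assumes that two results count as the same peptide when their sequence and natural modification composition agree, regardless of the chemical derivatization used. This criterion and the field names are assumptions for illustration, since the exact rule Manuelito applies is not described here.

```python
def result_key(result):
    """Identity of a result: sequence plus natural modification counts."""
    return (result["peptide"], tuple(sorted(result["natural_mods"].items())))

def overlap(results_a, results_b):
    """Return the results of slot A that also occur in slot B."""
    keys_b = {result_key(r) for r in results_b}
    return [r for r in results_a if result_key(r) in keys_b]

# Made-up result sets from two hypothetical processing procedures.
slot_a = [
    {"peptide": "GKSR", "natural_mods": {"acetyl": 1}, "mass": 488.27},
    {"peptide": "STK", "natural_mods": {}, "mass": 334.19},
]
slot_b = [
    {"peptide": "GKSR", "natural_mods": {"acetyl": 1}, "mass": 491.29},
    {"peptide": "AK", "natural_mods": {}, "mass": 217.14},
]

shared = overlap(slot_a, slot_b)
print([r["peptide"] for r in shared])  # ['GKSR']
```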

It is furthermore possible to have Manuelito compute all possible peptides and their modification states that can be derived from a given set of proteins. In the 'List' tab, a mass range for the theoretical peptides can be entered. After pressing the 'list' button, all possibilities will be listed. They can be saved and exported in the same way as the search results.