A Combinatorial Perspective of the Protein Inference Problem


Source code:

The source code of ProteinInfer can be downloaded at: proteininfer.zip.

The supplementary document is available at: supplementary.pdf.

The protein probabilities obtained by our method and ProteinProphet from four datasets are available at: probs.zip.



(1) Prerequisite: Our program is implemented in Java, and the source code is provided as an Eclipse project. The latest versions of Java and Eclipse are available at the following links:

Java (JDK):  http://www.oracle.com/technetwork/java/javase/downloads/index.html

Eclipse: http://www.eclipse.org/downloads/

TPP: We use X!Tandem, PeptideProphet, iProphet and ProteinProphet, as embedded in TPP, to infer proteins in our experiments. The version we use is v4.5, and the homepage of TPP is: http://tools.proteomecenter.org/wiki/index.php?title=Software:TPP


(2) Usage:

The basic command is:

    java ProteinInfer.java --option value

Available options are as follows, with "(*)" marking compulsory parameters:

     --iprophet      String

    (*) The full path of the iProphet result file.

     --output_dir    String

    (*) The directory for saving results.

     --lambda_1      Integer

    The expected number of unique peptides of true proteins. When this value is not provided, an empirical value is estimated from data.

     --lambda_2       Integer

    The expected number of unique peptides of false proteins. When this value is not provided, the default value is 1.

     --print_subset   true | false

    Whether to write subset proteins to a separate file. Subset proteins are output by default.
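
The option list above can be read as a set of "--option value" pairs, two of which are compulsory. The following is only a minimal sketch of how such pairs might be collected and checked in Java. The class OptionSketch and its parse method are hypothetical names introduced for illustration and are not taken from the ProteinInfer source. The sketch assumes that every option is followed by exactly one value, and that the defaults are those stated above (lambda_2 is 1, subset proteins are printed, lambda_1 is left unset so that it can be estimated from data).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the ProteinInfer source.
class OptionSketch {

    // Collects "--name value" pairs into a map keyed by the option name
    // without its leading dashes.
    static Map<String, String> parse(String[] args) {
        Map<String, String> options = new HashMap<>();
        // Defaults as documented: lambda_2 is 1, subset proteins are printed.
        // lambda_1 is deliberately left unset (estimated from data when absent).
        options.put("lambda_2", "1");
        options.put("print_subset", "true");

        for (int i = 0; i < args.length; i += 2) {
            // Assumption: each option name is followed by exactly one value.
            if (!args[i].startsWith("--") || i + 1 >= args.length) {
                throw new IllegalArgumentException(
                        "Expected --option value, got: " + args[i]);
            }
            options.put(args[i].substring(2), args[i + 1]);
        }

        // The two options marked "(*)" must be present.
        for (String required : new String[] {"iprophet", "output_dir"}) {
            if (!options.containsKey(required)) {
                throw new IllegalArgumentException(
                        "Missing compulsory option --" + required);
            }
        }
        return options;
    }

    public static void main(String[] args) {
        // Prints the options after defaults and checks have been applied.
        System.out.println(parse(args));
    }
}
```

In this sketch, a call that omits --iprophet or --output_dir is rejected with an IllegalArgumentException, while omitting the other options simply leaves the stated defaults in place.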


(3) Example:

   java ProteinInfer.java --iprophet e:\data\18mix\interact.iproph.pep.xml --output_dir e:\data\18mix