pnnl-comp-mass-spec / informed-proteomics
Top down / bottom up, MS/MS analysis tool for DDA and DIA mass spectrometry data
The histone H3.1 data set I used contains a total of 3460 spectra, and the database human_proteome_database.fasta contains 20410 entries.
When I did not use a modification file, the search output 1310 PrSMs. The parameters were as follows:
SpecFile 2DLC_H3_1.pbf
DatabaseFile human_proteome_database.fasta
FeatureFile 2DLC_H3_1.ms1ft
InternalCleavageMode SingleInternalCleavage
Tag-based search True
Tda Target+Decoy
PrecursorIonTolerancePpm 10
ProductIonTolerancePpm 10
MinSequenceLength 21
MaxSequenceLength 300
MinPrecursorIonCharge 2
MaxPrecursorIonCharge 30
MinProductIonCharge 1
MaxProductIonCharge 20
MinSequenceMass 3000
MaxSequenceMass 50000
ActivationMethod Unknown
MaxDynamicModificationsPerSequence 0
When I use the modification file, only 59 PrSMs are output. The parameters were as follows:
SpecFile 2DLC_H3_1.pbf
DatabaseFile human_proteome_database.fasta
FeatureFile 2DLC_H3_1.ms1ft
InternalCleavageMode SingleInternalCleavage
Tag-based search True
Tda Target+Decoy
PrecursorIonTolerancePpm 10
ProductIonTolerancePpm 10
MinSequenceLength 21
MaxSequenceLength 500
MinPrecursorIonCharge 2
MaxPrecursorIonCharge 50
MinProductIonCharge 1
MaxProductIonCharge 20
MinSequenceMass 3000
MaxSequenceMass 50000
ActivationMethod Unknown
MaxDynamicModificationsPerSequence 4
Modification C(2) H(2) N(0) O(1) S(0),R,opt,Everywhere,Acetyl
Modification C(2) H(2) N(0) O(1) S(0),K,opt,Everywhere,Acetyl
Modification C(1) H(2) N(0) O(0) S(0),R,opt,Everywhere,Methyl
Modification C(1) H(2) N(0) O(0) S(0),K,opt,Everywhere,Methyl
Modification C(2) H(4) N(0) O(0) S(0),R,opt,Everywhere,Dimethyl
Modification C(2) H(4) N(0) O(0) S(0),K,opt,Everywhere,Dimethyl
Modification C(3) H(6) N(0) O(0) S(0),R,opt,Everywhere,Trimethyl
Modification C(0) H(1) N(0) O(3) S(0) P(1),S,opt,Everywhere,Phospho
Modification C(0) H(1) N(0) O(3) S(0) P(1),T,opt,Everywhere,Phospho
Modification C(0) H(1) N(0) O(3) S(0) P(1),Y,opt,Everywhere,Phospho
The modification file I am using is as follows:
NumMods=4
C2H2O1,RK,opt,any,Acetyl # Acetylation RK
CH2,RK,opt,any,Methyl # Methylation RK
C2H4,RK,opt,any,Dimethyl
C3H6,R,opt,any,Trimethyl
HO3P,STY,opt,any,Phospho # Phosphorylation STY
Is there a problem with my parameter settings that leads to this result?
Is it normal that only 59 PrSMs are identified for this input?
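To make the comparison between the two runs concrete, here is a minimal sketch that diffs two parameter dumps of the kind pasted above. Only the lines relevant to the comparison are included, and the helper name `parse_params` is hypothetical, not part of Informed-Proteomics. It shows that the second run differs in more than the modification settings.

```python
def parse_params(text):
    """Parse 'Name Value' lines into a dict; the value is the last token."""
    params = {}
    for line in text.strip().splitlines():
        # rpartition keeps keys that contain spaces, e.g. "Tag-based search".
        key, _, value = line.strip().rpartition(" ")
        params[key] = value
    return params


run_without_mods = """
Tag-based search True
MaxSequenceLength 300
MaxPrecursorIonCharge 30
MaxProductIonCharge 20
MaxDynamicModificationsPerSequence 0
"""

run_with_mods = """
Tag-based search True
MaxSequenceLength 500
MaxPrecursorIonCharge 50
MaxProductIonCharge 20
MaxDynamicModificationsPerSequence 4
"""

before = parse_params(run_without_mods)
after = parse_params(run_with_mods)

# Collect every parameter whose value changed between the two runs.
changed = {k: (before[k], after.get(k)) for k in before if before[k] != after.get(k)}

for name, (old, new) in changed.items():
    print(f"{name}: {old} -> {new}")
```

Besides MaxDynamicModificationsPerSequence, the maximum sequence length and the maximum precursor charge were also raised, so the two searches are not identical apart from the modifications.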
The wiki documentation for running the tutorial has wrong file names for tutorial files, or wrong tutorial files.
The version I am using is Version 1.0.6619 and Version 1.0.7017.
ProMex generates the RP4H_P32_WHIM2_biorep1_techrep3.ms1ft file for MSPathFinderT input, but the file is only 1 KB, which may affect the identification performance of MSPathFinder.
What is the reason for this?
I have noticed that the reported proteoform masses are always ~9 Da less than the mass calculated from the reported m/z and z.
To my knowledge, since I am using nESI as the ion source, the mass should be calculated as (m/z * z - z).
In the attached example,
a precursor ion at m/z 661 with charge 23+ should have a mass of 15180 Da.
Yet the reported proteoform mass was 15171.5.
The unmodified mass of my protein is 15143 Da. Having K9Me1 and K27Me1, as MSPathFinder reported, makes it 15171 Da.
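For reference, the arithmetic above can be written out as a small sketch. It uses the proton mass rather than the rounded value of 1 Da per charge; the function name `neutral_mass` is hypothetical and the m/z value is the rounded figure quoted in this post.

```python
PROTON_MASS = 1.007276466  # Da, mass of a proton


def neutral_mass(mz, charge):
    """Neutral mass of a positive ion observed at `mz` with `charge` protons."""
    return (mz - PROTON_MASS) * charge


observed = neutral_mass(661.0, 23)   # mass implied by the observed peak
reported = 15171.5                   # proteoform mass quoted in the post

print(f"mass from m/z and z: {observed:.2f} Da")
print(f"difference to reported mass: {observed - reported:.2f} Da")
```

One possible explanation, offered as an assumption rather than a confirmed answer: if the reported mass is monoisotopic while the m/z read from the spectrum belongs to one of the most abundant isotope peaks, a gap of several Da is expected for a protein of about 15 kDa.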
My search parameters are as follows:
'''
SpecFile W05C48_H3_UVPD.raw
DatabaseFile SOYBN_H3.fasta
FeatureFile W05C48_H3_UVPD.ms1ft
InternalCleavageMode NoInternalCleavage
Tag-based search False
Tda Target
PrecursorIonTolerancePpm 10
ProductIonTolerancePpm 10
MinSequenceLength 21
MaxSequenceLength 500
MinPrecursorIonCharge 2
MaxPrecursorIonCharge 50
MinProductIonCharge 1
MaxProductIonCharge 20
MinSequenceMass 3000
MaxSequenceMass 50000
ActivationMethod Unknown
MaxDynamicModificationsPerSequence 5
Modification C(0) H(0) N(0) O(1) S(0),M,opt,Everywhere,Oxidation
Modification C(0) H(0) N(0) O(1) S(0),Y,opt,Everywhere,Oxidation
Modification C(0) H(0) N(0) O(1) S(0),C,opt,Everywhere,Oxidation
Modification C(0) H(0) N(0) O(1) S(0),K,opt,Everywhere,Oxidation
Modification C(0) H(1) N(0) O(3) S(0) P(1),S,opt,Everywhere,Phospho
Modification C(0) H(1) N(0) O(3) S(0) P(1),T,opt,Everywhere,Phospho
Modification C(0) H(1) N(0) O(3) S(0) P(1),Y,opt,Everywhere,Phospho
Modification C(2) H(2) N(0) O(1) S(0),K,opt,Everywhere,Acetyl
Modification C(1) H(2) N(0) O(0) S(0),K,opt,Everywhere,Methyl
Modification C(2) H(4) N(0) O(0) S(0),K,opt,Everywhere,Dimethyl
Modification C(3) H(6) N(0) O(0) S(0),K,opt,Everywhere,Trimethyl
'''
Which activation type is most suitable for EThcD fragmentation, which generates both c/z- and b/y-type product ions? Thank you very much.
When processing a set of files by specifying a directory to -s, it would be helpful if the process did not halt when a single raw file fails to run, e.g. due to the target or decoy file being empty. If the current behaviour is preferred, then an additional -skipOnFail argument could be added so that MSPathFinder continues to the next raw file.
Hi, I have a question about the feature mass determination in ProMex.
According to my understanding of the Informed-Proteomics publication,
(Park, J., Piehowski, P. D., Wilkins, C., Zhou, M., Mendoza, J., Fujimoto, G. M., Gibbons, B. C., Shaw, J. B., Shen, Y., Shukla, A. K., Moore, R. J., Liu, T., Petyuk, V. A., Tolić, N., Paša-Tolić, L., Smith, R. D., Payne, S. H., & Kim, S. (2017). Informed-Proteomics: Open-source software package for top-down proteomics. Nature Methods, 14(9), 909–914. https://doi.org/10.1038/nmeth.4388)
ProMex obtains features by clustering isotopic envelopes across different charge states and LC elution times for each monoisotopic mass within the specified mass range. So, as I understand it, ProMex iterates over each possible monoisotopic mass within the mass range and carries out the clustering to obtain feature information.
My question is: how fine-grained is ProMex's list of possible monoisotopic masses? In my data, I noticed that ProMex can group near-isobaric proteoforms, such as trimethylation vs. acetylation, into the same feature, although they have slightly different monoisotopic masses.
I noticed that ProMex divides the mass range into bins during analysis, and the publication says that a tolerance can be given to ProMex. So my guess is that ProMex divides the mass range into bins according to the tolerance, and then generates a theoretical isotopic envelope for matching, using the averagine model at the bin mass. Therefore, near-isobaric proteoforms have monoisotopic masses that fall into the same bin and are grouped into the same feature if they co-elute.
Is my understanding correct?
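To put a number on "near isobaric", the sketch below computes the monoisotopic mass difference between trimethylation (C3H6) and acetylation (C2H2O) and expresses it in ppm of an intact mass. The 15 kDa proteoform mass and the 10 ppm tolerance are illustrative assumptions; how ProMex actually bins masses is not described here.

```python
# Monoisotopic atomic masses in Da.
MONO = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461956}


def delta_mass(composition):
    """Monoisotopic mass of an elemental composition given as {element: count}."""
    return sum(MONO[element] * count for element, count in composition.items())


trimethyl = delta_mass({"C": 3, "H": 6})
acetyl = delta_mass({"C": 2, "H": 2, "O": 1})
diff_da = trimethyl - acetyl

proteoform_mass = 15000.0   # assumed intact mass, Da
tolerance_ppm = 10.0        # assumed mass tolerance

diff_ppm = diff_da / proteoform_mass * 1e6

print(f"trimethyl - acetyl = {diff_da:.5f} Da")
print(f"= {diff_ppm:.2f} ppm at {proteoform_mass:.0f} Da")
print("within tolerance" if diff_ppm <= tolerance_ppm else "outside tolerance")
```

Under these assumptions the two modifications differ by only a few ppm at the intact-protein level, so any grouping step that works at a coarser tolerance would not separate them.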
Hello,
I'm trying to run PbfGen on this mzML file, and getting the error "An item with the same key has already been added". Could you help me understand why and how I might fix this?
https://www.dropbox.com/s/9y1of85g5ju9252/091817_mix3_dda_20min.mzML?dl=0
I created the mzML file using MSConvert from Waters RAW format.
Thanks,
Gabriel
How can I use the Agilent .d folder format in PbfGen? I have installed ProteoWizard and used msconvertGUI to convert the .d folders into .mzML files so that I can use them in PbfGen. However, when I use the resulting .pbf file in ProMex, it reports an "Index out of range" exception. How can I fix it?
Lactosylation / Hex(2) is treated as an unknown modification:
http://www.unimod.org/modifications_view.php?editid1=512
C12H20O10,RK,opt,any,Hex(2) # Lactostylation
Current output:
<SearchModification fixedMod="false" massDelta="324.105652" residues="R"> <cvParam cvRef="MS" accession="MS:1001460" name="unknown modification" value="Hex(2)" /> </SearchModification>
Expected output:
<SearchModification fixedMod="false" massDelta="324.105652" residues="R"> <cvParam cvRef="UNIMOD" accession="UNIMOD:512" name="Hex(2)" /> </SearchModification>
Is it possible to write the MS/MS sequence tag results to a file before running the scoring algorithms? I'm seeing many scans with sequence tags but end with 0 matches after the final scoring step. It would be useful to obtain the sequence tag information for MS2 scans, so one could determine why these are not significant matches (poor MS1 envelope similarity, low # of matching fragments, etc).
Input:
NumMods=28
HO3P,STY,opt,any,Phospho # Phosphorylation
O1,M,opt,any,Oxidation # Oxidation M
C12H20O10,RK,opt,any,Hex(2) # Lactostylation
C2H2O,K,opt,Prot-N-term,Acetyl # Acetylation Protein N-term (C2H2O can be replaced with "H(2) C(2) O")
On execute:
Exception parsing the file for parameter -mod: The given key was not present in the dictionary.
Exception while processing: The given key was not present in the dictionary.
at System.ThrowHelper.ThrowKeyNotFoundException()
at System.Collections.Generic.Dictionary`2.get_Item(TKey key)
at InformedProteomics.Backend.Data.Sequence.ModificationParams.GenerateModCombMap()
at InformedProteomics.Backend.Data.Sequence.AminoAcidSet..ctor(IEnumerable`1 searchModifications, Int32 maxNumModsPerSequence)
at MSPathFinderT.TopDownInputParameters.LoadModsFile(String modFilePath)
at MSPathFinderT.TopDownInputParameters.Parse(Dictionary`2 parameters)
at MSPathFinderT.Program.Main(String[] args)
The algorithm may not be optimized for bottom-up data.
What problems or limitations might arise if ProMex is used to extract features from bottom-up data?
Hi all,
I have acquired profile data (MS1 and MS2) on a Thermo instrument.
I have now tested the following two MSPathFinder pipelines:
Both workflows run successfully and give a similar number of identifications, but the overlap of the identified proteoforms is only 45% (comparing sequences).
I am very unsure which results I can trust.
Looking forward to your feedback!
Cheers,
Konrad
MSPathFinderT currently does not create an .mzid file if -tda 0 is used. Can this be altered so that an .mzid file is always produced?
Hi, all.
I have reproduced a problem when I run the 1.1.8305 release on my test file.
My script for execution looks like this:
"C:\Program Files\Informed-Proteomics-1.1.8305\ProMex.exe" -i 20151116_F_MaD_Ecolik12pool_ACNplate_ET5hcD10_01bis.raw
"C:\Program Files\Informed-Proteomics-1.1.8305\MSPathFinderT.exe" -i 20151116_F_MaD_Ecolik12pool_ACNplate_ET5hcD10_01bis.raw -d uniprot-proteome_UP000000625.fasta -ic 0 -mod IP-PTMs.txt -tda 0
(This RAW file isn't public, but any of the experiments from PXD019247 would probably be a fine replacement for it.)
My IP-PTMs.txt file looks like this:
NumMods=3
# Static
# C2H3NO,C,opt,any,Carbamidomethyl
# Variable
O1, M, opt, any, Oxidation
H-2, C, opt, any, Disulfide
C2H2O1,*, opt, Prot-N-term, Acetyl
The error output looks like the following:
Calculating spectral E-values for target-spectrum matches
Estimated matched sequences: 673
Processing, 0 proteins done, 0.0% complete, 1.4 sec elapsed
Processing, 54 proteins done, 8.0% complete, 17.2 sec elapsed
Processing, 104 proteins done, 15.5% complete, 32.9 sec elapsed
Processing, 152 proteins done, 22.6% complete, 48.0 sec elapsed
Processing, 195 proteins done, 29.0% complete, 63.0 sec elapsed
Processing, 243 proteins done, 36.1% complete, 78.2 sec elapsed
Processing, 294 proteins done, 43.7% complete, 94.1 sec elapsed
Processing, 341 proteins done, 50.7% complete, 109.4 sec elapsed
Processing, 434 proteins done, 64.5% complete, 140.7 sec elapsed
Processing, 504 proteins done, 74.9% complete, 170.8 sec elapsed
Processing, 572 proteins done, 85.0% complete, 200.8 sec elapsed
Processing, 646 proteins done, 96.0% complete, 231.0 sec elapsed
Total Progress: 53.33%, 0d 0h 5.00m elapsed, Current Task: Calculating spectral E-values for target-spectrum matches, estimated remaining: 0d 0h 4.37m
Target-spectrum match E-value calculation elapsed Time: 262.8 sec
Exception while processing: Sequence contains no elements
Exception while processing: Sequence contains no elements
Stack trace: MSPathFinderT.Program.ProcessFiles-:-InformedProteomics.TopDown.Execution.IcTopDownLauncher.RunSearch-:-InformedProteomics.TopDown.Execution.MzidResultsWriter.WriteResultsToMzid-:-InformedProteomics.TopDown.Execution.MzidResultsWriter.CreateMzidSettings-:-System.Linq.Enumerable.First[TSource]
Warning: scores is null for index 0 in scan 1448
Warning: scores is null for index 0 in scan 1448
Warning: scores is null for index 0 in scan 1479
Warning: scores is null for index 0 in scan 1479
PBFGen.exe -s 2DLC_H3_1.mzML > MSpathfinderResult201902221_search2.log
ProMex.exe -i 2DLC_H3_1.pbf -minCharge 2 -score n -csv n > MSpathfinderResult201902222_search2.log
Complete MS1 feature extraction.
I am currently trying to create some customized XICs, which requires the retention times and intensities of all the precursors in all the PrSMs. Is there any way to batch export these data? I know I can see the XIC of a PrSM in LCMSSpectator, but it does not allow me to export the data points. Even if it did, I would have to do it one by one.
I think the required information may be stored in the pbf file, but I do not know how to parse it.
The .param file does not appear to show the -act value that was specified, and as far as I can see it isn't shown in the .mzid file either.
It would be handy to record it in both of these for results validation purposes. It would certainly be good to know what was actually chosen when the default -act 6 is used.
I read the paper "Informed-Proteomics: open-source software package for top-down proteomics", which mentions that MSPathFinder is a faster proteoform identification tool. But I don't know what I am doing wrong that makes the identification so slow. First, I used the msconvert tool in the ProteoWizard package to convert the original spectral file to the .mzML file format. Then I ran the search with the following parameters; the total MSPathFinderT.exe running time was 8d 8h 12m.
SpecFile 2DLC_H3_1.pbf
DatabaseFile human_proteome_database.fasta
FeatureFile 2DLC_H3_1.ms1ft
InternalCleavageMode SingleInternalCleavage
Tag-based search True
Tda Target+Decoy
PrecursorIonTolerancePpm 10
ProductIonTolerancePpm 10
MinSequenceLength 21
MaxSequenceLength 300
MinPrecursorIonCharge 2
MaxPrecursorIonCharge 30
MinProductIonCharge 1
MaxProductIonCharge 20
MinSequenceMass 3000
MaxSequenceMass 50000
ActivationMethod Unknown
MaxDynamicModificationsPerSequence 0
When I used the default parameters below and added a modification file, the search became even slower: after running for 0d 22h 35.02m, it had completed only 0.4%. Where is the problem?
SpecFile 2DLC_H3_1.pbf
DatabaseFile human_proteome_database.fasta
FeatureFile 2DLC_H3_1.ms1ft
InternalCleavageMode SingleInternalCleavage
Tag-based search True
Tda Target+Decoy
PrecursorIonTolerancePpm 10
ProductIonTolerancePpm 10
MinSequenceLength 21
MaxSequenceLength 500
MinPrecursorIonCharge 2
MaxPrecursorIonCharge 50
MinProductIonCharge 1
MaxProductIonCharge 20
MinSequenceMass 3000
MaxSequenceMass 50000
ActivationMethod Unknown
MaxDynamicModificationsPerSequence 4
Modification C(2) H(2) N(0) O(1) S(0),R,opt,Everywhere,Acetyl
Modification C(2) H(2) N(0) O(1) S(0),K,opt,Everywhere,Acetyl
Modification C(1) H(2) N(0) O(0) S(0),R,opt,Everywhere,Methyl
Modification C(1) H(2) N(0) O(0) S(0),K,opt,Everywhere,Methyl
Modification C(2) H(4) N(0) O(0) S(0),R,opt,Everywhere,Dimethyl
Modification C(2) H(4) N(0) O(0) S(0),K,opt,Everywhere,Dimethyl
Modification C(3) H(6) N(0) O(0) S(0),R,opt,Everywhere,Trimethyl
Modification C(0) H(1) N(0) O(3) S(0) P(1),S,opt,Everywhere,Phospho
Modification C(0) H(1) N(0) O(3) S(0) P(1),T,opt,Everywhere,Phospho
Modification C(0) H(1) N(0) O(3) S(0) P(1),Y,opt,Everywhere,Phospho
Processing, 93499 proteins done, 0.4% complete, 80266.5 sec elapsed
Total Progress: 42.58%, 0d 22h 20.02m elapsed, Current Task: Searching the target database
Processing, 93950 proteins done, 0.4% complete, 80566.8 sec elapsed
Total Progress: 42.58%, 0d 22h 25.02m elapsed, Current Task: Searching the target database
Processing, 94352 proteins done, 0.4% complete, 80881.6 sec elapsed
Total Progress: 42.58%, 0d 22h 30.02m elapsed, Current Task: Searching the target database
Processing, 94955 proteins done, 0.4% complete, 81189.5 sec elapsed
Total Progress: 42.58%, 0d 22h 35.02m elapsed, Current Task: Searching the target database
Another question: the .fasta file I used contains 20410 entries, so why does the search show 94352 proteins done?
Hi, when asking ProMex to write a csv file, I am getting "System.Int32[]" in every cell of the FeatureID field.
Best,
Artur Pirog
Hi,
I am pretty sure I am doing something wrong, but I would like to know how to install Informed-Proteomics (and LCMS-Spectator) on Windows.
I have downloaded Informed-Proteomics-master and LCM-Spectator-master from GitHub, but I was not able to install either. I downloaded Inno Setup to compile the .ISS files, but both installers gave me the error "The [Setup] section must include an AppVersion or AppVerName directive". I corrected these errors by removing a ; sign (line 98 for LCMS-Spectator, line 110 for Informed-Proteomics). When I tried to compile again, another error appeared about a file not being found in the BIN folder (LcmsSpectator\bin\Release\LcmsSpectator.exe;). For Informed-Proteomics the error is about a .DLL not being found, again in the BIN folder. I think the problem is that there is no BIN folder in the .ZIP from GitHub. Could you help me install this software? Thanks a lot.
Hey,
I'm trying to set up MSPathFinderT in our lab and am having difficulty producing a pbf file that contains both the MS1 and MS2 spectra. I managed to convert the raw files using PbfGen, but when I try to use ProMex or MSPathFinderT I get an error saying "Datafile has no MS1 spectra".
I'm sorry for the basic question, but I couldn't find a solution in the tutorial.
Thanks in advance.
Hi all,
I have a question about filtering MSPathFinder results after target/decoy search.
The PrSMs in the *IcTda.tsv file are probably not filtered to a certain FDR, so I have to do it afterwards.
I am now wondering which value I should use: what is the difference between QValue and PepQValue?
And if I want a final FDR of 1%, do I just exclude all PrSMs with a value higher than 0.01?
Thanks a lot!
Cheers,
Konrad
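The filtering step asked about above can be sketched as follows. The column names QValue and PepQValue come from the post; the other columns, the sample rows, and the function name `filter_by_qvalue` are made up for illustration, and the sketch does not settle which of the two columns is the right one to use.

```python
import csv
import io

# Hypothetical stand-in for the contents of an *IcTda.tsv file.
SAMPLE_TSV = (
    "Scan\tSequence\tQValue\tPepQValue\n"
    "1448\tARTKQTARK\t0.000\t0.000\n"
    "1479\tSGRGKGGKG\t0.008\t0.012\n"
    "1502\tPEPTIDESE\t0.035\t0.040\n"
)


def filter_by_qvalue(tsv_text, threshold=0.01, column="QValue"):
    """Keep rows whose value in `column` is at or below `threshold`."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row for row in reader if float(row[column]) <= threshold]


kept = filter_by_qvalue(SAMPLE_TSV)
kept_pep = filter_by_qvalue(SAMPLE_TSV, column="PepQValue")

for row in kept:
    print(row["Scan"], row["Sequence"], row["QValue"])
```

For a real results file, the same function could be fed the text read from disk; switching `column` makes it easy to compare how many PrSMs survive under each value.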
I have been using Informed-Proteomics to analyze histone top-down spectra, and as you may already know, there are a lot of near-isobaric or isobaric proteoforms.
My question is: is MSPathFinderT able to report multiple proteoforms from one MS2 spectrum? According to my understanding of the Informed-Proteomics article and my use of this software, it only reports the single best PrSM per MS2 spectrum. In other words, if multiple (near-)isobaric proteoforms were co-isolated and co-fragmented in an MS2 scan, only the one with the best score will be reported.
Am I right about this?
Is there a way to run ProMex on a subset of a .pbf file? Currently I have to generate two different .pbf files to accomplish this.
There are some issues in using MSPathFinderT to search Thermo .raw EThcD data.
I understand that Informed Proteomics handles EThcD spectra as ETD spectra, since ETD is the major fragmentation mechanism in EThcD.
Yet I found that when a Thermo .raw file is used to generate the .pbf data, MSPathFinderT automatically assigns HCD to EThcD spectra and only considers the b and y ions. Because these ions are not the majority, most of the PrSMs had bad scores.
I have tried to pass -act ETD (i.e. specify that the activation method is ETD) on the command line, but the same problem still occurs. (Search result and raw file here)
I suspected that this problem may be due to the .raw format. Therefore, I first converted the .raw file to .mzML and then used it to generate a .pbf and run the search again. This time it worked. MSPathFinderT was able to handle the MS2 spectra as ETD spectra (i.e. search c and z ions), with or without my specifying -act ETD. (Search result and mzml file here)
Another issue is that the number of matched fragment ions in the IcTda or IcTarget file does not seem to match the number reported in LCMSSpectator. For example, for this PrSM, the reported #Matching ions is 74.
However, according to LCMSSpectator there should be only 68 ions. In some cases, more ions were reported in LCMSSpectator than in the IcTda or IcTarget file.
Lastly, since MSPathFinderT now supports UVPD, which has even more ion types than EThcD, would MSPathFinderT be able to support EThcD (i.e. search for all b, y, c and z ions) in the near future? I know it is quite a bit to ask, but for highly modified proteins like histones, thorough fragment information is quite important for locating the PTMs and sequence variants. So adding this function would definitely help this field a lot.
I want to add some modifications to the mod.txt file. Can you point me to guidance on how to do that? E.g. what is the proper format of the text in the .txt file, and how do I specify the number of atoms for each element?
Thanks!
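Judging only from the modification files pasted elsewhere on this page (for example `C2H2O1,RK,opt,any,Acetyl # Acetylation RK`), each line appears to hold five comma-separated fields followed by an optional `#` comment, with atom counts written directly after each element symbol. The sketch below parses such a line under that assumption. The field labels and function names are hypothetical and this is not the parser used by MSPathFinderT.

```python
import re


def parse_composition(formula):
    """Turn a formula such as 'C2H2O1', 'HO3P' or 'H-2' into {element: count}."""
    counts = {}
    # An element symbol followed by an optional (possibly negative) integer.
    for element, number in re.findall(r"([A-Z][a-z]?)(-?\d+)?", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def parse_mod_line(line):
    """Split one modification line into its five fields, ignoring any comment."""
    body = line.split("#", 1)[0]
    fields = [field.strip() for field in body.split(",")]
    if len(fields) != 5:
        raise ValueError(f"expected 5 comma-separated fields, got {len(fields)}")
    composition, residues, mod_type, location, name = fields
    return {
        "composition": parse_composition(composition),
        "residues": residues,    # e.g. 'RK', 'STY' or '*'
        "mod_type": mod_type,    # 'opt' in every example on this page
        "location": location,    # e.g. 'any' or 'Prot-N-term'
        "name": name,
    }


mod = parse_mod_line("C2H2O1,RK,opt,any,Acetyl # Acetylation RK")
print(mod)
```

An omitted count is read as 1 (as in `HO3P`), and a negative count such as `H-2` describes a loss of atoms.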
Hello,
MSPathFinderT.exe crashes with the following error.
What could have gone wrong?
Total Progress: 94.97%, 2d 14h 45.49m elapsed, Current Task:
Collected candidate matches: 0
Decoy database search elapsed Time: 109057.9 sec
Calculating spectral E-values for decoy-spectrum matches
Estimated matched sequences: 0
Decoy-spectrum match E-value calculation elapsed Time: 0.0 sec
Error computing FDR: Cannot compute FDR Scores; target file is empty
Error processing Frac4_140kR_01.raw: Cannot compute FDR Scores; target file is empty
C:\Users\Neal\Documents>MSPathFinderT.exe -s D:\10_26data\Chem2-2-1.raw -d D:\10_26data\all.fasta -o D:\10_26data\new -t 10 -f 10 -m 1 -tda 1 -minLength 21 -maxLength 300 -minCharge 2 -maxCharge 30 -minFragCharge 1 -maxFragCharge 15 -minMass 3000 -maxMass 50000 -mod D:\10_26data\MSPathFinder_Mods.txt
MSPathFinderT version 1.0.6510 (Oct. 28, 2017)
MaxThreads: 6
SpectrumFilePath: D:\10_26data\Chem2-2-1.raw
DatabaseFilePath: D:\10_26data\all.fasta
FeatureFilePath: N/A
OutputDir: D:\10_26data\new
InternalCleavageMode: SingleInternalCleavage
Tag-based search: True
Tda: Target+Decoy
PrecursorIonTolerancePpm: 10
ProductIonTolerancePpm: 10
MinSequenceLength: 21
MaxSequenceLength: 300
MinPrecursorIonCharge: 2
MaxPrecursorIonCharge: 30
MinProductIonCharge: 1
MaxProductIonCharge: 15
MinSequenceMass: 3000
MaxSequenceMass: 50000
MaxDynamicModificationsPerSequence: 4
Modifications:
C(0) H(0) N(0) O(1) S(0),M,opt,Everywhere,Oxidation
C(0) H(-1) N(0) O(0) S(0),C,opt,Everywhere,Dehydro
C(2) H(2) N(0) O(1) S(0),*,opt,ProteinNTerm,Acetyl
Creating and loading pbf file...
Total Progress: 0.00%, 0d 0h 0.00m elapsed, Current Task: Reading spectra file
Elapsed Time: 5.2 sec
Reading Fasta File
Generating D:\10_26data\all.icseq and
Generating D:\10_26data\all.icanno ... Done
Reading ProMex results...
3155/3155 features loaded...Elapsed Time: 2.0 sec
Generating deconvoluted spectra for MS/MS spectra...
Elapsed Time: 0.0 sec
Generating sequence tags for MS/MS spectra...
Number of spectra: 0
Generated sequence tags: 0
Elapsed Time: 0.0 sec
Caching peaks in MS1 spectra: 25329 scans
Sorting MS1 peaks: 54,689,404 peaks
Reading the target database...
Elapsed Time: 0.3 sec
Tag-based searching the target database
Number of spectra containing sequence tags: 0
Collected candidate matches: 0
Target database tag-based search elapsed Time: 35.5 sec
Searching the target database
Generating D:\10_26data\all.icplcp ... Done
Estimated Sequences: 37,834,838
Processing, 0 proteins done, 0.0% complete, 0.0 sec elapsed
Processing, 106315 proteins done, 0.3% complete, 15.1 sec elapsed
Processing, 118028 proteins done, 0.3% complete, 30.1 sec elapsed
Processing, 216527 proteins done, 0.6% complete, 45.1 sec elapsed
Processing, 323031 proteins done, 0.9% complete, 60.1 sec elapsed
Processing, 427459 proteins done, 1.1% complete, 75.1 sec elapsed
Total Progress: 42.70%, 0d 0h 5.00m elapsed, Current Task: Searching the target database
Processing, 532183 proteins done, 1.4% complete, 90.1 sec elapsed
Processing, 637148 proteins done, 1.7% complete, 105.1 sec elapsed
Processing, 768883 proteins done, 2.0% complete, 135.1 sec elapsed
Processing, 953651 proteins done, 2.5% complete, 165.1 sec elapsed
Processing, 1101862 proteins done, 2.9% complete, 195.9 sec elapsed
Processing, 1285452 proteins done, 3.4% complete, 225.9 sec elapsed
Processing, 1422493 proteins done, 3.8% complete, 255.9 sec elapsed
Processing, 1510383 proteins done, 4.0% complete, 285.9 sec elapsed
Processing, 1672899 proteins done, 4.4% complete, 346.0 sec elapsed
Total Progress: 43.36%, 0d 0h 10.00m elapsed, Current Task: Searching the target database
Processing, 2041036 proteins done, 5.4% complete, 406.0 sec elapsed
Processing, 2368429 proteins done, 6.3% complete, 466.0 sec elapsed
Processing, 2683408 proteins done, 7.1% complete, 527.5 sec elapsed
Processing, 2991146 proteins done, 7.9% complete, 587.5 sec elapsed
Processing, 3309509 proteins done, 8.7% complete, 647.5 sec elapsed
Total Progress: 44.12%, 0d 0h 15.04m elapsed, Current Task: Searching the target database
Processing, 3681335 proteins done, 9.7% complete, 707.5 sec elapsed
Processing, 4036809 proteins done, 10.7% complete, 767.5 sec elapsed
Processing, 4308919 proteins done, 11.4% complete, 827.6 sec elapsed
Processing, 4548002 proteins done, 12.0% complete, 887.8 sec elapsed
Processing, 4752700 proteins done, 12.6% complete, 947.8 sec elapsed
Total Progress: 44.75%, 0d 0h 20.09m elapsed, Current Task: Searching the target database
Processing, 4939771 proteins done, 13.1% complete, 1007.8 sec elapsed
Processing, 4939772 proteins done, 13.1% complete, 1007.8 sec elapsed
Processing, 4939773 proteins done, 13.1% complete, 1007.8 sec elapsed
Processing, 5122628 proteins done, 13.5% complete, 1069.0 sec elapsed
Processing, 5122627 proteins done, 13.5% complete, 1069.0 sec elapsed
Processing, 5328186 proteins done, 14.1% complete, 1129.0 sec elapsed
Processing, 5535302 proteins done, 14.6% complete, 1191.3 sec elapsed
Total Progress: 45.21%, 0d 0h 25.09m elapsed, Current Task: Searching the target database
Processing, 6754622 proteins done, 17.9% complete, 1492.6 sec elapsed
Total Progress: 45.77%, 0d 0h 30.09m elapsed, Current Task: Searching the target database
Processing, 7896082 proteins done, 20.9% complete, 1792.6 sec elapsed
Total Progress: 46.38%, 0d 0h 35.14m elapsed, Current Task: Searching the target database
Processing, 9470041 proteins done, 25.0% complete, 2092.6 sec elapsed
Total Progress: 47.03%, 0d 0h 40.14m elapsed, Current Task: Searching the target database
Processing, 10584900 proteins done, 28.0% complete, 2393.7 sec elapsed
Total Progress: 47.50%, 0d 0h 45.19m elapsed, Current Task: Searching the target database
Processing, 11487978 proteins done, 30.4% complete, 2694.0 sec elapsed
Total Progress: 47.98%, 0d 0h 50.21m elapsed, Current Task: Searching the target database
Processing, 12217755 proteins done, 32.3% complete, 2994.0 sec elapsed
Total Progress: 48.23%, 0d 0h 55.22m elapsed, Current Task: Searching the target database
Processing, 12725499 proteins done, 33.6% complete, 3296.6 sec elapsed
Total Progress: 48.39%, 0d 1h 0.26m elapsed, Current Task: Searching the target database
Total Progress: 48.39%, 0d 1h 5.27m elapsed, Current Task: Searching the target database
Total Progress: 48.39%, 0d 1h 10.29m elapsed, Current Task: Searching the target database
Total Progress: 48.39%, 0d 1h 15.31m elapsed, Current Task: Searching the target database
Total Progress: 48.39%, 0d 1h 20.32m elapsed, Current Task: Searching the target database
Total Progress: 48.39%, 0d 1h 25.34m elapsed, Current Task: Searching the target database
Processing, 12725500 proteins done, 33.6% complete, 5000.6 sec elapsed
Total Progress: 48.84%, 0d 1h 30.34m elapsed, Current Task: Searching the target database
Processing, 14227281 proteins done, 37.6% complete, 5300.6 sec elapsed
Total Progress: 49.45%, 0d 1h 35.38m elapsed, Current Task: Searching the target database
Processing, 15403529 proteins done, 40.7% complete, 5600.7 sec elapsed
Total Progress: 49.87%, 0d 1h 40.38m elapsed, Current Task: Searching the target database
Processing, 16221767 proteins done, 42.9% complete, 5900.7 sec elapsed
Total Progress: 50.28%, 0d 1h 45.39m elapsed, Current Task: Searching the target database
Processing, 17104889 proteins done, 45.2% complete, 6203.2 sec elapsed
Total Progress: 50.69%, 0d 1h 50.39m elapsed, Current Task: Searching the target database
Processing, 18226901 proteins done, 48.2% complete, 6505.3 sec elapsed
Total Progress: 51.20%, 0d 1h 55.39m elapsed, Current Task: Searching the target database
Processing, 19098473 proteins done, 50.5% complete, 6806.7 sec elapsed
Total Progress: 51.62%, 0d 2h 0.42m elapsed, Current Task: Searching the target database
Processing, 20256051 proteins done, 53.5% complete, 7106.7 sec elapsed
Total Progress: 52.17%, 0d 2h 5.42m elapsed, Current Task: Searching the target database
Processing, 21211621 proteins done, 56.1% complete, 7406.7 sec elapsed
Total Progress: 52.57%, 0d 2h 10.43m elapsed, Current Task: Searching the target database
Processing, 22091816 proteins done, 58.4% complete, 7708.7 sec elapsed
Total Progress: 53.12%, 0d 2h 15.48m elapsed, Current Task: Searching the target database
Processing, 23624794 proteins done, 62.4% complete, 8008.7 sec elapsed
Total Progress: 53.93%, 0d 2h 20.48m elapsed, Current Task: Searching the target database
Processing, 25145852 proteins done, 66.5% complete, 8311.5 sec elapsed
Total Progress: 54.50%, 0d 2h 25.51m elapsed, Current Task: Searching the target database
Processing, 26387047 proteins done, 69.7% complete, 8613.3 sec elapsed
Total Progress: 55.08%, 0d 2h 30.54m elapsed, Current Task: Searching the target database
Processing, 27652102 proteins done, 73.1% complete, 8913.3 sec elapsed
Total Progress: 55.66%, 0d 2h 35.54m elapsed, Current Task: Searching the target database
Processing, 28895951 proteins done, 76.4% complete, 9215.1 sec elapsed
Total Progress: 56.24%, 0d 2h 40.58m elapsed, Current Task: Searching the target database
Processing, 30139855 proteins done, 79.7% complete, 9515.2 sec elapsed
Total Progress: 56.80%, 0d 2h 45.59m elapsed, Current Task: Searching the target database
Processing, 31342528 proteins done, 82.8% complete, 9815.2 sec elapsed
Total Progress: 57.34%, 0d 2h 50.59m elapsed, Current Task: Searching the target database
Processing, 32435944 proteins done, 85.7% complete, 10117.8 sec elapsed
Total Progress: 57.87%, 0d 2h 55.59m elapsed, Current Task: Searching the target database
Processing, 33930982 proteins done, 89.7% complete, 10417.8 sec elapsed
Total Progress: 58.68%, 0d 3h 0.61m elapsed, Current Task: Searching the target database
Processing, 35360872 proteins done, 93.5% complete, 10717.8 sec elapsed
Total Progress: 59.19%, 0d 3h 5.62m elapsed, Current Task: Searching the target database
Processing, 36518054 proteins done, 96.5% complete, 11017.8 sec elapsed
Total Progress: 59.74%, 0d 3h 10.67m elapsed, Current Task: Searching the target database
Processing, 37390107 proteins done, 98.8% complete, 11319.5 sec elapsed
Collected candidate matches: 0
Target database search elapsed Time: 11399.8 sec
Calculating spectral E-values for target-spectrum matches
Estimated matched sequences: 0
Target-spectrum match E-value calculation elapsed Time: 0.1 sec
Creating D:\10_26data\all.icsfldecoy.fasta
Generating D:\10_26data\all.icsfldecoy.icseq and
Generating D:\10_26data\all.icsfldecoy.icanno ... Done
Reading the decoy database...
Elapsed Time: 0.5 sec
Tag-based searching the decoy database
Number of spectra containing sequence tags: 0
Collected candidate matches: 0
Decoy database tag-based search elapsed Time: 12.1 sec
Searching the decoy database
Generating D:\10_26data\all.icsfldecoy.icplcp ... Done
Estimated Sequences: 37,834,838
Processing, 0 proteins done, 0.0% complete, 0.0 sec elapsed
Processing, 91058 proteins done, 0.2% complete, 15.0 sec elapsed
Processing, 104777 proteins done, 0.3% complete, 36.5 sec elapsed
Processing, 104775 proteins done, 0.3% complete, 36.5 sec elapsed
Processing, 104780 proteins done, 0.3% complete, 36.5 sec elapsed
Processing, 199050 proteins done, 0.5% complete, 51.5 sec elapsed
Processing, 299404 proteins done, 0.8% complete, 66.5 sec elapsed
Processing, 399679 proteins done, 1.1% complete, 81.6 sec elapsed
Processing, 473324 proteins done, 1.3% complete, 99.3 sec elapsed
Total Progress: 77.72%, 0d 3h 15.70m elapsed, Current Task: Searching the decoy database
Processing, 473827 proteins done, 1.3% complete, 116.2 sec elapsed
Processing, 473826 proteins done, 1.3% complete, 116.2 sec elapsed
Processing, 473828 proteins done, 1.3% complete, 116.2 sec elapsed
Processing, 474920 proteins done, 1.3% complete, 147.2 sec elapsed
Processing, 576058 proteins done, 1.5% complete, 177.2 sec elapsed
Processing, 776854 proteins done, 2.1% complete, 207.2 sec elapsed
Processing, 978487 proteins done, 2.6% complete, 237.2 sec elapsed
Processing, 1122059 proteins done, 3.0% complete, 267.2 sec elapsed
Processing, 1322696 proteins done, 3.5% complete, 297.2 sec elapsed
Processing, 1707488 proteins done, 4.5% complete, 357.2 sec elapsed
Total Progress: 78.42%, 0d 3h 20.70m elapsed, Current Task: Searching the decoy database
Processing, 2026830 proteins done, 5.4% complete, 417.3 sec elapsed
Processing, 2367141 proteins done, 6.3% complete, 478.8 sec elapsed
Processing, 2721653 proteins done, 7.2% complete, 538.8 sec elapsed
Processing, 3082872 proteins done, 8.1% complete, 598.8 sec elapsed
Processing, 3434130 proteins done, 9.1% complete, 658.8 sec elapsed
Total Progress: 79.22%, 0d 3h 25.70m elapsed, Current Task: Searching the decoy database
Processing, 3745457 proteins done, 9.9% complete, 718.8 sec elapsed
Processing, 3996486 proteins done, 10.6% complete, 778.8 sec elapsed
Processing, 4350716 proteins done, 11.5% complete, 838.9 sec elapsed
Processing, 4716227 proteins done, 12.5% complete, 904.6 sec elapsed
Processing, 4716229 proteins done, 12.5% complete, 904.6 sec elapsed
Processing, 5132499 proteins done, 13.6% complete, 964.6 sec elapsed
Total Progress: 80.00%, 0d 3h 30.71m elapsed, Current Task: Searching the decoy database
Processing, 5520902 proteins done, 14.6% complete, 1024.6 sec elapsed
Processing, 6057630 proteins done, 16.0% complete, 1084.6 sec elapsed
Processing, 6560208 proteins done, 17.3% complete, 1144.6 sec elapsed
Total Progress: 81.24%, 0d 3h 35.71m elapsed, Current Task: Searching the decoy database
Processing, 9229408 proteins done, 24.4% complete, 1444.6 sec elapsed
Total Progress: 82.43%, 0d 3h 40.71m elapsed, Current Task: Searching the decoy database
Processing, 11711076 proteins done, 31.0% complete, 1744.7 sec elapsed
Total Progress: 83.50%, 0d 3h 45.71m elapsed, Current Task: Searching the decoy database
Processing, 13778610 proteins done, 36.4% complete, 2044.7 sec elapsed
Total Progress: 84.34%, 0d 3h 50.71m elapsed, Current Task: Searching the decoy database
Processing, 15581306 proteins done, 41.2% complete, 2344.7 sec elapsed
Total Progress: 85.13%, 0d 3h 55.71m elapsed, Current Task: Searching the decoy database
Processing, 17190416 proteins done, 45.4% complete, 2644.7 sec elapsed
Total Progress: 85.85%, 0d 4h 0.71m elapsed, Current Task: Searching the decoy database
Processing, 18745544 proteins done, 49.5% complete, 2944.7 sec elapsed
Total Progress: 86.40%, 0d 4h 5.71m elapsed, Current Task: Searching the decoy database
Processing, 19900230 proteins done, 52.6% complete, 3244.7 sec elapsed
Total Progress: 87.10%, 0d 4h 10.71m elapsed, Current Task: Searching the decoy database
Processing, 21381784 proteins done, 56.5% complete, 3546.7 sec elapsed
Total Progress: 87.72%, 0d 4h 15.76m elapsed, Current Task: Searching the decoy database
Processing, 22653945 proteins done, 59.9% complete, 3846.7 sec elapsed
Total Progress: 88.33%, 0d 4h 20.76m elapsed, Current Task: Searching the decoy database
Processing, 24074766 proteins done, 63.6% complete, 4147.1 sec elapsed
Total Progress: 88.99%, 0d 4h 25.76m elapsed, Current Task: Searching the decoy database
Processing, 25388499 proteins done, 67.1% complete, 4447.1 sec elapsed
Total Progress: 89.56%, 0d 4h 30.81m elapsed, Current Task: Searching the decoy database
Processing, 26602039 proteins done, 70.3% complete, 4748.9 sec elapsed
Total Progress: 90.13%, 0d 4h 35.85m elapsed, Current Task: Searching the decoy database
Processing, 27836037 proteins done, 73.6% complete, 5050.5 sec elapsed
Total Progress: 90.70%, 0d 4h 40.86m elapsed, Current Task: Searching the decoy database
Processing, 29086254 proteins done, 76.9% complete, 5351.3 sec elapsed
Total Progress: 91.28%, 0d 4h 45.88m elapsed, Current Task: Searching the decoy database
Processing, 30335458 proteins done, 80.2% complete, 5651.3 sec elapsed
Total Progress: 91.84%, 0d 4h 50.88m elapsed, Current Task: Searching the decoy database
Processing, 31534228 proteins done, 83.3% complete, 5951.9 sec elapsed
Processing, 31534229 proteins done, 83.3% complete, 5951.9 sec elapsed
Total Progress: 92.36%, 0d 4h 55.91m elapsed, Current Task: Searching the decoy database
Processing, 32562446 proteins done, 86.1% complete, 6254.3 sec elapsed
Total Progress: 92.87%, 0d 5h 0.93m elapsed, Current Task: Searching the decoy database
Processing, 33775607 proteins done, 89.3% complete, 6557.1 sec elapsed
Total Progress: 93.44%, 0d 5h 5.98m elapsed, Current Task: Searching the decoy database
Processing, 34961566 proteins done, 92.4% complete, 6858.7 sec elapsed
Total Progress: 93.96%, 0d 5h 11.00m elapsed, Current Task: Searching the decoy database
Processing, 36073780 proteins done, 95.3% complete, 7160.8 sec elapsed
Total Progress: 94.51%, 0d 5h 16.05m elapsed, Current Task: Searching the decoy database
Processing, 37283875 proteins done, 98.5% complete, 7460.8 sec elapsed
Collected candidate matches: 0
Decoy database search elapsed Time: 7626.0 sec
Calculating spectral E-values for decoy-spectrum matches
Estimated matched sequences: 0
Decoy-spectrum match E-value calculation elapsed Time: 0.0 sec
Warning: Error computing FDR: Cannot compute FDR Scores; target file is empty
Error processing Chem2-2-1.raw: Cannot compute FDR Scores; target file is empty
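For anyone triaging a log like the one above, a short script can confirm that the searches collected zero candidate matches before the FDR step failed, and give a rough throughput figure. This is a minimal sketch, not part of MSPathFinderT: the function name `summarize` and both regular expressions are assumptions based only on the line formats visible in this log.

```python
import re

# Patterns inferred from the log lines shown above (assumed, not an official format).
PROGRESS = re.compile(
    r"Processing, (\d+) proteins done, ([\d.]+)% complete, ([\d.]+) sec elapsed"
)
MATCHES = re.compile(r"Collected candidate matches: (\d+)")


def summarize(log_text):
    """Return (list of candidate-match counts, proteins/sec at the last progress line)."""
    counts = [int(m.group(1)) for m in MATCHES.finditer(log_text)]
    rate = None
    for m in PROGRESS.finditer(log_text):
        done, secs = int(m.group(1)), float(m.group(3))
        if secs > 0:  # skip the initial "0.0 sec elapsed" line
            rate = done / secs
    return counts, rate


sample = """Processing, 37283875 proteins done, 98.5% complete, 7460.8 sec elapsed
Collected candidate matches: 0
Decoy database search elapsed Time: 7626.0 sec
"""

counts, rate = summarize(sample)
if counts and all(c == 0 for c in counts):
    print("No candidate matches were collected; the FDR step has nothing to score.")
print(f"Approximate throughput: {rate:.0f} proteins/sec")
```

If every count is zero, as in this run, the "target file is empty" error follows directly from the empty search results rather than from the FDR calculation itself.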
After running MSPathFinderT, I see that the whole mass and charge range is searched, rather than being limited by the minimum mass or charge settings. Can I limit this somehow, or is it a bug in the software? I would like to shorten the running time.
Hello, it would be useful to output deconvoluted mass spectra in mzML format.