bittremieux / gleams

GLEAMS is a Learned Embedding for Annotating Mass Spectra.

Home Page: https://doi.org/10.1101/483263

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%
clustering deep-learning mass-spectrometry

gleams's Issues

Some PSMs are missing

I used the reference MGF file with 1000 PSMs, but the output contains only 968 embeddings. Could you please explain which 32 PSMs are missing, and why?
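
One way to narrow down which spectra were dropped is to look for spectra in the MGF file that a preprocessing filter might reject. The sketch below assumes, without confirmation from this thread, that the missing embeddings correspond to spectra removed for having too few peaks. The `min_peaks` threshold and the function names are hypothetical and are not taken from the GLEAMS code or configuration.

```python
import io


def count_peaks_per_spectrum(mgf_lines):
    """Return a list of (title, n_peaks) for each spectrum in an MGF stream."""
    results = []
    title, n_peaks, in_spectrum = None, 0, False
    for raw in mgf_lines:
        line = raw.strip()
        if line == "BEGIN IONS":
            title, n_peaks, in_spectrum = None, 0, True
        elif line == "END IONS":
            if in_spectrum:
                results.append((title, n_peaks))
            in_spectrum = False
        elif in_spectrum and line:
            if line.upper().startswith("TITLE="):
                title = line[len("TITLE="):]
            elif line[0].isdigit():
                # Peak lines start with the m/z value; header lines are KEY=value.
                n_peaks += 1
    return results


def flag_sparse_spectra(spectra, min_peaks=10):
    """List titles of spectra with fewer than min_peaks peaks (hypothetical cutoff)."""
    return [title for title, n in spectra if n < min_peaks]


# Tiny in-memory MGF example with made-up values, for illustration only.
example = io.StringIO(
    "BEGIN IONS\n"
    "TITLE=spectrum_a\n"
    "PEPMASS=500.25\n"
    "CHARGE=2+\n"
    "100.1 10.0\n"
    "200.2 20.0\n"
    "300.3 30.0\n"
    "END IONS\n"
    "BEGIN IONS\n"
    "TITLE=spectrum_b\n"
    "PEPMASS=600.30\n"
    "CHARGE=2+\n"
    "150.0 5.0\n"
    "END IONS\n"
)
spectra = count_peaks_per_spectrum(example)
print(len(spectra))
print(flag_sparse_spectra(spectra, min_peaks=2))
```

For a real file, pass an open file handle instead of the `io.StringIO` object. If the number of flagged spectra does not match the number of missing embeddings, the cause is probably a different filter.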

OSError: SavedModel file does not exist at: /home/hp/miniconda3/envs/gleams/gleams/data/gleams_82c0124b.hdf5/{saved_model.pbtxt|saved_model.pb}

I was trying to run gleams embed as described in your instructions. However, I get the OSError below, which says that no saved model in .pbtxt or .pb format exists at the given path.

2022-12-03 15:10:00,264 INFO [gleams/MainProcess] gleams.cli_embed : GLEAMS version 0.4.dev1+g8831ad6
2022-12-03 15:10:00,282 DEBUG [gleams/MainProcess] encoder.__init__ : Read the reference spectra
2022-12-03 15:10:00,907 DEBUG [gleams/MainProcess] encoder.__init__ : Select 500 valid reference spectra
2022-12-03 15:10:03,207 DEBUG [gleams/MainProcess] nn.embed : Load the stored GLEAMS neural network
2022-12-03 15:10:04,047 DEBUG [gleams/MainProcess] embedder.__init__ : Running the embedder model on 1 GPU(s)
Traceback (most recent call last):
  File "/home/hp/miniconda3/envs/gleams/bin/gleams", line 8, in <module>
    sys.exit(gleams())
  File "/home/hp/miniconda3/envs/gleams/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/hp/miniconda3/envs/gleams/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/hp/miniconda3/envs/gleams/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/hp/miniconda3/envs/gleams/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/hp/miniconda3/envs/gleams/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/hp/miniconda3/envs/gleams/lib/python3.8/site-packages/gleams/gleams.py", line 97, in cli_embed
    nn.embed(metadata_filename, config.model_filename, f'{embed_name}.npy',
  File "/home/hp/miniconda3/envs/gleams/lib/python3.8/site-packages/gleams/nn/nn.py", line 163, in embed
    emb.load()
  File "/home/hp/miniconda3/envs/gleams/lib/python3.8/site-packages/gleams/nn/embedder.py", line 158, in load
    self.siamese_model = keras.models.load_model(self.filename)
  File "/home/hp/miniconda3/envs/gleams/lib/python3.8/site-packages/tensorflow/python/keras/saving/save.py", line 186, in load_model
    loader_impl.parse_saved_model(filepath)
  File "/home/hp/miniconda3/envs/gleams/lib/python3.8/site-packages/tensorflow/python/saved_model/loader_impl.py", line 110, in parse_saved_model
    raise IOError("SavedModel file does not exist at: %s/{%s|%s}" %
OSError: SavedModel file does not exist at: /home/hp/miniconda3/envs/gleams/gleams/data/gleams_82c0124b.hdf5/{saved_model.pbtxt|saved_model.pb}

Is this error caused by a package missing from the conda environment in which gleams is installed?

--
Chinmaya
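
The traceback shows keras.models.load_model falling through to the SavedModel loader even though the path ends in .hdf5. One plausible cause, offered here as an assumption rather than a confirmed diagnosis, is that the model file is missing at that path or is not a complete HDF5 file. The minimal sketch below checks this using only the standard library. It tests the HDF5 signature at byte offset 0 only, so a file with a user block before the signature would be misreported. The function name is hypothetical.

```python
import os
import tempfile

# First eight bytes of an HDF5 file without a user block.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def diagnose_model_file(path):
    """Classify a model path as 'missing', 'directory', 'hdf5', or 'not_hdf5'."""
    if os.path.isdir(path):
        return "directory"
    if not os.path.isfile(path):
        return "missing"
    with open(path, "rb") as handle:
        header = handle.read(len(HDF5_SIGNATURE))
    return "hdf5" if header == HDF5_SIGNATURE else "not_hdf5"


# Demonstration on temporary files; replace with the path from the error message.
with tempfile.TemporaryDirectory() as tmp_dir:
    good = os.path.join(tmp_dir, "model.hdf5")
    with open(good, "wb") as handle:
        handle.write(HDF5_SIGNATURE + b"\x00" * 8)
    bad = os.path.join(tmp_dir, "truncated.hdf5")
    with open(bad, "wb") as handle:
        handle.write(b"<html>not a model</html>")
    absent = os.path.join(tmp_dir, "absent.hdf5")

    status_good = diagnose_model_file(good)
    status_bad = diagnose_model_file(bad)
    status_absent = diagnose_model_file(absent)
    status_dir = diagnose_model_file(tmp_dir)

print(status_good, status_bad, status_absent, status_dir)
```

A result of "missing" or "not_hdf5" for the path in the error message would suggest that the model weights need to be downloaded again to that location, rather than that a Python package is missing.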

About the MassIVE-KB datasets

Thanks for sharing this great work. I think GLEAMS is very helpful for processing mass spectra.
However, when checking the datasets used in GLEAMS, I could not find how to download the 30 million high-quality PSMs of the MassIVE-KB spectral library or the 185 million PSMs of the identification results for the full MassIVE-KB dataset from the MassIVE website. Could you kindly help me find these data?
Looking forward to your reply.

ValueError: invalid literal for int() with base 10

Hi, I am running GLEAMS on a set of MGF files with 1 GPU, using the command: gleams embed *.mgf --embed_name GLEAMS_embed

The job starts:

2023-10-18 23:08:17,623 INFO [gleams/MainProcess] gleams.cli_embed : GLEAMS version 0.4.dev7+g13ebc74.d20231018
2023-10-18 23:08:17,672 DEBUG [gleams/MainProcess] encoder.__init__ : Read the reference spectra
2023-10-18 23:08:18,008 DEBUG [gleams/MainProcess] encoder.__init__ : Select 500 valid reference spectra
2023-10-18 23:08:19,146 DEBUG [gleams/MainProcess] nn.embed : Load the stored GLEAMS neural network
2023-10-18 23:08:19,200 DEBUG [gleams/MainProcess] embedder.__init__ : Running the embedder model on 1 GPU(s)
2023-10-18 23:08:19,660 INFO [gleams/MainProcess] nn.embed : Embed all peak files for metadata file /var/tmp/pbs.858407.hn-10-03/tmpy7q23kig/GLEAMS_embed.parquet
2023-10-18 23:08:19,662 INFO [gleams/MainProcess] nn.embed : Process dataset GLEAMS [ 1/ 1] (120 files)
2023-10-18 23:08:19,663 DEBUG [gleams/MainProcess] feature._peaks_to_features : Process file 202112249_TY_Nathaniel_1364_m2_3_FAIMS_OTIT_F1.mgf

but then encounters the following error:

Traceback (most recent call last):
  File "/data/petretto/home/e0470749/.conda/envs/gleams/bin/gleams", line 8, in <module>
    sys.exit(gleams())
  File "/data/petretto/home/e0470749/.conda/envs/gleams/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/data/petretto/home/e0470749/.conda/envs/gleams/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/data/petretto/home/e0470749/.conda/envs/gleams/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/data/petretto/home/e0470749/.conda/envs/gleams/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/data/petretto/home/e0470749/.conda/envs/gleams/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/data/petretto/home/e0470749/.conda/envs/gleams/lib/python3.8/site-packages/gleams/gleams.py", line 97, in cli_embed
    nn.embed(metadata_filename, config.model_filename, f'{embed_name}.npy',
  File "/data/petretto/home/e0470749/.conda/envs/gleams/lib/python3.8/site-packages/gleams/nn/nn.py", line 188, in embed
    for filename, file_scans, file_encodings in joblib.Parallel(
  File "/data/petretto/home/e0470749/.conda/envs/gleams/lib/python3.8/site-packages/joblib/parallel.py", line 1041, in __call__
    if self.dispatch_one_batch(iterator):
  File "/data/petretto/home/e0470749/.conda/envs/gleams/lib/python3.8/site-packages/joblib/parallel.py", line 859, in dispatch_one_batch
    self._dispatch(tasks)
  File "/data/petretto/home/e0470749/.conda/envs/gleams/lib/python3.8/site-packages/joblib/parallel.py", line 777, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/data/petretto/home/e0470749/.conda/envs/gleams/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
    result = ImmediateResult(func)
  File "/data/petretto/home/e0470749/.conda/envs/gleams/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 572, in __init__
    self.results = batch()
  File "/data/petretto/home/e0470749/.conda/envs/gleams/lib/python3.8/site-packages/joblib/parallel.py", line 262, in __call__
    return [func(*args, **kwargs)
  File "/data/petretto/home/e0470749/.conda/envs/gleams/lib/python3.8/site-packages/joblib/parallel.py", line 262, in <listcomp>
    return [func(*args, **kwargs)
  File "/data/petretto/home/e0470749/.conda/envs/gleams/lib/python3.8/site-packages/gleams/feature/feature.py", line 68, in _peaks_to_features
    scans['scan'] = scans['scan'].astype(np.int64)
  File "/data/petretto/home/e0470749/.conda/envs/gleams/lib/python3.8/site-packages/pandas/core/generic.py", line 5815, in astype
    new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
  File "/data/petretto/home/e0470749/.conda/envs/gleams/lib/python3.8/site-packages/pandas/core/internals/managers.py", line 418, in astype
    return self.apply("astype", dtype=dtype, copy=copy, errors=errors)
  File "/data/petretto/home/e0470749/.conda/envs/gleams/lib/python3.8/site-packages/pandas/core/internals/managers.py", line 327, in apply
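
The last gleams frame in the traceback is the cast scans['scan'].astype(np.int64), and the issue title reports an invalid literal for int(). This suggests that at least one scan identifier in the MGF file is not a plain integer. The sketch below shows one way to locate such values before running the embedding. How the scan values are extracted from the MGF file is not shown in this thread, so the input list and the function name are hypothetical.

```python
def find_non_integer_scans(scan_values):
    """Return (index, value) pairs for scan values that int() cannot parse."""
    bad = []
    for index, value in enumerate(scan_values):
        try:
            int(str(value).strip())
        except ValueError:
            bad.append((index, value))
    return bad


# Made-up scan identifiers for illustration; real values come from the MGF file.
scans = ["1", "2", "controllerType=0 scan=3", "4.0", " 5 "]
bad_scans = find_non_integer_scans(scans)
print(bad_scans)
```

If the scan values are already in a pandas Series, pd.to_numeric(series, errors="coerce") followed by a check for missing values gives a comparable result, although it accepts values such as "4.0" that int() rejects.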

Clustering MS/MS spectra from multiple species

Hello,

I have installed GLEAMS in a conda environment. I plan to examine MS/MS clusters specific to data from different species. However, the gleams command only shows the embed and cluster options for mzML input files. It is not clear whether the embedding should be done jointly for all mzML files from all species, or separately for each species and then combined at the "gleams cluster" step.

Please help

Thanks,
Chinmaya
