cbroeckl / ramclustr
Assigning precursor-product ion relationships in indiscriminant MS/MS data
License: MIT License
Currently, when MS2 data is not present, the version that reads data from a csv copies the MS1 data into the slot for MS2 data, whereas the slot is kept empty when reading from xcms.
Should this be made the general case?
R CMD check RAMClustR fails with:
Quitting from lines 19-77 (ramclustR.Rmd)
Error: processing vignette 'ramclustR.Rmd' failed with diagnostics:
argument is of length zero
Execution halted
I guess a package dependency issue is the reason it fails on my laptop. Can you spot which package might be the culprit?
> library(RAMClustR)
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 17.04
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=de_DE.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=de_DE.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] RAMClustR_0.4
loaded via a namespace (and not attached):
[1] tools_3.3.2
As part of our wrapping of RAMClustR in Galaxy, we are routinely running a fairly basic test described below. After switching from 1.0.9 to 1.2.1 we are experiencing the following error:
Error in round(ramclustObj$clri[i], 2) :
non-numeric argument to mathematical function
Calls: store_output -> <Anonymous> -> paste0
Execution halted
Our store_output function calls RAMClustR::write.msp here.
@hechth suspects that the new version of RAMClustR does not expect a parameter to be missing.
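The message above is what base R's round() raises when it receives a non-numeric value such as NULL or a character string. As a minimal sketch of a defensive guard, assuming the clri element may be absent from the object (format_clri is a hypothetical helper, not part of RAMClustR):

```r
# Hypothetical helper: round a value only when it is a single numeric,
# otherwise fall back to a placeholder string.
format_clri <- function(x, digits = 2, placeholder = "NA") {
  if (is.numeric(x) && length(x) == 1 && !is.na(x)) {
    return(as.character(round(x, digits)))
  }
  placeholder
}

# round(NULL, 2) would stop with
# "non-numeric argument to mathematical function".
obj <- list(clri = NULL)  # stand-in for an object without clri
line <- paste0("CLRI: ", format_clri(obj$clri[1]))
```

Whether clri is in fact the missing element in the Galaxy test is not confirmed here; the guard only illustrates how a writer could tolerate it.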
Trying to use the exportDataset function from the library, I run into an error where the getData function mentioned below is not found.
Lines 27 to 32 in eb15f37
Which function is this, or is the exportDataset function deprecated?
I'm very thankful for any advice.
We should have a tutorial-style vignette.
Hi, is there an intent to revitalize the CRAN package for this tool? The official page says it has been removed from CRAN and only an archive of old versions is available.
https://cran.r-project.org/web/packages/RAMClustR/index.html
There should be a non-interactive mode, where the parameter sets are not edited via the R data editor, but instead given e.g. as
ramclustR(..., paramset=paramsets$C8Serum, ...)
Hi, it would be great to have a working example snippet
on how to connect RAMclustR output to MetFamily.
What we need is an MSP file that looks like this:
https://raw.githubusercontent.com/ipb-halle/MetFamily/master/files/MSMS_library_showcase.msp
NAME: Unknown
RETENTIONTIME: 8.4209
PRECURSORMZ: 85.00465
METABOLITENAME:
ADDUCTIONNAME: [M-H]-
Num Peaks: 3
75.00941 44.5240135192871
79.95998 54.568229675293
85.00496 297.418823242188
i.e. where we have RT and Precursor that can be matched to the MS1 precursor.
Then we need the MS1 quantification as it comes from XCMS, looking like
https://github.com/ipb-halle/MetFamily/blob/master/files/Metabolite_profile_showcase.txt
The full spec of the MetFamily input formats is in
https://github.com/ipb-halle/MetFamily/blob/master/files/MetFamily_Input_Specification.pdf
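As a rough illustration of the MSP record layout shown above, the following base R sketch formats one spectrum with the same fields. The helper name format_msp_record and its arguments are hypothetical; this is not how RAMClustR writes MSP files, and the intensities are shortened from the example.

```r
# Hypothetical helper: format one spectrum as an MSP record using the
# fields from the example record (NAME, RETENTIONTIME, PRECURSORMZ, ...).
format_msp_record <- function(name, rt, precursor_mz, adduct, mz, intensity) {
  stopifnot(length(mz) == length(intensity))
  c(
    paste0("NAME: ", name),
    paste0("RETENTIONTIME: ", rt),
    paste0("PRECURSORMZ: ", precursor_mz),
    "METABOLITENAME: ",
    paste0("ADDUCTIONNAME: ", adduct),
    paste0("Num Peaks: ", length(mz)),
    paste(mz, intensity),  # one "mz intensity" line per peak
    ""                     # blank line separates records
  )
}

record <- format_msp_record(
  name = "Unknown", rt = 8.4209, precursor_mz = 85.00465,
  adduct = "[M-H]-",
  mz = c(75.00941, 79.95998, 85.00496),
  intensity = c(44.524, 54.568, 297.419)
)
writeLines(record)
```

Passing a file path as the second argument of writeLines() would write the record to disk instead of printing it.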
@Treutler and @cbroeckl, is there a way to create a small HowTo (some people at UC Riverside would love to see that ...) so we can include that in the RAMclustR documentation?
@cbroeckl, is the file https://github.com/cbroeckl/RAMClustR/blob/master/vignettes/spectra/test.mspLib what the RAMclustR output looks like? It has the RT in a comment tag.
Yours,
Steffen
Hi,
I'm quickly trying out some workflows using ramclustR and I can't find my sample names in the output...
After taking a look at the script of the ramclustR function, I can see that there are a lot of result tables containing everything needed (rt, intensity, cluster, etc...), but I can't find the sample names in the resulting MSP file (whereas the rownames of the tables are my sample names).
Can someone help me, please?
Thanks a lot !
These parameters are not used in ramclustR()
If supplying a feature table with more than 55k entries, RAMClustR fails due to this issue in the ff package (ref). I doubt this issue in ff will be fixed.
Since this allocated matrix is symmetric (I assume at least), and only the upper triangle is computed anyway, I think this computation could maybe be optimized in order to never have to store the actual full matrix in memory.
@cbroeckl if you are currently busy and don't have the time to address this issue I'd be happy to support and we will come up with an implementation to solve this.
Line 667 in 351243d
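The idea of keeping only the upper triangle can be sketched in base R, where dist objects already use a packed layout. The helper packed_index is hypothetical and the data is a toy example; this is not RAMClustR's implementation. For n = 55000 the packed triangle holds n(n-1)/2, about 1.51e9 values, versus about 3.03e9 cells in the full matrix.

```r
# Map a pair (i, j) with i < j to its position in a packed upper
# triangle (the same layout used by base R 'dist' objects).
packed_index <- function(i, j, n) {
  stopifnot(all(i < j), all(j <= n))
  n * (i - 1) - i * (i - 1) / 2 + j - i
}

n <- 5
x <- matrix(seq_len(n * 2)^2, nrow = n)  # toy data: 5 features, 2 columns
d <- as.vector(dist(x))                  # packed: n * (n - 1) / 2 values
full <- as.matrix(dist(x))               # full n x n, for comparison only

length(d)                   # 10 values instead of 25
d[packed_index(2, 4, n)]    # same value as full[2, 4]
```

The trade-off is that every lookup goes through the index function, so block-wise code that currently writes into matrix sub-blocks would need to be adapted.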
@hechth - absolutely could be done. A few items to consider:
Originally posted by @cbroeckl in #31 (comment)
Hi Steffen,
RAMClustR uses ffmat<-ff(vmode="double", dim=c(n, n), initdata = 0) to call ff in ramclustR.R:263. ff obviously uses the default temp folder to store temporary files. In cases of limited space in the user profile, the call of ff may fail if no more empty disk space could be allocated. In a server based (HPC) environment this could happen, because user profile space is maybe limited and the working directory is allocated on scratch devices. We had a problem on Windows Server 2012 with the user profile located on C:/ and a quota of 10 GB and a working directory on scratch S:/.
I suggest to change the call of ff such that the temp files generated by ff are allocated in the working space and not in the system temp space.
A workaround is to set the user environment variables for the temp directory to a folder with enough space.
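An alternative that stays within R, assuming the ff package honours its fftempdir option as described in the ff documentation, is to point that option at a directory on the larger volume before any ff object is created. The directory name below is a placeholder.

```r
# Directory on a volume with enough free space (placeholder: a
# subfolder of the current working directory).
ff_tmp <- file.path(getwd(), "ff_tmp")
dir.create(ff_tmp, showWarnings = FALSE, recursive = TRUE)

# Assumption: ff consults this option when creating its backing files.
# Setting it after ff is loaded and before ramclustR() is called is
# probably the safest order.
options(fftempdir = ff_tmp)

getOption("fftempdir")
```

This would not change the default inside RAMClustR itself; it only redirects the temporary files for the current R session.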
Yours,
Tobias
Hello,
Seems a recent commit (2438b68) is resulting in an error if normalize != "quantile" (ramclustR.R@461).
Thanks,
Rick
I just noticed that RAMClustR doesn't seem to have a license O.o
In the ramclustR.R function it was possible to provide a csv file as the input. It would be great to have this code now available in a separate function which loads the data from a feature csv and inits the pheno object from a metadata csv.
The function would then return a ramClustObj which has the data fields initiated in the same way as the rc.get.xcms.data function.
The ramclust.R file contains a function covering the whole workflow, but the rc.*.R files actually contain the same functionality in multiple steps, which is more convenient to test and maintain.
ramclust.R with the respective sub-steps of the workflow
Thanks @cbroeckl for all the help so far,
RC <- ramclustR(xcmsObj = xdata, ExpDes=experiment)
Error in if (!is.null(xcmsObj) & mslev == 2 & any(is.null(MStag), is.null(idMSMStag), :
missing value where TRUE/FALSE needed
In addition: Warning message:
In ramclustR(xcmsObj = xdata, ExpDes = experiment) :
NAs introduced by coercion
Is this because I filled in something incorrectly at experiment <- defineExperiment(csv = FALSE)?
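The message "missing value where TRUE/FALSE needed" is what R reports when the condition of if() evaluates to NA, which would fit the accompanying coercion warning. A minimal base R reproduction, with a made-up variable x standing in for whichever setting became NA:

```r
# as.numeric() on a non-numeric string yields NA, normally with the
# warning "NAs introduced by coercion" (suppressed here).
x <- suppressWarnings(as.numeric("one"))

# Using NA as an if() condition raises the error from the report.
result <- tryCatch(
  if (x == 2) "MS2 branch" else "MS1 branch",
  error = function(e) conditionMessage(e)
)
result

# isTRUE() treats NA as FALSE, so the check no longer stops.
safe <- if (isTRUE(x == 2)) "MS2 branch" else "MS1 branch"
```

Which field of the experiment definition is involved is not known from the report; checking the defineExperiment() entries for non-numeric text in numeric fields would be a reasonable first step.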
Hello @cbroeckl and all,
I know I already asked about it, but want to double check.
I am working with Waters MS/MSe data, which I converted into mzML format keeping all three channels in, so my MS and MSMS data are contained in a single mzML file. Then I processed that with XCMS3, which runs RT alignment and grouping on MS data, but then applies it to the MSMS layer as well.
Now I am trying to prepare my data for annotation using RAMClustR. I read the data in using rc.get.xcms.data, which asks for a name tag to the MSMS files, which I don't have since MSe data is in the same mzML files as the MS data. So I just skipped this parameter, which I think resulted in RAMClustR using only MS data for processing and preparing the spectra.
Is there a way around this? Or should I re-convert my data separating MS and MSe into different mzML files?
Thanks for all your help!
Best,
Lisa
Hi,
Sorry, it's me again...!
I'm wondering where the results of the collapse option operations are used afterwards: https://github.com/cbroeckl/RAMClustR/blob/master/R/ramclustR.R#L784
The ramclustObj$SpecAbund object does not appear to be used afterwards (nor does ramclustObj$SpecAbundAve).
Can you enlighten me, please?
@arpita-007 - i think this is a rare event coupled with imperfect code. the file that fails has exactly 2000 features, which happens to be what the default blocksize setting is. try setting the option in the ramclustr function: blocksize = 1200. i suspect it will run fine. let me know if this fixes it please!
Originally posted by @cbroeckl in #29 (comment)
Hi,
I use the function writemsp to import the full dataset to a spectra object. I notice that the precursor given to each compound differs from the precursor calculated with do.findmain.
In general, the precursor has a higher m/z, from the same MS1 group but with lower intensity.
Hello,
Excellent work on RAMClustR. I've been running into an error when trying to use RAMClustR. Both data from apLCMS and the data provided with the package have caused the same error. I was attempting to use MS1 data only to deconvolute isotopes, in-source fragments and additional adducts. I've been running the following:
res_1<- ramclustR (xcmsObj = NULL, ms = "MSdata.csv",
idmsms = NULL,
taglocation = "filepaths",
MStag = NULL, idMSMStag = NULL, featdelim = "_", timepos = 2,
st = 20, sr = 0.5, maxt = 20, deepSplit = FALSE,
blocksize = 2000, mult = 5, hmax = 0.3, sampNameCol = 1,
collapse = TRUE, mspout = FALSE, mslev = 1, ExpDes = NULL,
normalize = "TIC", minModuleSize = 2, linkage="average")
The function will run through the following steps:
calculating ramclustR similarity: nblocks = 6
finished:1 2 3 4 5 6
RAMClust feature similarity matrix calculated and stored: 0.3 minutes
RAMClust distances converted to distance object: 0.1 minutes
fastcluster based clustering complete: 0 minutes
And then produce the following error:
Error in .subset2(x, i, exact = exact) : subscript out of bounds
My R session information is below.
Thank you in advance for your help.
R version 3.1.2 (2014-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] ff_2.2-13 bit_1.1-12 CAMERA_1.22.0
[4] igraph_1.0.1 BiocInstaller_1.16.5 dynamicTreeCut_1.62
[7] fastcluster_1.1.16 xcms_1.42.0 Biobase_2.26.0
[10] BiocGenerics_0.12.1 mzR_2.0.0 Rcpp_0.12.0
[13] RAMClustR_0.2 devtools_1.9.1
loaded via a namespace (and not attached):
[1] acepack_1.3-3.3 cluster_2.0.3 codetools_0.2-14
[4] colorspace_1.2-6 curl_0.9.1 digest_0.6.8
[7] foreign_0.8-65 Formula_1.2-1 ggplot2_1.0.1
[10] graph_1.44.1 grid_3.1.2 gridExtra_2.0.0
[13] gtable_0.1.2 Hmisc_3.16-0 httr_1.0.0
[16] lattice_0.20-33 latticeExtra_0.6-26 magrittr_1.5
[19] MASS_7.3-43 memoise_0.2.1 munsell_0.4.2
[22] nnet_7.3-10 plyr_1.8.3 proto_0.3-10
[25] R6_2.1.0 RBGL_1.42.0 RColorBrewer_1.1-2
[28] reshape2_1.4.1 rpart_4.1-10 scales_0.2.5
[31] splines_3.1.2 stats4_3.1.2 stringi_0.5-5
[34] stringr_1.0.0 survival_2.38-3 tcltk_3.1.2
[37] tools_3.1.2
Hi @meowcat and @Huansi, you both have forks of RAMClustR.
Please note that @cbroeckl and I are currently transferring
the RAMClustR repository to Corey where it belongs:
https://github.com/cbroeckl/RAMClustR/
You might have to fork afresh after the move.
Yours, Steffen
@zargham-ahmad - can we put all the general functions into one file? For example, create_ramclustObj() is a function created in rc.get.df.data, but is called from rc.get.csv.data and rc.get.xcms.data. I am sure there are other functions which are central and used by many files, and i would like to be able to find those central functions more easily.
Hi Corey,
Seems a tiny bug was recently introduced at
Line 567 in 47620bf
Thanks,
Rick
Since the main ramclustR.R file has become somewhat obsolete with the new individual components, it would be good to mark it as deprecated or even remove the functionality from the package after making sure that everything is kept where it actually should be. Another option would be to have this function as a default workflow running the fundamental steps of RAMClustR, so keeping it intact as a main wrapper @cbroeckl ?
To make sure that the functionality is kept or equivalent, we need a test case which runs the individual steps and then we can make a comparison to the results created by the old ramclustR function.
RAMClustR has collected quite some dependencies and maybe some of them are by now outdated or no longer needed - I think it would make sense to go through the list of imported packages and see whether they are actually still required by the project @cbroeckl ?
The InterpretMSSpectrum package is not available on OSx, therefore it should move to Suggests and be imported via @concept in the respective functions so that it is referenced - the code in do.findmain.R should then only be executed if the package is present.
The BiocManager, stringi and xml2 packages are not imported, therefore should also be under Suggests.
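The usual base R pattern for running code only when a suggested package is installed is requireNamespace(). A minimal sketch, where run_if_available is a hypothetical wrapper and the function body stands in for the real do.findmain logic:

```r
# Hypothetical wrapper: run 'fun' only if the optional package is
# installed, otherwise skip with a message and return NULL invisibly.
run_if_available <- function(pkg, fun) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    message("Package '", pkg, "' is not installed; skipping.")
    return(invisible(NULL))
  }
  fun()
}

res <- run_if_available(
  "InterpretMSSpectrum",
  function() "findmain step would run here"
)
```

With this pattern the package can sit under Suggests in DESCRIPTION, and calls into it are written with the pkg::fun form inside the guarded branch.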
The package actually depends on R > 3.5.0, so it should also be noted in the package description.