ahmohamed / lipidr

Data Mining and Analysis of Lipidomics datasets in R

Home Page: https://www.lipidr.org/

License: Other

R 99.85% CSS 0.02% Dockerfile 0.13%

bioconductor r lipidomics

lipidr's Introduction

lipidr: Data Mining and Analysis of Lipidomics Datasets in R

See full guide at lipidr.org

Overall workflow

Input

Numerical Matrix

To use lipidr with a numerical matrix as input, you need two files:

A numerical table where lipids are rows and samples are columns. Lipid names should be in the first column, and sample names in the first row. (see example here)
A table with the sample annotation / groups, where the sample names are in the first column. Note that the sample names must be identical in the two files. (see example here)

lipidr can convert these 2 files to LipidomicsExperiment as follows:

d <- as_lipidomics_experiment(read.csv("data_matrix.csv"))
d <- add_sample_annotation(d, "data_clin.csv")
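Because the sample names must be identical in the two files, it can help to compare them before calling add_sample_annotation. The following is a minimal base-R sketch, not part of lipidr: the toy data frames stand in for data_matrix.csv and data_clin.csv, and the helper check_sample_names is a hypothetical name introduced only for illustration.

```r
# Toy stand-ins for data_matrix.csv and data_clin.csv (hypothetical values).
data_matrix <- data.frame(
  lipid = c("PC 32:0", "PE 34:1"),
  S1 = c(10.2, 5.1),
  S2 = c(11.0, 4.8),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
data_clin <- data.frame(
  Sample = c("S1", "S2 "),            # note the trailing space in "S2 "
  Group  = c("Control", "Treated"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: report names present in one table but not the other.
check_sample_names <- function(mat, annot) {
  matrix_samples <- colnames(mat)[-1]          # drop the lipid-name column
  annot_samples  <- as.character(annot[[1]])   # first column holds sample names
  list(
    missing_in_annotation = setdiff(matrix_samples, annot_samples),
    missing_in_matrix     = setdiff(annot_samples, matrix_samples)
  )
}

mismatch <- check_sample_names(data_matrix, data_clin)
print(mismatch)
```

Both elements of the returned list should be empty when the names agree. Note that read.csv() with its default check.names = TRUE rewrites column names such as "Sample 1" to "Sample.1", which is one way the two sets of names can drift apart.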

Export from Skyline

Here lipidr also requires 2 files:

Results exported from Skyline as a CSV file (see image below). (see example here)
A table / CSV file with the sample annotation / groups, where the sample names are in the first column. Note that the sample names must be identical in the two files. (see example here)

In lipidr:

d <- read_skyline("Skyline_export.csv")
d <- add_sample_annotation(d, "data_clin.csv")

LipidomicsExperiment Object

lipidr represents lipidomics datasets as a LipidomicsExperiment, which extends SummarizedExperiment, to facilitate integration with other Bioconductor packages.

Quality control & plotting

lipidr generates various plots, such as box plots or PCA, for quality control of samples and measured lipids. Lipids can be filtered by their %CV. Normalization methods with and without internal standards are also supported.
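To make the %CV filter concrete, here is a small base-R sketch of the underlying arithmetic (%CV = 100 * sd / mean per lipid). It is not lipidr's implementation; the intensity values and the 20% cutoff are assumptions chosen for illustration.

```r
# Toy intensity matrix: lipids in rows, QC samples in columns (hypothetical values).
areas <- rbind(
  "PC 32:0" = c(100, 102, 98, 101),
  "PE 34:1" = c(50, 80, 20, 65),
  "SM 40:1" = c(200, 210, 190, 205)
)

# Percent coefficient of variation for each lipid across the QC samples.
percent_cv <- apply(areas, 1, function(x) 100 * sd(x) / mean(x))

cv_cutoff <- 20   # assumed threshold, chosen for illustration
kept <- names(percent_cv)[percent_cv <= cv_cutoff]

print(round(percent_cv, 1))
print(kept)
```

Lipids whose %CV exceeds the cutoff (here "PE 34:1") would be dropped before downstream analysis.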

Univariate Analysis

Univariate analysis can be performed using any of the loaded clinical variables, and the results can be readily visualized as volcano plots. Multi-group comparisons and adjustment for confounding variables are also supported (refer to the examples on www.lipidr.org). A novel lipid set enrichment analysis is implemented to detect preferential regulation of certain lipid classes, total chain lengths or unsaturation patterns. Plots for visualizing enrichment results are also implemented.

Multivariate Analysis

lipidr implements PCA, PCoA and OPLS(DA) to reveal patterns in data and discover variables related to an outcome of interest. Top associated lipids as well as scores and loadings plots can be interactively investigated using lipidr.
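For readers unfamiliar with the terms, the sketch below shows what "scores" and "loadings" are, using base R's prcomp on a toy matrix. This is only an illustration of PCA in general; lipidr's own PCA may rely on a different implementation, and the data here are random numbers, not real measurements.

```r
set.seed(1)
# Toy matrix: 6 samples (rows) by 4 lipids (columns), hypothetical log-scale values.
x <- matrix(
  rnorm(24), nrow = 6,
  dimnames = list(paste0("S", 1:6), c("PC 32:0", "PE 34:1", "SM 40:1", "TG 48:1"))
)

pca <- prcomp(x, center = TRUE, scale. = TRUE)

scores   <- pca$x          # one row per sample: coordinates on each component
loadings <- pca$rotation   # one row per lipid: contribution to each component

print(dim(scores))
print(dim(loadings))
```

A scores plot places samples in component space, while a loadings plot shows which lipids drive each component.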

Install lipidr

From Bioconductor

In R console, type:

if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("lipidr")

Install development version from GitHub

In R console, type:

library(devtools)   
install_github("ahmohamed/lipidr")

Using Docker

You can use lipidr in a containerized form by pulling the image from docker hub.

docker pull ahmohamed/lipidr
docker run -e PASSWORD=bioc -p 8787:8787 ahmohamed/lipidr:latest

RStudio will be available in your web browser at http://localhost:8787. The USER is always rstudio. The password in the above command is given as bioc, but it can be set to anything. For more information on usage, refer to the Bioconductor help page.

You can access your local files by mapping to the container:

docker run -e PASSWORD=bioc -p 8787:8787 \
  -v "path/to/data_folder":"/home/rstudio/data_folder" \
  ahmohamed/lipidr:latest

You should see data_folder in your working directory.

lipidr's People

Contributors

Stargazers

Watchers

Forkers

jeffreymolendijk sperritt arunabio stolltho nilshoffmann felipe-mansoldo ningbioinfo aliyoussef96 tao0922 rmflight hasihays yonghuidong a1aks

lipidr's Issues

Error in add_sample_annotation

Hi,

I am trying to have lipidr accept my csv with the sample annotations, however I keep getting this error:

> export_annot_path = paste(dir_path,dir_name,"/species_norm_annotations.csv",sep="")
> d <- add_sample_annotation(d, export_annot_path)
Error in .check_sample_annotation(data, annot) : 
  All sample names must be in the first column or a column named "Sample"

I have made sure to call my first column: "Sample". Do I need to follow a specific name for each sample? Is there a way I can have lipidr tell me what sample names it currently has (or is expecting) from the experimental dataset that it took just prior to this?

Annotations csv that is throwing errors: species_norm_annotations.csv

Thanks for your help!

Maybe consider switching the colors in the plot_chain_distribution function?

Hi! I am used to seeing red as increase and blue as decrease in heatmaps, and the function https://www.lipidr.org/reference/plot_chain_distribution.html shows increases in blue. Could you consider changing this? Very nice package overall, btw!

mva function issue

Hello,

Great code! But mva() is not working properly for method = "PCA". It gives a warning and does not work.

Warning messages:
1: 'info.txtC = NULL' argument value is deprecated; use 'info.txtC = 'none'' instead.
2: 'fig.pdfC = NULL' argument value is deprecated; use 'fig.pdfC = 'none'' instead.

Thanks!

Luciana

names not following the pattern 'CLS xx:x/yy:y'

I have names like PE P-18:1/18:2, TG 42:0-FA14:0, SM d18:1/20:1. How should I convert these? Thanks!

Lipid Name Parsing for UCLA Core Mass Spec XLSX Report

When inputting the data matrix csv, I am getting an error and cannot continue as this message is thrown:

> lipidr:::.have_lipids_molecules(expt_df[[1]])
[1] FALSE
> annot <- lipidr::annotate_lipids(expt_df[[1]])
Warning in lipidr::annotate_lipids(expt_df[[1]]) :
  Some lipid names couldn't be parsed because they don't follow the pattern 'CLS xx:x/yy:y' 
    PE O-16:0/16:0, PE O-16:0/16:1, PE O-16:0/18:0, PE O-16:0/18:1, PE O-16:0/18:2, PE O-16:0/18:3, PE O-16:0/20:1, PE O-16:0/20:2, PE O-16:0/20:3, PE O-16:0/20:4, PE O-16:0/22:4, PE O-16:0/22:5, PE O-16:0/22:6, PE O-18:0/16:0, PE O-18:0/16:1, PE O-18:0/18:1, PE O-18:0/18:2, PE O-18:0/18:3, PE O-18:0/20:2, PE O-18:0/20:3, PE O-18:0/20:4, PE O-18:0/22:4, PE O-18:0/22:5, PE O-18:0/22:6, PE P-14:0/18:0, PE P-14:0/18:1, 
PE P-16:0/16:0, PE P-16:0/16:1, PE P-16:0/18:0, PE P-16:0/18:1, PE P-16:0/18:2, PE P-16:0/20:1, PE P-16:0/20:2, PE P-16:0/20:3, PE P-16:0/20:4, PE P-16:0/22:4, PE P-16:0/22:5, PE P-16:0/22:6, PE P-18:0/16:0, PE P-18:0/16:1, PE P-18:0/18:0, PE P-18:0/18:1, PE P-18:0/18:2, PE P-18:0/18:3, PE P-18:0/20:2, PE P-18:0/20:3, PE P-18:0/20:4, PE P-18:0/22:4, PE P-18:0/22:5, PE P-18:0/22:6, PE P-18:1/16:0, PE P-18:1/16:1, PE P-18:1/18:0, PE P-18:1/18:1, PE P-18:1/18:2, PE P-18:1/18:3, PE P-18:1 [... truncated]

The lipid names are coming from UCLA Core's Mass Spec lab, so I think they're somewhat common.

Here's a link to the annot list with the strings of the unreadable lipid names:
lipids_annot_list.csv

If I wrote some python to change the string from 'PE O-16:0/16:0' to 'PE_O- 16:0/16:0', would that allow lipidr to parse the name? How would you recommend I name these lipids so lipidr can successfully parse them?

Thank you so much for developing this awesome tool! I'm really excited to use it.

Add annotation with numeric scores

Can I add an annotation with numeric values?

for instance "risk score: 0,1,2,3" :

Sample Risk Score
Sample 1 - - - 0
Sample 2 - - - 1
Sample 3 - - - 2
Sample 4 - - - 1
.
.
etc

or it is just for categorical data?

thanks in advance for your answer.

Volcano plot log2FoldChange

Dear Developer,

I wonder if it is possible to use as logFC cutoff log2(1.5) instead of log2(2) ? Is there a way to set this directly in lipidR plot_results_volcano?

Kind regards
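As a worked example of the arithmetic in this question: log2(1.5) is about 0.585, so a cutoff of log2(1.5) keeps lipids that change at least 1.5-fold in either direction. The base-R sketch below applies that cutoff to a toy results table; the values are hypothetical and the column names logFC and adj.P.Val are assumed to follow limma-style output.

```r
# Toy results table (hypothetical values, limma-style column names assumed).
res <- data.frame(
  Molecule  = c("PC 32:0", "PE 34:1", "SM 40:1"),
  logFC     = c(0.70, -0.40, -1.20),
  adj.P.Val = c(0.01, 0.03, 0.20),
  stringsAsFactors = FALSE
)

fc_cutoff <- log2(1.5)   # about 0.585, i.e. a 1.5-fold change

# Keep lipids passing both the fold-change and the adjusted p-value threshold.
hits <- res[abs(res$logFC) > fc_cutoff & res$adj.P.Val < 0.05, ]

print(round(fc_cutoff, 3))
print(hits$Molecule)
```

Another issue on this page calls significant_molecules(de_results, p.cutoff = 0.05, logFC.cutoff = 1), so passing logFC.cutoff = log2(1.5) there would be the analogous setting. Whether plot_results_volcano exposes the same cutoff is not confirmed here.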

add_sample_annotation

A note on add_sample_annotation: when you add a sample annotation, DON'T pass the name of the object in the environment. You need to pass the name of the file. For instance:

d <- add_sample_annotation(d, anotation)

will not work; you will get the error:

Error in .check_sample_annotation(data, annot) :
All sample names must be in the first column or a column named "Sample"

On the other hand,

d <- add_sample_annotation(d, "anotation.csv")

will work.

This is not explained in the documentation, but it should be.

Dealing with missing values

Hi,
Does this package deal with missing values in lipidomics data? They are often due to quantity of a lipid below the detection limit of the device, and are therefore informative. I'm trying to figure out the best way of imputing them. I think this would be a nice feature of a package for lipidomics analysis.
Thanks!
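For context, one simple approach sometimes used for values missing because they fall below the detection limit is to replace them with a fraction of the smallest observed value for that lipid. The base-R sketch below uses half the minimum; it is an illustration of that idea only, not a description of what lipidr does, and the helper name impute_half_min and the data are hypothetical.

```r
# Toy intensity matrix with missing values (hypothetical numbers).
areas <- rbind(
  "PC 32:0" = c(100, NA, 98),
  "PE 34:1" = c(NA, 40, 44)
)

# Assumed strategy: replace NA with half of that lipid's smallest observed value.
impute_half_min <- function(x) {
  x[is.na(x)] <- min(x, na.rm = TRUE) / 2
  x
}

# apply() returns lipids in columns, so transpose back to lipids in rows.
imputed <- t(apply(areas, 1, impute_half_min))
print(imputed)
```

Whether such a fixed substitution is appropriate depends on how much of the data is missing and why, so it is best treated as a starting point rather than a recommendation.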

What is ACar?

Hello,

I ran an LSEA analysis, and an "ACar" set appeared in the output. Also, there was not a single significant result. I have several issues concerning lipid name parsing, and I'm trying to figure them out with the guys from the facility.

Thank you again,

Luciana

Problem when loading the lipidr library

Hey there,

Every time I try to load the package, the dependencies seem to be loaded correctly, but then this happens:

Error: package or namespace load failed for ‘lipidr’ in dyn.load(file, DLLpath = DLLpath, ...): unable to load shared object '/Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/gmm/libs/gmm.so': dlopen(/Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/gmm/libs/gmm.so, 0x0006): Library not loaded: /opt/R/arm64/gfortran/lib/libgomp.1.dylib Referenced from: <47242657-5A5D-3982-936B-398527D642B4> /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library/gmm/libs/gmm.so Reason: tried: '/opt/R/arm64/gfortran/lib/libgomp.1.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/opt/R/arm64/gfortran/lib/libgomp.1.dylib' (no such file), '/opt/R/arm64/gfortran/lib/libgomp.1.dylib' (no such file), '/usr/local/lib/libgomp.1.dylib' (no such file), '/usr/lib/libgomp.1.dylib' (no such file, not in dyld cache)

Do you have any suggestion to overcome this problem?
I'm using a MacBook Pro 16" M1 Max running macOS 13.3 with R 4.2.3

Best regards,
Davide

296 non-parsed molecules

Hi,
I have many non-parsed molecules. Is there a way to sort through them, or to change the names into a format lipidr can read?
e.g. non-parsed "Cer[NS] d36:2" "Cer[NS] d38:0" "Cer[NS] d38:2"
but "Cer[NS] d32:1" is read fine…

Thanks,
Chloe

lipid name parsing for class(O+P-chain)

Hi Ahmed,

It seems that lipidr can parse molecules named with class(P-chain) or class(O-chain), which is quite good, but not with class(O+P-chain). sorry if it is a silly question, I'm new to lipidomics.

thanks for your time,
Yao

Batch effect

I wonder how lipidr can manage the batch effect.
I have a dataset of more than 1500 lipids and 200+ samples, but this is a merge of 6 batches.

thanks for your answer

Dealing with multiple molecules with the same total chain length and unsaturation

Is there a way to generate heat maps of isomeric lipids in a similar way as plot_chain_distribution(de_results)?

PCA loading plot

Hi, I would like to know if it is possible to obtain the loading plot in the PCA analysis. I can only get the Score plot. It would help me a lot to have a more accurate view of the data. Thanks!

could not find function "plot_enrichment"

Hi,

I'm having trouble obtaining the desired output for an enrichment analysis. I did the following:

de_results <- de_analysis(d, Core - Monolayer)
Warning message:
Zero sample variances detected, have been offset away from zero
enrich_results = lsea(de_results)
Warning messages:
1: In fgsea::fgsea(pathways = pathways, stats = stats, minSize = minSize, :
You are trying to run fgseaSimple. It is recommended to use fgseaMultilevel. To run fgseaMultilevel, you need to remove the nperm argument in the fgsea function call.
2: In preparePathwaysAndStats(pathways, stats, minSize, maxSize, gseaParam, :
There are ties in the preranked stats (0.73% of the list).
The order of those tied genes will be arbitrary, which may produce unexpected results.
3: In serialize(data, node$con) :
'package:stats' may not be available when loading
4: In serialize(data, node$con) :
'package:stats' may not be available when loading
5: In serialize(data, node$con) :
'package:stats' may not be available when loading
6: In serialize(data, node$con) :
'package:stats' may not be available when loading
7: In serialize(data, node$con) :
'package:stats' may not be available when loading
8: In serialize(data, node$con) :
'package:stats' may not be available when loading
significant_lipidsets(enrich_results)
$Core - Monolayer
[1] "Class_LPI" "total_cs_0" "Class_HexCer"

plot_enrichment(de_results, significant_lipidsets(enrich_results), annotation="class")
Error in plot_enrichment(de_results, significant_lipidsets(enrich_results), :
could not find function "plot_enrichment"

Thanks,
Fernando Tobias

Showing only a subset of lipid classes in the full dataset

Hi,

Is there a way to just show a particular class of lipids in either the enrichment plots or chain distribution heatmaps without removing them in the as_lipidomics_experiment file?

For example, I'd like to only see PC, TAG, Cer on my dataset which contains other lipid classes.

Thanks again,
Fernando

argument "uq_expr" is missing

Hi Ahmed,
when I try to execute the function plot_samples, I receive the following error
Error in is_call(uq_expr, c("after_stat", "after_scale")) :
argument "uq_expr" is missing, with no default
Do you know what the problem could be?
Most of the other functions (like plot_lipidclass) are working fine.
Thanks and best,

Annika

lsea issue

Hi there,
I'm getting the following error, even with the breast cancer dataset.
Any idea how to fix it? thanks in advance

enrich_results = lsea(two_group, rank.by = "logFC")
Error in fgsea::fgsea(pathways = pathways, stats = stats, minSize = minSize, :
argument "nperm" is missing, with no default
In addition: Warning message:
In fgsea::fgsea(pathways = pathways, stats = stats, minSize = minSize, :
There are ties in the preranked stats (9.04% of the list).
The order of those tied genes will be arbitrary, which may produce unexpected results.

cannot be installed in MacOS RStudio

Hi! I am trying to install it in RStudio on macOS, but it seems that it cannot be installed. Any solutions?

Finding the current version of lipidr + lipid names issue

I have been having an issue using the "as_lipidomics_experiment" function due to lipid names. Of the 1578 lipids in my data set, only 370 are matched correctly when I check using the "annotate_lipids" function.

Reading responses to other people having this issue, one problem could be that I am using an old version of lipidr. Based on a response to an issue last year, it seems lipidr is currently on version 2.4 or higher? If I install it using Bioconductor or the suggested devtools command, I get version 2.13 or 2.15 (note: I updated to the latest version of R to make sure this wasn't the issue). To try to get the latest version, I used the following command: devtools::install_github("ahmohamed/lipidr@*release"). However, this only got me to version 2.2. The names issue still persists in this version.

Could you tell me what the current version is? And how I can go about installing it?

Normalization - Log transformation

I am dealing with a lipidomics dataset extracted from MS-DIAL that consists of peak area data normalized using the LOESS algorithm. While several lipids showed significant results in the univariate analysis from MS-DIAL, I cannot reproduce these results. I would like to ask:
1) Which variation of the t-test is performed, and which method is used to adjust p-values in de_analysis()?
2) Which group is considered the reference in de_analysis(lpd, vitE - vitE_SPL, measure = "Area", group_col = "Group")?
3) What is the base of the logFC obtained in the results (I assumed e)?
4) Do you have any suggestions on modifying the data, e.g. log transformation or some other type of normalization?
5) How do the functions set_logged and set_normalized work, i.e. what values does the argument "val" need?

With respect

error in running the Vignette example

Hi, I am trying to run your vignette but ran into a problem early on.
line #60

> lipidr::list_mw_studies(keyword = "lipid")
Error: 'list_mw_studies' is not an exported object from 'namespace:lipidr'

Could you let me know what the problem was?

plot_molecules in vignette: colors for molecule and class do not match

Hi,
in the targeted workflow vignette (https://www.lipidr.org/articles/workflow.html)
the calls to plot_molecules produce a mismatch between the Molecule name (e.g. on species level) and the Class that is used for the fill color:
plot_molecules(d_qc, "sd", measure = "Retention Time", log = FALSE)

I traced this to an issue with the order of scale_x_discrete(labels=as.character(dlong$Molecule)):
https://github.com/ahmohamed/lipidr/blob/master/R/plot.R#L293
This also appears in the other related plotting functions for cv and boxplot within plot_molecules.

The following (simplified) code seems to produce the expected result for the boxplot example:

ddf <- lipidr:::to_long_format(d)
ggplot(data=ddf, mapping=aes(x=Molecule, y=log2(Area), fill=Class)) + geom_boxplot() + facet_wrap(~filename, scales = "free_y") + coord_flip()

What kind of numeric measurements does as_lipidomics_experiment expect/assume?

It is not clear from the documentation what kind of numeric measurements as_lipidomics_experiment expects/assumes. Is it peak area, [Mol], or molar fraction [%Mol]?

Surely the distributional properties of the statistical tests may change depending on the type of measurement (e.g., raw vs %).

Showing plot_chain_distribution(data) for multiple groups at the same time

Hello-
In the vignettes, multiple volcano plots from 2 or more groups are shown at the same time. Is it possible to do the same with plot_chain_distribution(data) for 2 or more groups, with the log2FC scaling kept the same across the groups?

I have made separate plot_chain_distribution(data) plots for 3 groups of my data, but I am having trouble showing them with the same color scaling. Thank you.

Issues Plotting

When using the "plot_samples" functions I keep running into the following error:
Error in is_call(uq_expr, c("after_stat", "after_scale")) :
argument "uq_expr" is missing, with no default

Do you know why this may be the case?

could not find function "plot_enrichment"

Dear Developer,
Thank you for developing R package. I installed lipidr version 2.12.0, however, I got an error
"Error in plot_enrichment(de_results, significant_lipidsets(enrich_results), :
could not find function "plot_enrichment""

when I run on the example data.

Could you give me some clue? Thanks

Is normalization necessary for differential analysis?

Dear Developer,
I have a question about normalization. I want to run a differential analysis on lipidomics data.
When I used PQN to normalize my data, I got very few differential lipids in case vs control.

Someone suggested that I can use the raw data. Is that correct? I am quite new to lipidomics data.
Each dataset has only ~70 lipids, so I am thinking about using the p-value rather than the adjusted p-value to select differential lipids. Is my reasoning correct?

Thanks.
Have a good weekend.

Question on the calculation of p values in volcano plots

Hi,

I am looking on lipidr.org, however I am not able to find a page that describes how the p values for the volcano plots are being generated. Can you please clarify?

Thank you!

-Vincent

a question about lipid name

Hi Ahmed,
Thank you for developing this great tool.
I have a question about lipid names. Based on the tutorial, the lipid name must be CLS xx:y. I am new to lipidomics data. What do xx and y mean? I saw some lipid names like "DG 14:0/18:1/0:0"; what do those numbers mean?
If I have a lipid name such as "(+/-)14,15-EET_1" or "9-HpODE(2)", how can I convert it to the format lipidr requires?
Thank you.
Look forward to your reply.
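In common lipid shorthand, each fatty acyl chain is written as carbons:double bonds, so in "DG 14:0/18:1/0:0" the class is DG and the chains have 14 carbons with 0 double bonds, 18 carbons with 1 double bond, and an empty position (0:0). The base-R sketch below splits names of this simple form and sums the chains. It is an illustration only: parse_simple_lipid is a hypothetical helper, not lipidr's parser, and it deliberately rejects anything that does not match the simple 'CLS xx:x/yy:y' shape.

```r
# Hypothetical helper, not lipidr's parser: split "CLS xx:x/yy:y" style names.
parse_simple_lipid <- function(name) {
  pattern <- "^([A-Za-z0-9]+) ([0-9]+:[0-9]+(/[0-9]+:[0-9]+)*)$"
  if (!grepl(pattern, name)) {
    return(NULL)  # name does not follow the simple pattern
  }
  cls    <- sub(pattern, "\\1", name)
  chains <- strsplit(sub(pattern, "\\2", name), "/")[[1]]
  # One row per chain: column 1 = carbons, column 2 = double bonds.
  parts  <- do.call(rbind, lapply(strsplit(chains, ":"), as.integer))
  list(
    class        = cls,
    total_length = sum(parts[, 1]),  # carbons summed over all chains
    total_unsat  = sum(parts[, 2])   # double bonds summed over all chains
  )
}

dg <- parse_simple_lipid("DG 14:0/18:1/0:0")
str(dg)

# Names outside the simple pattern are returned as NULL by this sketch.
print(is.null(parse_simple_lipid("9-HpODE(2)")))
```

Names such as "9-HpODE(2)" carry no class/chain structure of this kind, so a pattern-based approach like this one cannot describe them.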

Normalized?

Just to make sure,

when as_lipidomics_experiment asks if the data is "normalized", it is NOT referring to "standardization", right?

with "standardization", I mean a mean of 0 (zero) and SD of 1.

I ask this because in data science it is pretty common to use both terms interchangeably

no TG nor DG in list

Hi!
Thanks for this package. I am wondering if it can be true that neither triglycerides nor diglycerides is included in the lipid reference list, and hence cannot be analyzed?

Error in .check_sample_annotation(data, annot)

I am new to lipidR (and R, tbh). I created a csv file for my MS lipid profile data and another for annotation following the guidelines. LipidR object was created with no issues. Annotation table has 1st column titled "Sample" and sample names match data table.

When running d <- add_sample_annotation(d, "data_clin.csv") I get the following error:
Error in .check_sample_annotation(data, annot) :
All sample names must be in the first column or a column named "Sample"

First column is samples and is named "Sample". Still can't get past this error because I don't know how to fix this. I am losing my mind. What can I do to annotate my table? I am screaming.

If this is not the right place to ask this, I would appreciate if anyone can point me to the right direction. Thanks.

Show labels on plot_mva

Is there any possibility to show the sample labels when running plot_mva()?

I was able to do that by catching some of the code inside function plot_mva() as follows

    ret <- .get_mds_matrix(mvaresults, components, color_by)
    mds_matrix <- ret$mds_matrix
    color_by <- ret$color_by
    cols <- colnames(mds_matrix)
ggplot(mds_matrix, aes_string(cols[[2]], cols[[3]], 
        label = "Sample", color = color_by)) + geom_point(size = 3, 
        pch = 16) + geom_text(vjust = -0.5, size = 3, color = "black")

so this piece of code shows my labels but the plot is not nice as running:

plot_mva(mvaresults, color_by="Age", components = c(1,2))

Thank you so much.

annotate_lipids error

Trying this example

lipid_list <- c( "Lyso PE 18:1(d7)", "PE(32:0)", "Cer(d18:0/C22:0)", "TG(16:0/18:1/18:1)" )
ann<-lipidr::annotate_lipids(lipid_list)

I get this error message:

Show in New Window Error in UseMethod("filter") : no applicable method for 'filter' applied to an object of class "NULL"
3. filter(., Molecule %in% molecules)
2. def[, c("Molecule", "clean_name", "ambig", "not_matched", "istd")] %>% filter(Molecule %in% molecules)
1. lipidr::annotate_lipids(lipid_list)

Happy to help resolve it.

Error when running

Dear Ahmohamed,

Thank you for a great tool and easy to follow examples.

I am new to R and to lipids, so I am sorry if this is a problem I should have been able to fix myself; I feel like I might be missing an update in R or something. However, both with my own dataset and when I follow along with your example "Data mining using lipidr" (https://www.lipidr.org/articles/examples/mw_integration.html), I keep getting an error I do not understand when I run plot_samples(d, "tic") or plot_samples(d, "boxplot").

The error I get for both of them is:
Error in is_call(uq_expr, c("after_stat", "after_scale")) :
argument "uq_expr" is missing, with no default

Hope you can help,
Kind regards
Kristina

Suggestion for Unsaturation analysis and chain length

Hi translational med postdoc here!

One quick suggestion. It would be extremely beneficial if:

plot_enrichment(
de.results,
significant.sets,
annotation = c("class", "length", "unsat"),
measure = "logFC"
)

could also select the lipid class we want to analyze for unsaturation and chain length.

For instance, from a biochemical point of view it would be fair to analyze chain length and unsaturation enrichment in triglycerides (TG) only.

Hope this can be implemented.

Thxs for all!!

lipid nomenclature

Lipids such as Coenzyme Q are not parsed because they don't follow the pattern 'CLS xx:x/yy:y'. Is there a way around this?

Incorporating the multi-level experiments from Limma into Lipidr

Hi, I am using lipidr to identify differentially expressed lipids in my experimental design, and I have difficulty incorporating some of the analysis used in limma into lipidr. My experimental design is as follows: Group (control, aerobic exercise, and resistance exercise), time course (pre, 0 min, 30 min), and 8 subjects (all subjects went through all the procedures, paired design). I ran all the initial parts of the package, but the statistical part is giving me trouble.

#Design
# f must be a factor so that levels(f) returns the group names
f <- factor(paste(d_normalized$Group, d_normalized$Time, sep = "."))
design <- model.matrix(~0+f, data = colData(d_normalized))
colnames(design) <- levels(f)

#Then estimate the correlation between measurements made on the same subject
# (thanks, Ahmed, for showing me how to extract the matrix)
mat <- assay(d_normalized, "Area")
corfit <- duplicateCorrelation(mat, design, block = d_normalized$Subject)
corfit$consensus

#Then this inter-subject correlation is input into the linear model fit:
fit <- lmFit(mat, design, block = d_normalized$Subject, correlation = corfit$consensus)

According to the limma package, I would now have to make contrasts and then compute them and the moderated t-tests using contrasts.fit and eBayes. How would I perform those steps after running the lipidr analysis:

de_results= de_design(data=d_normalized,
design = design,
ALL contrasts,
measure="Area")

Thanks in advance

Volcano plot eBayes f or t statistic

Hi Ahmed,

when creating a volcano plot (plot_results_volcano), lipidr calls the eBayes function. I don't understand whether a moderated f or t statistic is computed/used. Can you tell me which one it is?
Thank you for your time!
Best,

Annika

How to extract the area matrix?

Hi there,
I am having trouble with extracting the data matrix after doing the analysis.
Do you suggest any function (like getAssayData) to get the data?
Thank you.

lipidR error when loading package

Hi, I just installed lipidR but when I load the package I get this message:

Error: package or namespace load failed for ‘lipidr’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
namespace ‘pillar’ 1.8.1 is already loaded, but >= 1.9.0 is required

I am using the latest version of RStudio (build 386) and R 4.2.2.

(I am new on GitHub, sorry if I am missing any info here).

Does anyone know how to fix this error?
Thank you

LogFC calculations

Hi Ahmed,
I am trying to calculate the LogFC between two treatment groups: Tm100 and Tm1. As per the result of de_analysis(), the values for significant molecules are in the range of 1e+04 to 1e+08, which is absolutely not possible just by looking at their raw "Area" values.

I have run normalization on two different files and then combined them into one to run de_analysis(). Please help me with this. Data and annotation files are attached.
jh_an.csv
jh_an2.csv
jh_data.csv
jh_data2.csv
Here is the code used:

library(lipidr)
library(data.table)
library(limma)
library(ggplot2)
Data1 <- read.csv("jh_data.csv")
#Data$Classnew <- paste(Data$Lipids, Data$Class, sep = "_")
Data1 <- Data1[,c(2:10)]
d1 <- as_lipidomics_experiment(Data1)
Ann1 <- read.csv("jh_an.csv")
d1 <- add_sample_annotation(d1,Ann1)
d1_normalized = normalize_pqn(d1, measure = "Area", exclude = "blank", log = TRUE) 
plot_samples(d1_normalized, "boxplot")
Data2 <- read.csv("jh_data2.csv")
Data2 <- Data2[,c(2:10)]
d2 <- as_lipidomics_experiment(Data2)
Ann2 <- read.csv("jh_an2.csv")
d2 <- add_sample_annotation(d2,Ann2)
d2_normalized = normalize_pqn(d2, measure = "Area", exclude = "blank", log = TRUE) 
plot_samples(d2_normalized, "boxplot")
R <- cbind(d1,d2) # combining the two lipidomics experiments
##Multivariate analysis
mvaresults = mva(R, measure="Area", method="PCA")
plot_mva(mvaresults, color_by="Sample.Type", components = c(1,2))
#Differential analysis
de_results = de_analysis(
  data=R, Tm100-Tm1, 
    measure="Area")
head(de_results)
significant_molecules(de_results, p.cutoff = 0.05, logFC.cutoff = 1)
plotgg <- plot_results_volcano(de_results, show.labels = TRUE)

Lipidr query

Hi Ahmed, First of all, thanks for creating such a powerful analysis tool.
I am a postdoc at Optometry, UC Berkeley and our lab routinely performs lipidomics for detecting PUFAs and their metabolites like Lipoxins, resolvins, and maresins, etc using MRM. Currently, we use Analyst software for the peak integration and use peak area to calculate concentrations based on standards in Microsoft excel. We wanted to incorporate the PCA analysis for the larger datasets and analyze the data without any bias. That's where your tool is very promising. I wanted to ask you a few doubts:

Do we need to convert all the lipid names to the format mentioned CLS xx:y since some of the LTB4 FA 20:4;O2 is according to the format.
Since our analysis is targeted and focussed on one classification, is it possible to analyze our data having different groups and concentration treatments for different tissues?
Since we use Analyst software for the peak integration and are not experienced in using Skyline, is it possible to create input files in the correct format because the Analyst creates one parameter (Area, RT, etc) at a time? Do we need to give multiple files for Area, Height, RT to lipidr?
Please do let me know so that I can proceed with using lipidr and our analysis.
Thanks in advance for your help.
Regards,
Shubham

Issue with "as_lipidomics_experiment" function

Hi Ahmed,

I'm a 3rd year PhD Student new to coding with R. Unfortunately I am encountering an issue with the lipidr package. In response to:

d <- as_lipidomics_experiment(read.csv(dm_path))

R returns:
Error in as_lipidomics_experiment(read.csv(dm_path)) :
could not find function "as_lipidomics_experiment"

Any help would be much appreciated.

Thank you,
Rotimi

Two factor analysis with interaction

In my lipidomic study, I have treated cells with siRNA to reduce the expression of gene A and gene B. Some cells were untreated, some cells were treated with siRNA for gene A, some cells were treated with siRNA for gene B, and some cells were treated with siRNA for gene A + siRNA for gene B.

Does lipidr allow for two-factor analysis (with interaction) by coding the treatment group as two columns (A and B) containing 0 and 0, 1 and 0, 0 and 1, or 1 and 1 for each sample depending on which of the treatments above was applied?

Is it possible to look at interaction between gene A and gene siRNA knockdown?

I understand I can just have one treatment group with 4 levels (untreated,a,b,a+b) but this would remove valuable information from the model, which is likely to hurt power.
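To illustrate the coding described in this question, the base-R sketch below builds a design matrix with two 0/1 columns and their interaction using model.matrix. The annotation table is hypothetical, and how lipidr would consume such a design (for example through the design argument of the de_design call quoted in another issue) is not confirmed here.

```r
# Hypothetical annotation: two 0/1 columns for the two siRNA treatments.
annot <- data.frame(
  Sample = paste0("S", 1:8),
  A = c(0, 0, 1, 1, 0, 0, 1, 1),
  B = c(0, 0, 0, 0, 1, 1, 1, 1)
)

# ~ A * B expands to intercept + A + B + A:B (the interaction term).
design <- model.matrix(~ A * B, data = annot)

print(colnames(design))
print(design)
```

In this coding the "A:B" column is 1 only for samples that received both siRNAs, so its coefficient measures how far the combined effect departs from the sum of the two single effects.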

de_analysis, group_col names

Hi!,

having this in the documentation: "de_analysis(data, ..., measure = "Area", group_col = NULL)"

I created a lipidomics experiment and then added the annotation ("d1"). The data set of the annotation has two columns (the first is "Sample" and the second is "fibrosis_score_adj", this score are numerical ranges like 0-2 vs 3-4) when I set the name of the column in the "group_col" argument:

de_analysis(d1, 0-2 - 3-4, measure = "Area",
group_col = fibrosis_score_adj)

the output is:

Error in de_analysis(d, 0 - 2 - 3 - 4, measure = "Area", group_col = "fibrosis_score_adj") :
No contrasts provided

How can I make de_analysis work properly?

and also, what do you recommend to put first in the "..." argument of de_analysis: the control or the experimental?

kind regards

by the way: Yes, the categories inside "fibrosis_score_adj" are "0-2" and "3-4"
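The error message echoes the contrast as 0 - 2 - 3 - 4, which suggests the group labels were read as arithmetic rather than as names. One possible workaround, offered as an assumption rather than a confirmed fix, is to rename the groups to syntactically valid R names before adding the annotation. The base-R sketch below shows the problem and two ways of renaming; the label names score_0_2 and score_3_4 are hypothetical.

```r
# Group labels as they appear in the annotation column.
groups <- c("0-2", "3-4", "0-2", "3-4")

# Unquoted, R evaluates the labels as a subtraction chain.
arithmetic <- 0-2 - 3-4
print(arithmetic)

# Option 1: let R derive valid names automatically.
safe <- make.names(groups)
print(unique(safe))

# Option 2: assign readable labels by hand (hypothetical names).
relabelled <- factor(groups, levels = c("0-2", "3-4"),
                     labels = c("score_0_2", "score_3_4"))
print(levels(relabelled))
```

With labels like these, a contrast written as score_0_2 - score_3_4 can no longer be mistaken for a numeric expression.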

Can't make a lipidomics experiment due to lipid names

While trying to create a lipidomics experiment from a csv I have loaded into a tibble,

exp <- as_lipidomics_experiment(exp_df, logged = FALSE, normalized = TRUE)

I keep getting this error:

Error in as_lipidomics_experiment(exp_df, logged = FALSE, normalized = TRUE) : Data frame does not contain valid lipid names. Lipids features should be in rownames or the first column.

My first column is a character vector with names that look like this:

> sample(exp_df$lip_name, 50)

 [1] "TG 16:0/12:0/20:3" "MePC 36:3"         "SM 40:1"           "TG 25:1/18:1/18:3" "LPC 18:2"          "PC 18:4/16:0"     
 [7] "Cer 19:0/24:1"     "PI 16:0/18:1"      "LPC 17:1"          "Cer 19:1/25:0"     "LPC 16:1"          "MePC 37:3"        
[13] "TG 20:0/16:0/18:0" "TG 14:0/20:5/22:6" "PE 18:2/20:5"      "TG 30:0/18:0/18:0" "SM 42:1"           "Hex2Cer 18:1/22:0"
[19] "TG 20:4/22:6/22:6" "LPC 18:3"          "SM 30:0"           "ChE 24:1"          "TG 18:1/18:2/21:1" "SM 38:1"          
[25] "TG 16:1/18:2/18:2" "dMePE 20:2/22:6"   "PC 20:3/18:2"      "PC 14:0/22:6"      "LPC 20:1"          "SM 31:1"          
[31] "MePC 35:1"         "TG 16:0/17:1/18:1" "PC 16:0/20:4"      "LPC 20:1"          "SM 44:1"           "TG 19:1/18:1/18:2"
[37] "TG 16:0/22:1/22:6" "TG 15:0/12:0/22:6" "TG 11:0/15:0/17:1" "TG 16:0/16:1/18:1" "TG 16:0/11:1/20:4" "MePC 34:8"        
[43] "PC 22:5/18:2"      "DG 18:1/18:1"      "PAF 12:1"          "OAHFA 48:1"        "TG 12:0/17:1/18:2" "ST 18:1/22:0"     
[49] "TG 16:0/14:0/18:3" "PC 18:1/24:1"   

What am I doing wrong?
