pmartR / pmartR

The pmartR R package provides functionality for quality control, normalization, exploratory data analysis, and statistical analysis of mass spectrometry (MS) omics data, in particular proteomic (either at the peptide or the protein level), lipidomic, and metabolomic data.

Home Page: https://pmartr.github.io/pmartR/

License: Other

R 96.72% C++ 2.78% HTML 0.01% TeX 0.49%
metabolomics-data peptides mass-spectrometry data-summarization proteins metabolites lipids rna-seq-analysis

pmartr's Introduction

pmartR

This R package provides functionality for quality control processing, statistical analysis and visualization of mass spectrometry (MS) omics data, in particular proteomic (either at the peptide or the protein level; isobaric labeled or unlabeled), lipidomic, and metabolomic data. This includes data transformation, specification of groups that are to be compared against each other, filtering of features and/or samples, data normalization, data summarization (correlation, PCA), and statistical comparisons of groups of interest (ANOVA and/or independence of missing data tests). Example data to be used with this package can be found in pmartRdata.

Installation:

This package makes use of several packages hosted on Bioconductor. If you encounter warnings about unavailable Bioconductor packages such as pcaMethods, you may need to add the Bioconductor repositories to options("repos"):

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

options("repos" = BiocManager::repositories())

(Recommended) Install from CRAN:

install.packages("pmartR")

# or 

BiocManager::install("pmartR")

To install the latest release:

devtools::install_github("pmartR/pmartR@*release")

To install a specific release, say v2.4.0:

devtools::install_github("pmartR/pmartR@v2.4.0")

(Not recommended, since these changes are likely still being tested) You can also install the latest changes to master:

devtools::install_github("pmartR/pmartR")

Problems with RcppArmadillo and gfortran on macOS

pmartR can fail to compile its C++ code because of an interaction between RcppArmadillo and certain installations of gfortran. Two solutions we have found:

  1. Install gfortran from a recommended source (not Homebrew).
  2. When using the Homebrew gfortran installation, add the line FLIBS = -L`gfortran -print-file-name=libgfortran.dylib | xargs dirname` to ~/.R/Makevars (a plain text file with no extension).

gfortran and Apple silicon (M1/M2 chips)

There are similar compilation issues on newer Mac chips. We recommend installing gcc-13 from Homebrew (brew install gcc) or the universal version from https://mac.r-project.org/tools/.

Additionally, some users experience errors of the form ld: Assertion failed .... One solution is to use the old linker by making sure gcc uses the flag -ld64 (see the Xcode docs). To do this, you can edit ~/.R/Makevars to include this flag, for example by appending it to LDFLAGS with +=:

# in ~/.R/Makevars
LDFLAGS+=-ld64

or specifying it in your compiler command:

# in ~/.R/Makevars
CC=/usr/local/bin/gcc -ld64

Tutorial:

To get started, see the package documentation and function reference at https://pmartr.github.io/pmartR/.

Data:

Example peptide (both unlabeled and isobaric labeled), protein, metabolite and lipid data are available in the pmartRdata package on GitHub.

Citation:

To cite this package, please use the following:

Degnan, D. J.; Stratton, K. G.; Richardson, R.; Claborne, D.; Martin, E. A.; Johnson, N. A.; Leach, D.; Webb-Robertson, B.-J. M.; Bramer, L. M. PmartR 2.0: A Quality Control, Visualization, and Statistics Pipeline for Multiple Omics Datatypes. J. Proteome Res. 2023, 22 (2), 570–576. https://doi.org/10.1021/acs.jproteome.2c00610.

BibTeX:

@article{degnan2023pmartr,
  title={pmartR 2.0: A Quality Control, Visualization, and Statistics Pipeline for Multiple Omics Datatypes},
  author={Degnan, David J and Stratton, Kelly G and Richardson, Rachel and Claborne, Daniel and Martin, Evan A and Johnson, Nathan A and Leach, Damon and Webb-Robertson, Bobbie-Jo M and Bramer, Lisa M},
  doi = {10.1021/acs.jproteome.2c00610},
  journal={Journal of Proteome Research},
  year={2023},
  publisher={ACS Publications}
}

Disclaimer:

This material was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor the United States Department of Energy, nor Battelle, nor any of their employees, nor any jurisdiction or organization that has cooperated in the development of these materials, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, software, or process disclosed, or represents that its use would not infringe privately owned rights.

Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof, or Battelle Memorial Institute. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.

  PACIFIC NORTHWEST NATIONAL LABORATORY
  operated by BATTELLE for the
  UNITED STATES DEPARTMENT OF ENERGY
  under Contract DE-AC05-76RL01830

pmartr's People

Contributors

abell8, clabornd, david-degnan, evanamartin, evanglass, godi678, leac176, lmbramer, olivroy, rarichardson92, rarichardson92-spork, stanfill, stanfill-pnnl, stratkg

pmartr's Issues

Stats comparison options

For the results of Stats objects: allow both the G-test and ANOVA to be run across the selected group comparisons, then remove the results that are not valid, instead of restricting the G-test/ANOVA options for the omicsObject.

combine_lipids errors out when duplicate e_meta names are present

If there are two objects with e_meta columns "A", "B", "C", with emeta_cname's "A" and "C" respectively, the second object will attempt to rename the "C" column to "A" in order to do a dplyr::bind_rows, but this fails because there's already an "A" column.

An example (ask Daniel or Kelly for this data):

library(readr)
library(pmartR)

edata <- read_csv("~/Documents/Data/example_omicsdata/lipids-error-testing/lipid_neg_edata_no_dups.csv")
fdata <- read_csv("~/Documents/Data/example_omicsdata/lipids-error-testing/lipid_neg_fdata.csv")
emeta <- read_csv("~/Documents/Data/example_omicsdata/lipids-error-testing/lipid_neg_emeta.csv")

myobj <- as.lipidData(edata, fdata, emeta, edata_cname = "Lipid_NoDups", fdata_cname = "SampleID", emeta_cname = "Lipid")

edata2 <- read_csv("~/Documents/Data/example_omicsdata/lipids-error-testing/lipid_pos_edata_no_dups.csv")
emeta2 <- read_csv("~/Documents/Data/example_omicsdata/lipids-error-testing/lipid_pos_emeta.csv")

myobj2 <- as.lipidData(edata2, fdata, emeta2, edata_cname = "Lipid_NoDups", fdata_cname = "SampleID", emeta_cname = "Lipid_NoDups")

myobj <- group_designation(myobj, "Virus")
myobj2 <- group_designation(myobj2, "Virus")

combine_lipidData(myobj, myobj2)
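
The name collision described above can be reproduced without the original data. The following is a minimal base R sketch, not the package's fix: the toy e_meta tables and the helper rename_key() are hypothetical, and the "_orig" suffix is an arbitrary choice. It moves an existing column out of the way before renaming the key column, so the names stay unique.

```r
# Hypothetical e_meta tables; column names mirror the "A", "B", "C" example.
emeta1 <- data.frame(A = c("l1", "l2"), B = 1:2, C = c("x", "y"))
emeta2 <- data.frame(A = c("p", "q"), B = 3:4, C = c("l3", "l4"))

# Hypothetical helper (not part of pmartR): rename column `from` to `to`,
# first renaming any existing `to` column so that names remain unique.
rename_key <- function(df, from, to, suffix = "_orig") {
  if (from == to) {
    return(df)
  }
  if (to %in% names(df)) {
    names(df)[names(df) == to] <- paste0(to, suffix)
  }
  names(df)[names(df) == from] <- to
  df
}

# The second object's key column is "C"; align it with the first object's "A".
emeta2_fixed <- rename_key(emeta2, from = "C", to = "A")
names(emeta2_fixed)
```

After this step both tables share a uniquely named key column "A", so a row bind on that column no longer hits the duplicate-name error. The sketch does not handle the case where a column with the suffixed name already exists.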

Extra check for pair_id + pair_group

Came across this with an error in pmartR::summary.imdanovaFilt() from an imdanova_filter made on a paired data object. Turns out I was mis-specifying the pair_group column such that certain pairs did not have two levels in pair_group. Still making this issue in case we want to have an extra check for user error.

There is a check for having only 2 levels in the pair group column, but I think we also want a check for having exactly 1 entry in pair_group that == pair_denom for every pair.

I wrote up a check in this branch if it helps explain what I mean.
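
The proposed check can be sketched in base R as follows. This is an illustration only, not the code from the branch mentioned above: the f_data table, its column names, and the helper find_bad_pairs() are all hypothetical.

```r
# Hypothetical f_data for a paired design; column names are illustrative only.
fdata <- data.frame(
  SampleID  = paste0("S", 1:6),
  PairID    = c(1, 1, 2, 2, 3, 3),
  Treatment = c("before", "after", "before", "after", "after", "after")
)

# Hypothetical helper (not part of pmartR): return the pair IDs that do not
# have exactly one sample whose pair_group value equals pair_denom.
find_bad_pairs <- function(f_data, pair_id, pair_group, pair_denom) {
  n_denom <- tapply(f_data[[pair_group]] == pair_denom, f_data[[pair_id]], sum)
  names(n_denom)[n_denom != 1]
}

bad <- find_bad_pairs(fdata, "PairID", "Treatment", "before")
if (length(bad) > 0) {
  message("Pairs without exactly one '", "before", "' sample: ",
          paste(bad, collapse = ", "))
}
```

Here pair 3 has two "after" samples and no "before" sample, so it is the only pair reported.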

Trelliscope Functions: Fixes and new Features

  • Expand trelliscope building function to ensure e_meta columns end up as cognostics
  • Add parameter to trelliscope functions to quickly change the order of discrete axes
  • Fix long plot names error

as.proData replicates

as.proData requires replicates to be technical replicates. Does it make a difference for the analysis if my three replicates are biological replicates?

Possible memory issue with imd_anova()

Or maybe it's not this function and it's something with the dataset that we're using or the non-pmartR code that's run prior to imd_anova(), because so far we only see the following error with one particular dataset.

Please contact Kelly Stratton and Damon Leach for the specific code and data files that result in this error:

“GC encountered a node (0x7fcb0502ba88) with an unknown SEXP type: 29 at memory.c:1817”

Summary function missing group info for pep, metab, and proData

For pepData, summary() will return counts for each group designation, but groups don't register for other data objects. Looking at the code in the R folder, it seems like the error might be due to the use of levels() for non-pepData objects (returning NULL), while the pepData summary uses unique() to extract grouping data. The error can be reproduced with the code below:


library(pmartRdata)
data(lipid_object)
lipid_object2 <- group_designation(omicsData = lipid_object, main_effects = "Condition")
summary(lipid_object2)

data(pro_object)
pro_object2 <- group_designation(omicsData = pro_object, main_effects = "Condition")
summary(pro_object2)

data(metab_object)
metab_object2 <- group_designation(omicsData = metab_object, main_effects = "Condition")
summary(metab_object2)

data(pep_object)
pep_object2 <- group_designation(omicsData = pep_object, main_effects = "Condition")
summary(pep_object2)

Request for small helper_fn changes - Error vs. Null

A few helper functions return errors in certain conditions where the attribute fetched does not exist for various reasons; it might be more helpful if these returned NULL with a warning? Specifically thinking about generic workflows applied across different datasets, where it can check for the attribute but catch null values in an if() statement. A few I've run into:

get_emeta_cname - errors where emeta does not exist
get_filters - errors when no filters have been applied

(Group info helpers also error if group designation hasn't been run, but I feel that is reasonable)
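
Until the helpers change, a generic workflow could wrap them. The sketch below is one possible pattern, not pmartR code: get_or_null() is a hypothetical wrapper, and fake_get_emeta_cname() is a stand-in that only imitates a getter which errors when the attribute is missing (the "cnames" attribute layout is an assumption made for the example).

```r
# Hypothetical wrapper (not part of pmartR): call a getter and return NULL with
# a warning instead of stopping when the getter raises an error.
get_or_null <- function(getter, ...) {
  tryCatch(
    getter(...),
    error = function(e) {
      warning("Returning NULL: ", conditionMessage(e), call. = FALSE)
      NULL
    }
  )
}

# Stand-in for a helper such as get_emeta_cname() that errors when the
# attribute is absent; the real function's internals are assumed, not copied.
fake_get_emeta_cname <- function(obj) {
  cname <- attr(obj, "cnames")$emeta_cname
  if (is.null(cname)) stop("emeta_cname does not exist for this object")
  cname
}

obj_without_emeta <- structure(list(), cnames = list(edata_cname = "Lipid"))
cname <- suppressWarnings(get_or_null(fake_get_emeta_cname, obj_without_emeta))

# The workflow can now branch on NULL instead of wrapping every call in try().
if (is.null(cname)) message("No e_meta present; skipping e_meta-specific steps")
```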

minor vignette updates

In the "Typical_Processing_Workflow" vignette, update the code chunk starting on line 410 (tag is "spans_normalize") so that it reflects this screen shot. Currently this code chunk is identical to the one above it, and we need to actually apply the normalization to the pepData object "mypep".

image

In the "Filter_Functionality" vignette, add an example of using the imdanova_filter() function with the "comparisons" argument specifying some custom comparisons (and not just the default of all pairwise). Recommend adding this example right before the ### Custom Filter section. Can use the following text and code chunk:

By default, the IMD-ANOVA filter assumes that all pairwise comparisons will be performed in downstream use of the imd_anova() function. A user can specify custom comparisons using the "comparisons" argument when they apply the IMD-ANOVA filter. For instance if the comparisons of interest are Phenotype2 versus Phenotype1 and Phenotype3 versus Phenotype1, the following code can be used.

mypep_imdanovafilt <- applyFilt(filter_object = myimdanovafilt, omicsData = mypep_groups_log2, min_nonmiss_anova = 2, min_nonmiss_gtest = 3, comparisons = data.frame(Control = c("Phenotype1", "Phenotype1"), Test = c("Phenotype2", "Phenotype3")))

Error in rmd plotting from pair_id peptides from plot_fn.R

When no specific sampleID is given for the rmd plot and a p-value threshold is specified, goodies_alpha <- filter_object$pvalue < pvalue_threshold is computed as a boolean vector of valid biomolecules.

The variable goodies_pch is defined only when:

  1. pair_id is not null and sum(goodies_alpha) > 0
  2. pair_id is null

goodies_pch is a boolean vector marking the samples that match "condemned" (paired with an outlier above the threshold); it determines the shape and size of the points.

The fix should add, at line 3835, an appropriate value for the shape when no p-values are above the threshold. In theory, an else branch with goodies_pch <- rep(FALSE, nrow(filter_object)) would do.

emeta_cname shows up in object

I think this is a consequence of a purrr::list_modify change in 1.0.0 https://github.com/tidyverse/purrr/releases/tag/v1.0.0.

Basically, the call Evan M had in as.xxData did work when:

mylist = list("a" = 1, "b" = 2)
purrr::list_modify(mylist, "b" = NULL)

returned

list("a" = 1)

but now it returns

list("a" = 1, "b" = NULL)

Will think about how to fix, but in the meantime see if your purrr version is >= 1.0.0 and downgrade if so.
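
As an alternative to downgrading, elements can be dropped in ways that do not depend on the purrr version. The sketch below uses base R only. The purrr documentation also describes purrr::zap() as the sentinel for removing elements in list_modify(); that call is mentioned in a comment but not run here, and neither approach is confirmed as the fix adopted in the package.

```r
mylist <- list(a = 1, b = 2)

# Option 1: keep only the names that should survive.
dropped_by_name <- mylist[setdiff(names(mylist), "b")]

# Option 2: assigning NULL with [[<- removes the element from a list.
dropped_by_null <- mylist
dropped_by_null[["b"]] <- NULL

# With purrr available, the documented equivalent is
#   purrr::list_modify(mylist, b = purrr::zap())

names(dropped_by_name)
names(dropped_by_null)
```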

plot.imdanovaFilt fails when specifying both min_nonmiss options

On master 6084612, caught in the app, happens somewhere in this guide_legend call it seems:

guide = ggplot2::guide_legend(

library(pmartR)
library(pmartRdata)
#Transform the data
mypepData <- edata_transform(omicsData = pep_object, data_scale = "log2")

#Group the data by condition
mypepData <- group_designation(omicsData = mypepData, main_effects = c("Condition"))

#Apply the IMD ANOVA filter
imdanova_Filt <- imdanova_filter(omicsData = mypepData)

# no errors
plot(imdanova_Filt) # blank plot
plot(imdanova_Filt, min_nonmiss_gtest = 3)
plot(imdanova_Filt, min_nonmiss_anova = 2)

# error
plot(imdanova_Filt, min_nonmiss_anova = 2, min_nonmiss_gtest = 3)
> Error in `[[<-.data.frame`(`*tmp*`, i, value = c(0, 2, 0, 2)) : 
  replacement has 4 rows, data has 2

packageVersion("ggplot2")
[1] ‘3.3.5’

add a thresholding option to edata_replace, so that everything below a specified threshold is considered NA

From Che:

I use pmartR a lot for my GCMS data, recently I’ve started using it for LCMS data as well, however I see one issue. We can use “0” to denote missing values, but in addition to that, can we use a threshold. Say anything less than 1e5 is considered missing? The reason for this is that many programs we use to process LCMS data will never put something as 0 (there is always something), but if it is less than some threshold, we know it is ‘missing’ or ‘not detected’. This is the same for HiRes GC data.
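
The requested behaviour can be sketched outside the package. This is a hypothetical helper, not an existing edata_replace() option: replace_below_threshold() and the toy data are made up for illustration, and the 1e5 cut-off is taken from the request above.

```r
# Hypothetical helper (not part of pmartR): set every numeric value below
# `threshold` to NA, leaving the identifier column untouched.
replace_below_threshold <- function(e_data, edata_cname, threshold) {
  sample_cols <- setdiff(names(e_data), edata_cname)
  for (col in sample_cols) {
    below <- !is.na(e_data[[col]]) & e_data[[col]] < threshold
    e_data[[col]][below] <- NA
  }
  e_data
}

# Toy LC-MS style abundances: small non-zero values stand for "not detected".
edata <- data.frame(
  Metabolite = c("m1", "m2", "m3"),
  S1 = c(2.5e6, 4.0e4, 0),
  S2 = c(9.0e4, 3.1e5, 7.7e5)
)

thresholded <- replace_below_threshold(edata, "Metabolite", threshold = 1e5)
thresholded
```

Because zeros are also below the threshold, this covers the existing "0 means missing" convention as a special case.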

imdCov branch: for anova_filter and gtest_filter, default comparisons = NULL

Per chat with Lisa and Kelly on 31May2022, we would like the default for the comparisons argument in anova_filter() to be NULL. I believe it's already implemented so that, if NULL, comparisons are defined as all comparisons; we just want to give the argument a default so it lines up with previous implementations and doesn't need to be explicitly defined as NULL.

dim_reduction() function needs minor fix line 76

From Jan, working off the main branch:

I saw something that looked off on line 76 in pmartR’s dim_reduction.R file. The code is supposed to remove features with near zero variance, but the assignment portion of that task is missing.

Line 76:
temp_data[-minvars, ]

I think line 76 should mimic line 70 and read:
temp_data = temp_data[-minvars, ]
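
The effect of the missing assignment can be seen on a toy matrix. The sketch below is not the code from dim_reduction.R: the data and the variance tolerance are made up. It also adds a length check, since negative indexing with an empty index vector drops every row; whether the package needs that guard is an assumption.

```r
# Toy data: rows 1 and 3 have zero variance.
temp_data <- matrix(c(1, 1, 1,
                      2, 5, 9,
                      3, 3, 3,
                      4, 0, 8), nrow = 4, byrow = TRUE)

# Rows whose variance is (near) zero; the tolerance is an illustrative choice.
minvars <- which(apply(temp_data, 1, var) < 1e-8)

# Subsetting alone returns a copy and leaves temp_data unchanged, so the
# result has to be assigned. The length check avoids x[-integer(0), ],
# which would select zero rows instead of all of them.
if (length(minvars) > 0) {
  temp_data <- temp_data[-minvars, , drop = FALSE]
}

nrow(temp_data)
```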

Group counts not always returned by imd_anova()

Group counts are missing when the test method is "combined" and a biomolecule is filtered by the G-test but is present in ANOVA. The reason is counts are computed in the G-test portion of the function. Therefore, only the biomolecules that have counts above the G-test threshold will have group counts computed.

Column names change due to the check.names argument in the data.frame function

Functions that create a data frame internally do not work correctly when non-standard column names (e.g., the name begins with a number or contains a special character) are present in e_data, f_data, or e_meta. The restrictions on what characters a column name can contain are determined by the data.frame function.
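
The renaming can be seen directly in base R. The column names below are made up; the sketch only shows what data.frame() does with and without check.names, not where pmartR calls it.

```r
# Default behaviour: names are passed through make.names().
df_default <- data.frame(`1st_sample` = 1:2, `conc (mg/mL)` = c(0.5, 0.7))

# With check.names = FALSE the names are kept exactly as supplied.
df_kept <- data.frame(`1st_sample` = 1:2, `conc (mg/mL)` = c(0.5, 0.7),
                      check.names = FALSE)

names(df_default)
names(df_kept)

# Non-syntactic names then need backticks or [[ ]] for access.
df_kept[["conc (mg/mL)"]]
```

Any internal lookup that uses the original name (for example the value stored as fdata_cname) fails against the first data frame but works against the second.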

Techrep objects throw an error in edata_replace

Peptide data with technical replicates seem to break edata_replace? Error can be reproduced with code below:


library(pmartRdata)
data(techrep_pep_object)
techrep_pep_object2 <- edata_replace(omicsData = techrep_pep_object, x=0, y=NA)

### Odd since there seems to be no problem here:
techrep_pep_object$f_data$RunID %in% colnames(techrep_pep_object$e_data)

### Doesn't work for a newly constructed techrep object either:
techrep <- as.pepData(e_data = techrep_edata, f_data = techrep_fdata, edata_cname = "Mass_Tag_ID", fdata_cname = "RunID", techrep_cname = "TECH_REP", data_scale = "log2")

techrep2 <- edata_replace(omicsData = techrep, x=0, y=NA)

Peptide specific imd_anova count fix

Applying imd_anova on non-peptide data throws an error; seems to look for "Peptide" to no avail. Possible fix in slightly adjusting calls as well as changing to more general edata cname? Reproducible code and error below:

lipid_object       <- pmartRdata::lipid_object
lipid_object       <- pmartR::group_designation(lipid_object, "Condition")
lipid_object2      <- pmartR::edata_transform(lipid_object, "log2")

lipid_object2 <- pmartR::applyFilt(pmartR::imdanova_filter(lipid_object2),
                                   lipid_object2,
                                   min_nonmiss_anova = 2,
                                   min_nonmiss_gtest = 3)
lipid_stats        <- pmartR::imd_anova(lipid_object2, 
                                        test_method = "combined")

Error in Peptide %in% as.character(to_fix) : object 'Peptide' not found


Code chunk in imd_anova:

#Get counts for rows in "anova_results" but not in "imd_counts"
  final_cnts <- Full_results[,grep("Count",colnames(Full_results))]
  msng_cnts <- which(is.na(rowSums(final_cnts))) 
  if(length(msng_cnts)>0){
    to_fix <- Full_results[msng_cnts,]$Peptide
    omicsData2 <- omicsData
    omicsData2$e_data <- omicsData$e_data%>%filter(Peptide%in%as.character(to_fix))
    new_cnts <- imd_test(omicsData = omicsData2, comparisons = NULL, pval_adjust = 'none', pval_thresh = pval_thresh)
    rm(omicsData2)
    #Replace the NA counts with the correct counts
    Full_results[msng_cnts,grep("Count",colnames(Full_results))] <- new_cnts$Results[,grep("Count",colnames(new_cnts$Results))]
  }
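
One way to generalise the chunk above is to look the identifier column up by name. The sketch below uses toy stand-ins for Full_results and e_data and plain base R subsetting; edata_cname is assumed to be available from the object's attributes, and the values are invented.

```r
# Toy stand-ins for the objects in the chunk above; values are made up.
edata_cname <- "Lipid"
e_data <- data.frame(Lipid = c("PC(34:1)", "PE(36:2)", "TG(52:3)"),
                     S1 = c(10.1, NA, 12.3),
                     S2 = c(NA, 11.0, 12.9))
Full_results <- data.frame(Lipid = e_data$Lipid,
                           Count_A = c(2, NA, 3),
                           Count_B = c(1, NA, 2))

# Rows of the results whose counts are missing.
count_cols <- grep("Count", colnames(Full_results))
msng_cnts <- which(is.na(rowSums(Full_results[, count_cols])))

e_data_subset <- e_data[0, ]
if (length(msng_cnts) > 0) {
  # Use the identifier column named by edata_cname instead of assuming "Peptide".
  to_fix <- as.character(Full_results[[edata_cname]][msng_cnts])
  e_data_subset <- e_data[e_data[[edata_cname]] %in% to_fix, , drop = FALSE]
}

e_data_subset
```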

combine_lipidData duplicates f_data

When combining lipid data, it duplicates the f_data leaving you with for example a method.x and a method.y column. Each contains the same information.
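
The .x/.y suffixes are what a join produces when shared columns are not part of the join key. Whether combine_lipidData joins f_data this way internally is an assumption; the base R sketch below, with made-up f_data, only reproduces the symptom and shows that joining on all shared columns avoids it.

```r
# Two objects measured on the same samples share identical f_data.
fdata_neg <- data.frame(SampleID = c("S1", "S2"),
                        Virus = c("A", "B"),
                        Method = c("LC", "LC"))
fdata_pos <- fdata_neg

# Joining on the sample ID alone duplicates every other shared column.
merged_by_id <- merge(fdata_neg, fdata_pos, by = "SampleID")

# merge() defaults to joining on all shared columns, which keeps one copy.
merged_by_all <- merge(fdata_neg, fdata_pos)

names(merged_by_id)
names(merged_by_all)
```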

More options for plot.naRes

The missing values plot function:

plot.naRes <- function (naRes_obj, omicsData, plot_type = "bar",

Has an option for ordering and coloring by a value in f_data. However we also want to be able to color/order by the group assignment in group_DF (attr(object, 'group_DF')$Group).

We also want the option for the bar heights to be:

  • the number of NON-missing values as well as missing values
  • the proportion of missing/non-missing values

enable use of tibbles

Currently, tibbles cause the as. functions to error (column names are causing issues)

imdanova throws error if no imdanova filter has been run

If no imdanova filter has been applied, the imdanova function should check whether any molecules would be removed with the default min_num arguments, and give a warning that the user should consider using the imdanova filter before running imdanova.

imdanova_Filt & imd_test

Hi,

I am comparing a list of proteins for 2 conditions (each with 3 biological replicates) and I am trying to qualitatively identify proteins which are present in all 3 samples for one condition but absent from all 3 replicates in the second condition (NA NA NA vs Present Present Present). However, the imdanova_Filt removes these entries and I was wondering why, since this is exactly what I am looking to compare. Also, when I run imd_test without filtering, I receive significant p-values for these proteins, but all the p-values are the same. Shouldn't it be based on each distribution of present proteins somehow as well, not just presence or absence?
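
On the identical p-values: a G-test of independence uses only the counts of present and absent observations per group, not the abundance values. The sketch below is a textbook G-test written in base R, not necessarily the exact computation inside imd_test (which may, for example, apply corrections). Under that assumption, every protein with the same 3-versus-0 pattern yields the same 2 x 2 table and therefore the same statistic and p-value.

```r
# Textbook G-test of independence on a presence/absence contingency table.
g_test <- function(tab) {
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  nonzero <- tab > 0  # cells with zero counts contribute 0 to the sum
  g <- 2 * sum(tab[nonzero] * log(tab[nonzero] / expected[nonzero]))
  df <- (nrow(tab) - 1) * (ncol(tab) - 1)
  c(G = g, p_value = pchisq(g, df = df, lower.tail = FALSE))
}

# Present in all 3 replicates of condition 1, absent in all 3 of condition 2.
tab <- matrix(c(3, 0, 0, 3), nrow = 2,
              dimnames = list(c("present", "absent"), c("cond1", "cond2")))

res <- g_test(tab)
res
```

The abundance values never enter the calculation, which is why the result does not vary between proteins sharing this pattern.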

Recent cname attribute passed in imd_anova() only applies for "combined" test method

The recent update for the cname attribute included in the "statRes" object from imd_anova() only seems to apply for the "combined" test method. From looking at the imd_anova() code in the R folder, it seems like the addition was made before the last return statement and not before the return statements used when test_method equals 'anova' or 'gtest'.
The issue can be reproduced with the code below:


library(pmartR)
library(pmartRdata)
#Transform the data
mypepData <- edata_transform(omicsData = pep_object, data_scale = "log2")

#Group the data by condition
mypepData <- group_designation(omicsData = mypepData, main_effects = c("Condition"))

#Apply the IMD ANOVA filter
imdanova_Filt <- imdanova_filter(omicsData = mypepData)
mypepData <- applyFilt(filter_object = imdanova_Filt, omicsData = mypepData, min_nonmiss_anova=2)

#Implement the IMD ANOVA method and compute all pairwise comparisons (i.e. leave the `comparisons` argument NULL)
anova_res <- imd_anova(omicsData = mypepData, test_method = 'anova')
imd_res <- imd_anova(omicsData = mypepData, test_method = 'gtest')
imd_anova_res <- imd_anova(omicsData = mypepData, test_method = 'comb', pval_adjust='bon')

attr(imd_anova_res, "cname")

imd_anova_res <- imd_anova(omicsData = mypepData, test_method = 'gtest', pval_adjust='bon')

attr(imd_anova_res, "cname")

imd_anova_res <- imd_anova(omicsData = mypepData, test_method = 'anova', pval_adjust='bon')

attr(imd_anova_res, "cname")

imdanova_filter behavior with no missing values

A couple of issues when a data object has no missing values:

  • imd_anova throws an error saying that imdanova_filter hasn't been run, when this is unnecessary
  • imdanova_filter doesn't work when no rows have missing values

plot.proteomicsFilter needs attention (especially title font and text)

plot.proteomicsFilter has the option for specifying title font size and title text, but I've encountered the following problems and questions:

  1. Specifying the font size is not reflected when you knit your Rmd file (however, it is reflected in the preview of the Rmd file)
  2. Is there a way to only have it display one of the plots? If not, we should add this. This detail should be added to the documentation.
  3. How do you specify titles for both plots, if you have it display both (which is the default)?
  4. Check the other plot options for this particular S3 object to make sure they all work appropriately.

isobaric pepData object after normalization has two normalization attributes - one TRUE, one FALSE

I noticed that the correlation plot for isobaric-normalized data shows "(Un-Normalized Data)" in the title.

Then I found that the normalized object has two attributes:

> pepData_iso_norm = normalize_isobaric(pepData_iso, apply_norm = T)
> attributes(pepData_iso_norm)$data_info$norm_info
FALSE
> attributes(pepData_iso_norm)$isobaric_info$norm_info
TRUE

and the corRes object picks up only the first one.

Is it a bug? Do we need both norm_info attributes?

Error message with plot.proData()

library(pmartRdata)
data("pro_object")
plot(pro_object, order_by = "Condition")

produces the error message: "Error in .subset2(x, i, exact = exact) : attempt to select less than one element in get1index"

Developer: trim build/check time

R CMD check still takes 10+ minutes to run, and some spots were identified where build time could be trimmed. The big one is the tests for DESeq2: locally they take about 5 minutes, probably longer in CI. Cutting down the data size might help things run faster. The other test datasets could probably be trimmed as well, though this will be a bit of a pain because the expected dimensions in the tests would have to change.

Track down and destroy warnings/notes in R-CMD-Check [CRAN]

There are lots of lingering notes/warnings in R CMD Check/devtools::check() which we'll need to fix if we want to submit to CRAN. This is the super-issue for all such warnings/notes - if one of them is sufficiently complicated we can branch it off into another issue.

check in pre-flight causing protein rollup to crash

From the example in protein_quant():

devtools::load_all(".")
library(pmartRdata)

mypepData <- group_designation(omicsData = pep_object, main_effects = c("Condition"))
mypepData = edata_transform(mypepData, "log2")
mypepData <- applyFilt(molecule_filter(mypepData), mypepData)

results<- protein_quant(pepData = mypepData, method = 'rollup', combine_fn = 'median', isoformRes = NULL)

>  Error in pre_flight(e_data = e_data, f_data = f_data, e_meta = e_meta,  : 
  Not all e_data cname and e_meta cname combinations are unique

It seems that in this chunk we are checking to see if there are no duplicate rows in e_meta[,c(edata_cname, emeta_cname)]. However this will happen during peptide rollup, since edata_cname == emeta_cname == protein column. The protein column will have duplicates, and so we fail here.

imd-anova incorrect p-values

I believe what is happening is these greps:

control_i <- grep(to_compare_df$Control[i],groups)
which are used to build the comparison matrix for ANOVA are too loose.

For example, with groups "2M", "12M", "24M", the grep on the "2M" group will match "2M" AND "12M", and cause the comparison matrix to update incorrectly.

comps2 = data.frame(Control = c("2M", "2M", "12M"), Test = c("12M", "24M", "24M"))
anov_pmart = imd_anova(myobj, comparisons = comps2, test_method = "anova")

I tested by replacing with group names "A", "B", "C" and it seems to work fine.
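
The over-matching is easy to reproduce in base R. The sketch below uses the group names from this issue and shows exact-matching alternatives; it illustrates the behaviour of grep() and is not the patch applied to imd_anova().

```r
groups <- c("2M", "12M", "24M")

# grep() does partial (regular expression) matching, so "2M" also hits "12M".
loose <- grep("2M", groups)

# Exact alternatives that return only the intended group.
exact_eq    <- which(groups == "2M")
exact_match <- match("2M", groups)

# An anchored, fixed pattern also works, but only if the group name contains
# no regular expression metacharacters (or they are escaped first).
exact_regex <- grep("^2M$", groups)

loose
exact_eq
```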

pmartR v0.10.0 imd_test issue

Hi, having recently updated pmartR to v0.10.0, a colleague is having trouble running "imd_test" as the function appears to be missing. He had no trouble at all running exactly the same thing previously with v0.9.0.

Also, my quick fix was to suggest he revert to the older version of the software, but we couldn't find an archive version?

Thanks for your help, details below.

Input (where "p.norm" is a pepData dataset):

test <- data.frame(Control="Control",Test="Disease")
imd.res <- imd_test(p.norm, comparisons=test, pval_adjust="holm", pval_thresh=0.05)

Error message:

Error in imd_test(p.norm, comparisons = test, pval_adjust = "holm", pval_thresh = 0.05) :
could not find function "imd_test"

Session info:

R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.1
Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
locale:
en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
splines stats graphics grDevices utils datasets methods base
other attached packages:
missForest_1.4 gridExtra_2.3 limma_3.50.0 usethis_2.1.5 itertools_0.1-3
ggplot2_3.3.5 reshape_0.8.8 iterators_1.0.14 WGCNA_1.70-3 plyr_1.8.6
foreach_1.5.2 fastcluster_1.2.3 pmartR_0.10.0
loaded via a namespace (and not attached):
bitops_1.0-7 matrixStats_0.61.0 RColorBrewer_1.1-2 httr_1.4.2 tools_4.1.2
utf8_1.2.2 DBI_1.1.2 BiocGenerics_0.40.0 tidyselect_1.1.1 prettyunits_1.1.1
compiler_4.1.2 cli_3.2.0 checkmate_2.0.0 scales_1.1.1 foreign_0.8-82
XVector_0.32.0 pkgconfig_2.0.3 sessioninfo_1.2.2 impute_1.66.0 rstudioapi_0.13
RCurl_1.98-1.6 magrittr_2.0.2 Matrix_1.4-0 Rcpp_1.0.8 lifecycle_1.0.1
stringi_1.7.6 grid_4.1.2 blob_1.2.2 Biostrings_2.60.2 KEGGREST_1.32.0
codetools_0.2-18 stats4_4.1. data.table_1.14.2 remotes_2.4.2 gtable_0.3.0
purrr_0.3.4 tibble_3.1.6 AnnotationDbi_1.54.1 ellipsis_0.3.2 fs_1.5.2
rprojroot_2.0.2 R6_2.5.1 colorspace_2.0-2 processx_3.5.2 Biobase_2.54.0
callr_3.7.0 htmltools_0.5.2 fastmap_1.1.0 RSQLite_2.2.10 GO.db_3.13.0
munsell_0.5.0 zlibbioc_1.38.0 parallel_4.1.2 knitr_1.37 pkgload_1.2.4
png_0.1-7 cachem_1.0.6 memoise_2.0.1 randomForest_4.7-1 dynamicTreeCut_1.63-1
devtools_2.4.3 bit64_4.0.5 doParallel_1.0.17 GenomeInfoDb_1.28.4 backports_1.4.1
rpart_4.1.16 Hmisc_4.6-0 nnet_7.3-17 withr_2.4.3 preprocessCore_1.54.0
bit_4.0.4 htmlTable_2.4.0 desc_1.4.0 stringr_1.4.0 digest_0.6.29
base64enc_0.1-3 jpeg_0.1-9 htmlwidgets_1.5.4 rlang_1.0.1 generics_0.1.2
dplyr_1.0.8 GenomeInfoDbData_1.2.6 Formula_1.2-4 S4Vectors_0.30.2 fansi_1.0.2
brio_1.1.3 pkgbuild_1.3.1 crayon_1.5.0 lattice_0.20-45 DS_1.6.0
pillar_1.7.0 glue_1.6.1 latticeExtra_0.6-29 vctrs_0.3.8 testthat_3.1.2
xfun_0.29 survival_3.2-13 IRanges_2.26.0 cluster_2.1.2

Retire gridExtra/grid.arrange

Replace gridExtra with patchwork https://patchwork.data-imaginist.com/index.html for making side-by-side ggplots. Reasoning is that the objects returned by gridExtra are difficult to deal with, especially in shiny apps that want to save the plots.

Mostly it would be that all instances like gridExtra::grid.arrange(p, q, ncol = 2) are replaced by p + q

Two sample filters with the same sample erase the object

I'll get an mrep written up, but basically with rna-seq data (it probably happens with the other data types too), if I apply an RNAfilt that removes sample A1, and then another RNAfilt that ALSO removes sample A1, the object gets blown up (the resulting object has 0 samples).

Usually this doesn't happen in an R coding session because the user makes the second filter after applying the first, so the sample doesn't exist, but in the app they are all created at once, so two filters can be targeting the same samples for removal.
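
One mechanism in R that produces exactly this symptom is negative indexing with an empty index. Whether that is the cause inside applyFilt has not been confirmed here; the base R sketch below, with made-up sample names, only shows the pitfall and a removal pattern that is safe to apply repeatedly.

```r
samples <- c("A1", "A2", "B1", "B2")

# First filter removes A1.
after_first <- samples[!(samples %in% "A1")]

# Pitfall: a second removal of A1 via which() yields integer(0), and
# x[-integer(0)] selects nothing rather than everything.
already_gone <- which(after_first %in% "A1")
blown_up <- after_first[-already_gone]

# Safe pattern: logical indexing is a no-op when nothing matches.
after_second <- after_first[!(after_first %in% c("A1", "B2"))]

length(blown_up)
after_second
```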

Error in nonmissing_per_group count assignment when edata and fdata sample ordering differs

When the samples (columns) of e_data are not consistent in ordering with the sample ordering in f_data, group labels may be incorrectly assigned to counts computed by nonmissing_per_group. Please refer to the reproducible example below for reference.

edat <- data.frame(PepID = c("A", "B"), group1_time1 = c(NA, 11), group2_time1 = c(22, NA), group1_time2 = c(33, NA), group2_time2 = c(44, NA))

emeta <- data.frame(PepID = c("A", "B"), Protein = c("PA", "PB"), Peptide = c("pA", "pB"))

fdat <- data.frame(SampleID = c("group1_time2", "group1_time1", "group2_time2", "group2_time1"),
sGroup = c("G1", "G1", "G2", "G2"),
sTime = c("T2", "T1", "T2", "T1"))

pmdat <- as.pepData(e_data = edat, f_data = fdat, e_meta = emeta,
edata_cname = "PepID", emeta_cname = "Protein", fdata_cname = "SampleID",
data_scale = "abundance")
pmdat <- group_designation(pmdat, main_effects = c("sGroup", "sTime"))

nonmissing_per_group(pmdat)
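
The intended behaviour can be sketched without pmartR by looking up each e_data column's group by sample name rather than by position. The data are taken from the example above; the Group labels pasted from sGroup and sTime and the use of rowsum() are choices made for this illustration, not the package's implementation.

```r
edat <- data.frame(PepID = c("A", "B"),
                   group1_time1 = c(NA, 11), group2_time1 = c(22, NA),
                   group1_time2 = c(33, NA), group2_time2 = c(44, NA))
fdat <- data.frame(SampleID = c("group1_time2", "group1_time1",
                                "group2_time2", "group2_time1"),
                   sGroup = c("G1", "G1", "G2", "G2"),
                   sTime = c("T2", "T1", "T2", "T1"))
fdat$Group <- paste(fdat$sGroup, fdat$sTime, sep = "_")

# Match e_data columns to f_data rows by sample name, not by position.
sample_cols <- setdiff(names(edat), "PepID")
col_groups <- fdat$Group[match(sample_cols, fdat$SampleID)]

# Count non-missing values per biomolecule and group.
nonmiss <- !is.na(as.matrix(edat[, sample_cols]))
counts <- t(rowsum(t(nonmiss) * 1L, group = col_groups))
rownames(counts) <- edat$PepID

counts
```

Using fdat$Group in its stored order instead of col_groups would attach the G1_T2 label to the group1_time1 column, which is the mislabelling described in this issue.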

Error in fold change computation when control group name is a substring of the test group name

Please see the reproducible example below for a demonstration of the issue:

temp_edata <- pmartRdata::pep_edata
colnames(temp_edata) <- gsub("B", "AB", colnames(temp_edata))

temp_fdata <- pmartRdata::pep_fdata
temp_fdata$SampleID <- gsub("B", "AB", temp_fdata$SampleID)
temp_fdata$SecondPhenotype <- gsub("B", "AB", temp_fdata$SecondPhenotype)

all(temp_fdata$SampleID %in% colnames(temp_edata)[-1])

temp_pdat <- as.pepData(e_data = temp_edata, f_data = temp_fdata,
edata_cname = "Peptide", fdata_cname = "SampleID",
data_scale = "abundance")

temp_pdat <- edata_replace(temp_pdat, x = 0, y = NA)

temp_pdat <- edata_transform(temp_pdat, data_scale = "log2")

temp_pdat <- group_designation(temp_pdat, main_effects = c("Phenotype", "SecondPhenotype"))

temp_pdat_norm <- normalize_global(omicsData = temp_pdat,
subset_fn = "all",
norm_fn = "median",
apply_norm = TRUE,
backtransform = TRUE)

temp_gdf <- attr(temp_pdat, "group_DF") %>%
dplyr::select(-SampleID) %>%
dplyr::distinct()

temp_comps <- data.frame(Control = c(temp_gdf %>% dplyr::filter(Phenotype == "Phenotype3", SecondPhenotype == "A") %>% .$Group),
Test = c(temp_gdf %>% dplyr::filter(Phenotype == "Phenotype3", SecondPhenotype == "AB") %>% .$Group))

temp_res <- imd_anova(omicsData = temp_pdat_norm, comparisons = temp_comps, test_method = "anova",
pval_adjust_a = "none", pval_adjust_g = "none", pval_thresh = 0.05)

all.equal(temp_res$Mean_Phenotype3_AB - temp_res$Mean_Phenotype3_A,
temp_res$Fold_change_Phenotype3_AB_vs_Phenotype3_A)
head(temp_res$Mean_Phenotype3_AB - temp_res$Mean_Phenotype3_A)
head(temp_res$Fold_change_Phenotype3_AB_vs_Phenotype3_A)

carry over the custom_sampnames attribute into objects such as corRes

If custom_sampnames() has been run on the omicsData object, this attribute should be somehow passed through to become part of e.g. corRes objects so that it can be used when plotting. Currently, plot.corRes errors and says that it must also be given the omicsData object with the corresponding VizSampNames, but then when that is passed to plot(), the resulting plot does not display any sample names at all.
