
pcsstools's People

Contributors: jackmwolf

pcsstools's Issues

model_combo() does not label coefficients correctly

When fitting a model with only one predictor, calculate_lm() does not label the "(Intercept)" row in the output coefficients.
The following example shows output from model_combo(), which calls calculate_lm().

library(grass)
ex_data <- cont_data

means <- colMeans(ex_data)
covs <- cov(ex_data)
n <- nrow(ex_data)

phi <- c(1, 1)

model_combo(y1 + y2 ~ x, n = n, phi = phi, means = means, covs = covs)
#> Model approximated using Pre-Computed Summary Statistics.
#> 
#> Call:
#> model_combo(formula = y1 + y2 ~ x, phi = phi, n = n, means = means, 
#>     covs = covs)
#> 
#> Coefficients:
#>   Estimate Std. Error t value Pr(>|t|)    
#>   -0.50990    0.03349  -15.22   <2e-16 ***
#> x -0.89016    0.03287  -27.08   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 1.059 on 998 degrees of freedom
#> Multiple R-squared:  0.4236, Adjusted R-squared:  0.423 
#> F-statistic: 733.3 on 1 and 998 DF,  p-value: < 2.2e-16

summary(lm(y1 + y2 ~ x, data = cont_data))
#> 
#> Call:
#> lm(formula = y1 + y2 ~ x, data = cont_data)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -3.2948 -0.7576 -0.0582  0.7384  3.4200 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) -0.50990    0.03349  -15.22   <2e-16 ***
#> x           -0.89016    0.03287  -27.08   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 1.059 on 998 degrees of freedom
#> Multiple R-squared:  0.4236, Adjusted R-squared:  0.423 
#> F-statistic: 733.3 on 1 and 998 DF,  p-value: < 2.2e-16
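One possible direction for a fix is sketched below. label_coefficients() is a hypothetical helper, not part of pcsstools, and it assumes calculate_lm() assembles the coefficient matrix itself:

```r
# Hypothetical helper: always assign row names to the coefficient table,
# mirroring lm(), so the intercept row is labeled even with one predictor
label_coefficients <- function(coef_mat, predictor_names) {
  rownames(coef_mat) <- c("(Intercept)", predictor_names)
  coef_mat
}
```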

Release pcsstools 0.1.2

Prepare for release:

  • git pull
  • Check current CRAN check results
  • Polish NEWS
  • devtools::build_readme()
  • urlchecker::url_check()
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • revdepcheck::revdep_check(num_workers = 4)
  • Update cran-comments.md
  • git push

Submit to CRAN:

  • usethis::use_version('patch')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • git push
  • usethis::use_github_release()
  • usethis::use_dev_version()
  • git push

new_predictor_* functions do not check input

The new_predictor_* functions do not validate user input. Each of the following calls should raise an error, or at least a warning, but they currently run without complaint.

library(pcsstools)

x <- new_predictor_snp(maf = NA)
z <- new_predictor_binary(p = 100)
y <- new_predictor_normal(mean = NA, sd = -1)
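One way the constructors could validate their inputs is sketched here; check_probability() is a hypothetical helper, and the specific checks shown are assumptions about what the package should reject:

```r
# Hypothetical input check for probability-type arguments (maf, p)
check_probability <- function(x, name) {
  if (!is.numeric(x) || is.na(x) || x < 0 || x > 1) {
    stop(sprintf("`%s` must be a number in [0, 1].", name), call. = FALSE)
  }
}

# e.g. inside new_predictor_snp():
#   check_probability(maf, "maf")
# and inside new_predictor_normal():
#   if (!is.numeric(sd) || is.na(sd) || sd <= 0) stop("`sd` must be positive.")
```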

approx_mult_prod is sensitive to variable order

The output of approx_mult_prod() changes with the order of the response means/covariances.
The lists returned by the two approx_mult_prod() calls in the reprex below should be equal, but they are not.

library(grass)
ex_data <- bin_data[c("g", "x", "y1", "y2", "y3")]
head(ex_data)
#>   g          x y1 y2 y3
#> 1 0 -0.9161478  1  0  1
#> 2 0  1.2496985  0  1  0
#> 3 1 -1.2708514  0  0  0
#> 4 2  0.0832760  0  1  0
#> 5 0  0.4686342  0  1  1
#> 6 2  0.4620154  0  1  0

means <- colMeans(ex_data)
covs <- cov(ex_data)
n <- nrow(ex_data)

predictors <- list(
  new_predictor_snp(maf = mean(ex_data$g) / 2),
  new_predictor_normal(mean = mean(ex_data$x), sd = sd(ex_data$x))
)
responses <- lapply(means[3:length(means)], new_predictor_binary)

approx_mult_prod(means, covs, n, response = "binary",
  predictors = predictors, responses = responses, verbose = TRUE)
#> Approximating with responses ordered as:  y1 * y2 * y3 
#> Approximating with responses ordered as:  y1 * y3 * y2 
#> Approximating with responses ordered as:  y2 * y3 * y1
#> $means
#>           g           x      y1y2y3 
#>  0.56800000 -0.02927950  0.05444547 
#> 
#> $covs
#>                  g           x      y1y2y3
#> g       0.40978579 -0.04510754 -0.02670105
#> x      -0.04510754  0.99460726  0.04906614
#> y1y2y3 -0.02670105  0.04906614  0.05153269


# Reorder response means/covariances
means <- means[c(1, 2, 5, 4, 3)]
covs  <- covs[c(1, 2, 5, 4, 3), c(1, 2, 5, 4, 3)]

responses <- lapply(means[3:length(means)], new_predictor_binary)

approx_mult_prod(means, covs, n, response = "binary",
                 predictors = predictors, responses = responses, verbose = TRUE)
#> Approximating with responses ordered as:  y3 * y2 * y1 
#> Approximating with responses ordered as:  y3 * y1 * y2 
#> Approximating with responses ordered as:  y2 * y1 * y3
#> $means
#>           g           x      y3y2y1 
#>  0.56800000 -0.02927950  0.08101557 
#> 
#> $covs
#>                  g           x      y3y2y1
#> g       0.40978579 -0.04510754 -0.03324090
#> x      -0.04510754  0.99460726  0.05203570
#> y3y2y1 -0.03324090  0.05203570  0.07452658

Created on 2020-08-05 by the reprex package (v0.3.0)
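Until the underlying order sensitivity is fixed, one possible workaround is to put the response means/covariances into a canonical order before calling approx_mult_prod(). This sketch assumes the first two columns are predictors and sorts the remaining response columns alphabetically by name:

```r
# Hypothetical workaround: reorder response means/covariances alphabetically
# so that approx_mult_prod() always sees the same ordering
resp_idx <- 3:length(means)
ord <- c(1, 2, resp_idx[order(names(means)[resp_idx])])
means_canon <- means[ord]
covs_canon  <- covs[ord, ord]
```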

approx_conditional() can be simplified

approx_conditional() estimates the conditional variance of a phenotype with an expression that can be simplified considerably.

p_s2 <- (n * means[2]^2 + (n - 1) * covs[2, 2] - a * n * means[2] -
  b * (n * means[1] * means[2] + (n - 1) * covs[1, 2])) / (n - 2)

Substituting the OLS intercept, a = means[2] - b * means[1], and simplifying, this reduces to:

p_s2 <- (n-1) * (covs[2, 2] - b * covs[1, 2]) / (n - 2)
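The identity can be checked numerically. The values below are synthetic, and taking a and b to be the usual OLS intercept and slope is an assumption about approx_conditional()'s internals:

```r
# Numerical check of the simplification with synthetic summary statistics
n <- 100
means <- c(0.3, 1.2)
covs <- matrix(c(1.0, 0.4, 0.4, 2.0), 2, 2)
b <- covs[1, 2] / covs[1, 1]        # assumed OLS slope
a <- means[2] - b * means[1]        # assumed OLS intercept

full <- (n * means[2]^2 + (n - 1) * covs[2, 2] - a * n * means[2] -
  b * (n * means[1] * means[2] + (n - 1) * covs[1, 2])) / (n - 2)
reduced <- (n - 1) * (covs[2, 2] - b * covs[1, 2]) / (n - 2)

all.equal(full, reduced)
#> [1] TRUE
```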

model_or() and model_and() do not label coefficients correctly with one predictor

A similar but distinct issue affects model_or() and model_and(): in models with only one predictor, that predictor's coefficient row is labeled NA.

library(grass)

ex_data <- bin_data

means <- colMeans(ex_data)
covs <- cov(ex_data)
n <- nrow(ex_data)
predictors <- list(
 g = new_predictor_snp(maf = mean(ex_data$g) / 2),
 x = new_predictor_normal(mean = mean(ex_data$x), sd = sd(ex_data$x))
)

model_and(
 y1 & y2 ~ g,
 means = means, covs = covs, n = n, predictors = predictors
)
#> Model approximated using Pre-Computed Summary Statistics.
#> 
#> Call:
#> model_and(formula = y1 & y2 ~ g, n = n, means = means, covs = covs, 
#>     predictors = predictors)
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  0.19601    0.01382  14.179  < 2e-16 ***
#> NA          -0.11797    0.01616  -7.301 5.82e-13 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.3269 on 998 degrees of freedom
#> Multiple R-squared:  0.05071,    Adjusted R-squared:  0.04976 
#> F-statistic: 53.31 on 1 and 998 DF,  p-value: 5.819e-13

Originally posted by @jackmwolf in #3 (comment)
