extratests's Issues

Thinking about how we test parsnip model registration with extension packages

This is related to #185, which was darn hard to figure out.

What happens is that the .onLoad() of baguette is triggered in a test, which is what we want. But it stays triggered for all following tests, so we sometimes get different results depending on whether we test all files or just one file. It is also something of a test smell, since the order of the tests matters.

library(parsnip)
library(testthat)

parsnip::get_model_env()$bag_tree_args
#> # A tibble: 0 × 5
#> # ℹ 5 variables: engine <chr>, parsnip <chr>, original <chr>, func <list>,
#> #   has_submodel <lgl>

# could be in a previous file
test_that("load baguette", {
  library(baguette)
})
#> ── Skip: load baguette ─────────────────────────────────────────────────────────
#> Reason: empty test

parsnip::get_model_env()$bag_tree_args
#> # A tibble: 6 × 5
#>   engine parsnip         original func             has_submodel
#>   <chr>  <chr>           <chr>    <list>           <lgl>       
#> 1 rpart  class_cost      cost     <named list [2]> FALSE       
#> 2 rpart  tree_depth      maxdepth <named list [2]> FALSE       
#> 3 rpart  min_n           minsplit <named list [2]> FALSE       
#> 4 rpart  cost_complexity cp       <named list [2]> FALSE       
#> 5 C5.0   class_cost      cost     <named list [2]> FALSE       
#> 6 C5.0   min_n           minCases <named list [2]> FALSE
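
One possible way to make such registration checks order-independent (a sketch, not something we currently do) is to run them in a fresh R session via callr, so that attaching baguette in one test cannot leak into tests that run later:

# Sketch: check model registration in a clean session so library(baguette)
# in one test file cannot affect tests that run afterwards
test_that("baguette registers bag_tree engines", {
  args <- callr::r(function() {
    library(baguette)
    parsnip::get_model_env()$bag_tree_args
  })
  expect_true(nrow(args) > 0)
})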

GHA failures due to aorsf archival

GH R-CMD-CHECK is failing with:

  Error: 
  ! error in pak subprocess
  Caused by error: 
  ! Could not solve package dependencies:
  * deps::.: Can't install dependency aorsf
  * aorsf: Can't find package called aorsf.
  * any::sessioninfo: dependency conflict
  * any::rcmdcheck: dependency conflict

since aorsf was archived on CRAN. Let's configure the workflow to install it from GitHub for now.
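
A minimal sketch of what that could look like in the workflow's install step, assuming the development repository is ropensci/aorsf (check the actual GitHub location before using this):

# Install aorsf from GitHub instead of CRAN while the package is archived
try(pak::pkg_install("ropensci/aorsf"))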

Adding tests that make sure tuneable steps work in external packages

I have had problems in textrecipes and themis lately with getting tuneable steps to work.

I have a feeling it has something to do with .onLoad(), but it is hard to test within the packages individually.

Here is an example of it not working in themis. I feel like it used to work at one point.

library(tidymodels)
library(themis)
data("credit_data")

ds_rec <- recipe(Status ~ Age + Income + Assets, data = credit_data) %>%
  step_meanimpute(all_predictors()) %>%
  step_bsmote(Status, all_neighbors = tune())

mod_spec <- null_model() %>%
  set_engine("parsnip")

wf_spec <- workflow() %>%
  add_model(mod_spec) %>%
  add_recipe(ds_rec)

folds <- vfold_cv(credit_data, v = 2)

tune_grid(wf_spec, resamples = folds, grid = grid_regular(all_neighbors()))
#> Warning: No tuning parameters have been detected, performance will be evaluated
#> using the resamples with no tuning. Did you want [fit_resamples()]?
#> x Fold1: recipe: Error: You cannot `prep()` a tuneable recipe. Argument(s) with `...
#> x Fold2: recipe: Error: You cannot `prep()` a tuneable recipe. Argument(s) with `...
#> Warning: All models failed in tune_grid(). See the `.notes` column.
#> Warning: This tuning result has notes. Example notes on model fitting include:
#> recipe: Error: You cannot `prep()` a tuneable recipe. Argument(s) with `tune()`: 'all_neighbors'. Do you want to use a tuning function such as `tune_grid()`?
#> recipe: Error: You cannot `prep()` a tuneable recipe. Argument(s) with `tune()`: 'all_neighbors'. Do you want to use a tuning function such as `tune_grid()`?
#> # Tuning results
#> # 2-fold cross-validation 
#> # A tibble: 2 x 4
#>   splits              id    .metrics .notes          
#>   <list>              <chr> <list>   <list>          
#> 1 <split [2.2K/2.2K]> Fold1 <NULL>   <tibble [1 × 1]>
#> 2 <split [2.2K/2.2K]> Fold2 <NULL>   <tibble [1 × 1]>

Created on 2020-08-07 by the reprex package (v0.3.0)

Upkeep for `test-recipes-varying.R`

varying() has been replaced by tune() and varying_args() by tune_args() (which has tests in test-tune_args.R).

It'd be good to check if any of the tests here should be updated (and possibly migrated to live either in test-tune_args.R or in recipes) or if they could be deleted.

`GH-R-CMD-CHECK` workflow overwrites dev branch installs

GH-R-CMD-CHECK uses try(pak::pkg_install()) for each dev package:

try(pak::pkg_install("tidymodels/baguette"))
try(pak::pkg_install("tidymodels/bonsai"))
try(pak::pkg_install("tidymodels/censored"))
try(pak::pkg_install("tidymodels/dials"))
try(pak::pkg_install("tidymodels/discrim"))
try(pak::pkg_install("tidymodels/hardhat"))
try(pak::pkg_install("tidymodels/modeldata"))
try(pak::pkg_install("tidymodels/multilevelmod"))
try(pak::pkg_install("tidymodels/parsnip"))
try(pak::pkg_install("tidymodels/plsmod"))
try(pak::pkg_install("tidymodels/poissonreg"))
try(pak::pkg_install("tidymodels/recipes"))
try(pak::pkg_install("tidymodels/rsample"))
try(pak::pkg_install("tidymodels/rules"))
try(pak::pkg_install("tidymodels/spatialsample"))
try(pak::pkg_install("tidymodels/stacks"))
try(pak::pkg_install("tidymodels/themis"))
try(pak::pkg_install("tidymodels/tidymodels"))
try(pak::pkg_install("tidymodels/tune"))
try(pak::pkg_install("tidymodels/workflows"))
try(pak::pkg_install("tidymodels/yardstick"))
try(pak::pkg_install("tidymodels/finetune"))
try(pak::pkg_install("business-science/modeltime"))

This is nice in that we don't have to deal with known dependency resolution issues, but also means that installs can be overwritten by later ones, e.g. in #198, making it difficult to test against dev branches.

We could:

  1. Try to install in an order that minimizes overwritten installs, and/or
  2. Allow package dependency resolution for core packages by passing them as a vector to pak::pkg_install(), like:
try(pak::pkg_install(paste0("tidymodels/", c("rsample", "recipes", "parsnip", "dials", "workflows", "tune"))))

This would mean that we're alerted when packages have circular dev dependencies, which is nice in the case where we did that accidentally but would require making temporary changes every time we do this knowingly.

Separate survival tests for `tune_race_anova()` and `tune_race_win_loss()`

The current structure has one file per tuning function with four test_that() expressions for four different metric types. Each expression does the tuning and then tests the structure of the results (a tune_results object) as well as a bunch of methods for the tune_results object.

The tests for racing currently execute both tune_race_anova() and tune_race_win_loss() within one test_that() call and then check everything twice, basically. I think it's worth splitting those apart so that we can more easily tell which one of the two tuning functions a failure is related to. I'd put them in two different files, too, to follow the structure for the other tuning functions.

bag_tree fit call

Current GH CI is failing with

══ Failed ══════════════════════════════════════════════════════════════════════
── 1. Failure ('test-censored-case-weights.R:36:3'): bag_tree - rpart censored c
Snapshot of code has changed:
old[2:5] vs new[2:5]
    wt_fit$fit$call
  Output
    bagging.data.frame(formula = Surv(time, event) ~ ., data = data, 
-       weights = weights)
+       weights = weights, cp = ~0, minsplit = ~2)

We have not been able to reproduce locally.

namespace `slice()`

I see a good bit of:

Error (test-survival-tune_race_anova.R:208:3): race tuning (anova) survival models with static metric
Error in `UseMethod("slice")`: no applicable method for 'slice' applied to an object of class "c('tbl_df', 'tbl', 'data.frame')"
Backtrace:
    ▆
 1. ├─testthat::expect_equal(metric_all %>% slice(), exp_metric_all) at test-survival-tune_race_anova.R:208:3
 2. │ └─testthat::quasi_label(enquo(object), label, arg = "object")
 3. │   └─rlang::eval_bare(expr, quo_get_env(quo))
 4. ├─metric_all %>% slice()
 5. └─xgboost::slice(.)

when I run tests locally.
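
The failure looks like xgboost's slice() masking dplyr's once xgboost is attached, so one fix (a sketch, assuming the test intends dplyr semantics) is to namespace the call explicitly:

# Explicitly call dplyr's slice() so attaching xgboost cannot change dispatch
expect_equal(metric_all %>% dplyr::slice(), exp_metric_all)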

Test `fit_resamples()` for survival models

What we already have: nothing for survival models

What we still want: something that emulates what we are doing for tuning (since this is a subset of our tuning methods).

I've added a template in the work for #117 since it will be easier for reviewers to do both at once.
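
A rough sketch of the kind of test this could add, under the assumption that fit_resamples() takes the same eval_time argument as the tuning functions (the workflow, resamples, and metric choices below are placeholders):

# Hypothetical sketch: resample a survival model and check the result class,
# mirroring what the tuning tests do (surv_wflow and surv_folds are placeholders)
surv_res <- fit_resamples(
  surv_wflow,
  resamples = surv_folds,
  metrics   = metric_set(brier_survival),
  eval_time = c(1, 5, 10),
  control   = control_resamples(save_pred = TRUE)
)
expect_s3_class(surv_res, "resample_results")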

Move `master` branch to `main`

The master branch of this repository will soon be renamed to main, as part of a coordinated change across several GitHub organizations (including, but not limited to: tidyverse, r-lib, tidymodels, and sol-eng). We anticipate this will happen by the end of September 2021.

That will be preceded by a release of the usethis package, which will gain some functionality around detecting and adapting to a renamed default branch. There will also be a blog post at the time of this master --> main change.

The purpose of this issue is to:

  • Help us firm up the list of targeted repositories
  • Make sure all maintainers are aware of what's coming
  • Give us an issue to close when the job is done
  • Give us a place to put advice for collaborators re: how to adapt

message id: euphoric_snowdog
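
For collaborators with local clones, adapting will likely be a one-liner along these lines (a sketch, assuming the default-branch helpers land in the upcoming usethis release described above):

# After the remote default branch is renamed, update a local clone to match
usethis::git_default_branch_rediscover()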

skip for dev workflows requires a higher version than it ought to

In skips, I see:

── Skipped tests ──
Installed workflows is version 1.1.3; but 1.1.3.9007 is required (1):
  test-parsnip-survival-censoring-weights.R:225:3

The current dev workflows version is 1.1.3.9001, so this test isn't being run and likely ought to be.

new failures with glmnet penalty checks

Including on the CRAN R CMD CHECK workflow, we're seeing:

  old[2:6] vs new[2:6]
      multinom_reg(penalty = 0.01) %>% set_engine("glmnet") %>% fit(class ~ ., data = hpc_data) %>%
        predict(hpc_data, penalty = 0:1)
    Condition
  -   Error in `.check_glmnet_penalty_predict()`:
  +   Error:
  -   ! `penalty` should be a single numeric value. `multi_predict()` can be used to get multiple predictions per row of data.
  +   ! argument "..3" is missing, with no default

CRAN checks for parsnip and censored are fine.

Test `collect_metrics()` for survival models

We already have about 18 invocations of collect_metrics(), but none for survival models. There is a TODO from the time of the initial tests, since eval_time was not implemented at that point.

What we still want: survival-specific examples that look across different metric types and different cases for eval_time.
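
A sketch of the sort of check this could include, assuming a tuning result surv_res from a survival run and that collect_metrics() reports an .eval_time column for dynamic metrics (both assumptions here, not existing test code):

# Sketch: dynamic survival metrics should carry one row per evaluation time
metrics <- collect_metrics(surv_res)
expect_true(".eval_time" %in% names(metrics))
expect_equal(sort(unique(metrics$.eval_time)), c(1, 5, 10))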

Inspect spark tests

Are they running as intended?

Several are being skipped, e.g.:

  1. (code run outside of test_that()) ('test-spark-linear-reg.R:2:1') - Reason: spark_not_installed() is TRUE

Improve plot/snapshot for `autoplot()` method on racing with survival models

The corresponding issue is tidymodels/tune#755. The issue mentioned in the comments below (tidymodels/tune#754) is a duplicate of 755.

# TODO make better plot at resolution of https://github.com/tidymodels/tune/issues/754
# expect_snapshot_plot(
#   print(autoplot(aov_mixed_res, eval_time = c(1, 5))),
#   "mix-aov-race-2-times"
# )

and

# TODO make better plot at resolution of https://github.com/tidymodels/tune/issues/754
# expect_snapshot_warning(
#   expect_snapshot_plot(
#     print(autoplot(aov_mixed_res)),
#     "mix-aov-race-0-times"
#   )
# )

Meta issue for survival analysis in tidymodels

Functionality/documentation

To discuss

Testing

Reduce messages in test output

The output for test failures is very noisy, making it hard to spot the actual failures amidst all the messages.

test_check("extratests", reporter = "summary")
parsnip-extension-messaging: ....Loading required package: survival
...................
S3-registration: ......
butcher-recipes: 
Attaching package: 'Matrix'

The following objects are masked from 'package:tidyr':

    expand, pack, unpack

....
encodings-glmnet: Loaded glmnet 4.1-8
.........
encodings-randomForest: ......
encodings-ranger: ......
engine-parameters-c5: ..! The Gaussian process model is being fit using 3 features but only has 3
  data points to do so. This may cause errors or a poor model fit.
! The Gaussian process model is being fit using 3 features but only has 4
  data points to do so. This may cause errors or a poor model fit.
....! The Gaussian process model is being fit using 3 features but only has 3
  data points to do so. This may cause errors or a poor model fit.
! The Gaussian process model is being fit using 3 features but only has 4
  data points to do so. This may cause errors or a poor model fit.
..
engine-parameters-earth: ..! The Gaussian process model is being fit using 2 features but only has 3
  data points to do so. This may cause errors or a poor model fit.
..
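
One way to trim some of this down (a sketch of a partial fix, not the whole answer) is to wrap the library() calls in the test files or helpers with suppressPackageStartupMessages(), which silences attach-time output such as the Matrix masking notes and the glmnet version banner:

# Silence attach-time messages from packages loaded by the tests
suppressPackageStartupMessages({
  library(Matrix)
  library(glmnet)
})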

Missing snapshots for `test-recipes-nnmf_sparse.R`

5. Correct values ('test-recipes-nnmf_sparse.R:21:3') - Adding new snapshot for variant 'recipes1.0.9.9000':
Code
  print(rec)
Message
  
  -- Recipe ----------------------------------------------------------------------
  
  -- Inputs 
  Number of variables by role
  outcome:   1
  predictor: 4
  
  -- Operations 
  * Non-negative matrix factorization for: all_predictors()

6. No NNF ('test-recipes-nnmf_sparse.R:43:3') - Adding new snapshot for variant 'recipes1.0.9.9000':
Code
  print(rec)
Message
  
  -- Recipe ----------------------------------------------------------------------
  
  -- Inputs 
  Number of variables by role
  outcome:   1
  predictor: 4
  
  -- Training information 
  Training data contained 150 data points and no incomplete rows.
  
  -- Operations 
  * No non-negative matrix factorization was extracted from: Sepal.Length,
    Sepal.Width, Petal.Length, Petal.Width | Trained

Audit `skip()` with `minimum_version` requirements

If we are past all the required versions on CRAN and RSPM, we don't really need those skip statements anymore, do we? Is there a case where it would install an older version and break again?

If we don't need the skip statements anymore, I suggest we remove them (even if they don't "hurt") so that we have one less thing to pay attention to when we look at the tests, i.e., lowering the cognitive load 😄
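
For reference, the skips in question follow the usual testthat pattern; the package and version below are just an illustration, not a specific skip from this repo:

# Example of the kind of version-gated skip being audited
skip_if_not_installed("parsnip", minimum_version = "1.1.0")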

Test fit interfaces called from workflow for survival models

Let's test that we can use both the formula and the matrix interface to the underlying parsnip model, i.e., trigger both fit() and fit_xy() from a workflow, the former by supplying a model formula.

This is a good opportunity to test stratification since strata() is a specials term that needs to go into the model formula arg of add_model(). Both engines for the proportional_hazards() model, "survival" and "glmnet", support stratification.

We should also test that prediction works, regardless of which fit interface was used.
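
A sketch of what the stratified case could look like with the "survival" engine (the data set and variable names here are placeholders, and the censored package is assumed to provide the engine):

# Sketch: pass a model formula containing strata() via add_model() so that
# the formula interface (fit()) is used by the workflow
library(censored)

ph_spec <- proportional_hazards() %>% set_engine("survival")

ph_wflow <- workflow() %>%
  add_variables(outcome = c(time, event), predictors = c(age, sex)) %>%
  add_model(ph_spec, formula = Surv(time, event) ~ age + strata(sex))

ph_fit <- fit(ph_wflow, data = surv_data)

# Prediction should work regardless of which fit interface was used
predict(ph_fit, new_data = surv_data, type = "survival", eval_time = c(1, 5))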

Reduce warnings about non-portable files

In the GHA logs, there are plenty of warnings about non-portable files which we should inspect and either fix or silence, if possible, because it's making the output very noisy.

Example from R CMD build stage (as part of check-r-package):

Warning in utils::tar(filepath, pkgname, compression = compression, compression_level = 9L, :
storing paths of more than 100 bytes is not portable:
‘extratests/tests/testthat/_snaps/survival-tune-bayes/dynamic-metric-bayes-search-with-two-time-points-marginals.png’

Then, in R CMD check:

  • checking for portable file names ... NOTE
    Found the following non-portable file paths:
    extratests/tests/testthat/_snaps/survival-tune-bayes/dynamic-metric-bayes-search-with-two-time-points-marginals.png

And again in R CMD check results:

❯ checking for portable file names ... NOTE
Found the following non-portable file paths:
extratests/tests/testthat/_snaps/survival-tune-bayes/dynamic-metric-bayes-search-with-two-time-points-marginals.png

With the following note

Tarballs are only required to store paths of up to 100 bytes and cannot
store those of more than 256 bytes, with restrictions including to 100
bytes for the final component.
See section ‘Package structure’ in the ‘Writing R Extensions’ manual.

Since the files are part of snapshot tests: are those tests actually being carried out, given that such long paths may not make it into the tarball?

Problem with using `tidy()` (S3 registration) on Windows in parallel

Hi all,
I'm reading this book to get to know tidymodels, and I ran into a problem when running parallel resampling on Windows.
For the examples in Chapter 10, the main parallel part of fit_resamples() is OK with doParallel on Windows, but there is one issue with extract_fit_parsnip(x):

library(tidymodels)
# All operating systems
library(doParallel)
library(kableExtra)
library(tidyr)
tidymodels_prefer()

data(ames)
ames <- mutate(ames, Sale_Price = log10(Sale_Price))

set.seed(502)
ames_split <- initial_split(ames, prop = 0.80, strata = Sale_Price)
ames_train <- training(ames_split)
ames_test  <-  testing(ames_split)

ames_rec <- 
  recipe(Sale_Price ~ Neighborhood + Gr_Liv_Area + Year_Built + Bldg_Type + 
           Latitude + Longitude, data = ames_train) %>%
  step_log(Gr_Liv_Area, base = 10) %>% 
  step_other(Neighborhood, threshold = 0.01) %>% 
  step_dummy(all_nominal_predictors()) %>% 
  step_interact( ~ Gr_Liv_Area:starts_with("Bldg_Type_") ) %>% 
  step_ns(Latitude, Longitude, deg_free = 20)

lm_model <- linear_reg() %>% set_engine("lm")

lm_wflow <- 
  workflow() %>% 
  add_model(lm_model) %>% 
  add_recipe(ames_rec)

lm_fit <- fit(lm_wflow, ames_train)

# This line is O.K.
extract_fit_parsnip(lm_fit) %>% tidy()

# ------------------------------------------------------------------------------------
set.seed(1001)
ames_folds <- vfold_cv(ames_train, v = 10)

# Create a cluster object and then register: 
# cl <- makePSOCKcluster(parallel::detectCores())
cl <- makePSOCKcluster(10)
registerDoParallel(cl)

get_model <- function(x) {
  # not O.K. on Windows & Linux.
  extract_fit_parsnip(x) %>% tidy()
  # This line is O.K.
  # extract_recipe(x, estimated = TRUE)
}

ctrl <- control_resamples(save_pred = TRUE, verbose = TRUE, extract = get_model)
set.seed(1003)
lm_res <- lm_wflow %>% fit_resamples(resamples = ames_folds, control = ctrl)

# Stop parallel
stopCluster(cl)

#  These lines are O.K.
lm_res
lm_res$.metrics[[1]]
lm_res$.notes[[1]]
lm_res$.predictions[[1]]
> lm_res$.extracts[[1]]
# A tibble: 1 x 2
  .extracts      .config             
  <list>         <chr>               
1 <try-errr [1]> Preprocessor1_Model1

> # To get the results
> lm_res$.extracts[[1]][[1]]
[[1]]
[1] "Error in UseMethod(\"tidy\") : \n  no applicable method for 'tidy' applied to an object of class \"lm\"\n"
attr(,"class")
[1] "try-error"
attr(,"condition")
<simpleError in UseMethod("tidy"): no applicable method for 'tidy' applied to an object of class "lm">

Any idea?

Best regards.
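
For what it's worth, this looks like an S3 registration problem on the PSOCK workers: generics::tidy() is found there, but broom, which registers the tidy.lm method, is never loaded in the worker processes. Two workarounds, sketched under that assumption (the pkgs argument of control_resamples() is assumed to load packages on the workers):

# Option 1: load broom inside the extract function so that tidy.lm is
# registered in each worker process
get_model <- function(x) {
  library(broom)
  extract_fit_parsnip(x) %>% tidy()
}

# Option 2: ask the control object to load broom on the workers
ctrl <- control_resamples(save_pred = TRUE, verbose = TRUE,
                          extract = get_model, pkgs = "broom")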
