usemodels's Introduction

usemodels

The usemodels package is a helpful way of quickly creating code snippets to fit models using the tidymodels framework.

Given a simple formula and a data set, the use_* functions can create code that is appropriate for the data (given the model).

For example, using the palmerpenguins data with a glmnet model:

> library(usemodels)
> library(palmerpenguins)
> data(penguins)
> use_glmnet(body_mass_g ~ ., data = penguins)
glmnet_recipe <- 
  recipe(formula = body_mass_g ~ ., data = penguins) %>% 
  step_novel(all_nominal_predictors()) %>% 
  step_dummy(all_nominal_predictors()) %>% 
  step_zv(all_predictors()) %>% 
  step_normalize(all_numeric_predictors()) 

glmnet_spec <- 
  linear_reg(penalty = tune(), mixture = tune()) %>% 
  set_mode("regression") %>% 
  set_engine("glmnet") 

glmnet_workflow <- 
  workflow() %>% 
  add_recipe(glmnet_recipe) %>% 
  add_model(glmnet_spec) 

glmnet_grid <- tidyr::crossing(penalty = 10^seq(-6, -1, length.out = 20), mixture = c(0.05, 
    0.2, 0.4, 0.6, 0.8, 1)) 

glmnet_tune <- 
  tune_grid(glmnet_workflow, resamples = stop("add your rsample object"), grid = glmnet_grid) 

The recipe steps that are used (if any) depend on the type of data as well as the model. In this case, the first two steps handle the fact that some of the predictors (such as species) are factors, and glmnet requires all-numeric predictors. The last two steps are added because, for this model, the predictors should be on the same scale to be properly regularized.
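
The generated tune_grid() call contains a stop() placeholder for the resamples argument; replace it with your own rsample object before running the code. A minimal sketch (not generated by the package; note that the raw penguins data contains missing values, so additional preprocessing may be needed, as discussed in the issues below):

# A sketch: replace the stop() placeholder with a resampling object,
# e.g. cross-validation folds created with rsample.
set.seed(123)
penguin_folds <- rsample::vfold_cv(penguins, v = 5)

glmnet_tune <-
  tune_grid(glmnet_workflow, resamples = penguin_folds, grid = glmnet_grid)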

The package includes these templates:

> ls("package:usemodels", pattern = "use_")
 [1] "use_bag_tree_rpart"   "use_C5.0"             "use_cubist"          
 [4] "use_dbarts"           "use_earth"            "use_glmnet"          
 [7] "use_kernlab_svm_poly" "use_kernlab_svm_rbf"  "use_kknn"            
[10] "use_mgcv"             "use_mixOmics"         "use_nnet"            
[13] "use_ranger"           "use_rpart"            "use_xgboost"         
[16] "use_xrf"             

You can also copy code to the clipboard using the option clipboard = TRUE.
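
For example (assuming clipboard is simply another argument of the use_* functions):

# Generate the glmnet template and copy it to the clipboard (a sketch based
# on the clipboard option described above):
use_glmnet(body_mass_g ~ ., data = penguins, clipboard = TRUE)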

Installation

You can install usemodels with:

devtools::install_github("tidymodels/usemodels")
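
The released version is also on CRAN (the issues below mention version 0.2.0), so the usual installation should work as well:

# Install the CRAN release (assumes the package is currently on CRAN):
install.packages("usemodels")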

Contributing

This project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

usemodels's People

Contributors

bryceroney, emilhvitfeldt, hfrick, juliasilge, qiushiyan, topepo, vinchinzu


usemodels's Issues

Add catboost

Hello,
I do not have a snippet to provide, but it would be nice (perhaps as a Xmas present) to have usemodels also include catboost (see https://catboost.ai/). Thanks!

Add use_multiperceptron?

This is great so far. I'd request use_multiperceptron, if that can be added.

Is the eventual goal to have all of the supported tidymodels model types in here? Thanks so much.
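
The template list above already includes use_nnet; for reference, a multilayer perceptron spec in parsnip looks roughly like this (a sketch, not current usemodels output):

# A sketch of what a use_* template for a single-hidden-layer perceptron
# could emit, using parsnip's mlp() with the nnet engine:
mlp_spec <-
  mlp(hidden_units = tune(), penalty = tune(), epochs = tune()) %>%
  set_mode("classification") %>%
  set_engine("nnet")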

update recipes imports

Some methods have been moved from tune to recipes. When usemodels is documented, this warning appears:

Warning message:
replacing previous import ‘recipes::tune_args’ by ‘tune::tune_args’ when loading ‘usemodels’ 

Clipboard and/or RStudio API support

I really like the idea behind this package, and the user experience could be improved by also copying the generated code to the clipboard (probably using {clipr}) and/or inserting it into the active document using {rstudioapi}.

I'm not sure if this is an idea that has been canvassed before and rejected, but if there's support I'm happy to take a look at turning my hand to writing it.
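
A rough sketch of how that could look (dispatch_template() is a hypothetical helper; clipr::write_clip(), clipr::clipr_available(), rstudioapi::isAvailable(), and rstudioapi::insertText() are existing functions):

# Hypothetical helper: send generated code to the active document or the
# clipboard, falling back to printing it.
dispatch_template <- function(code) {
  if (rstudioapi::isAvailable()) {
    rstudioapi::insertText(text = code)   # insert at the cursor in RStudio
  } else if (clipr::clipr_available()) {
    clipr::write_clip(code)               # otherwise copy to the clipboard
  } else {
    cat(code)                             # fall back to printing
  }
  invisible(code)
}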

`use_xgboost()` uses only 6/8 possible tuning parameters

use_xgboost() only uses 6 of the 8 possible tuning parameters (i.e. mtry and stop_iter are not tune()d).
Is that a deliberate choice (if so, could/should it be documented?) or an oversight?
Or am I just missing something?

library(usemodels)
library(tidymodels, warn.conflicts = FALSE)
data(ames)

ames <-
  ames |>
  select(
    Sale_Price,
    Neighborhood,
    Gr_Liv_Area,
    Year_Built,
    Bldg_Type,
    Latitude,
    Longitude
  ) |> 
  mutate(Sale_Price = log10(Sale_Price))

ames_split <- initial_split(ames, prop = 0.80)
ames_train <- training(ames_split)
ames_test  <- testing(ames_split)

use_xgboost(
  Sale_Price ~ Neighborhood + Gr_Liv_Area + Year_Built + Bldg_Type + Latitude + Longitude, 
  data = ames_train
)
#> xgboost_recipe <- 
#>   recipe(formula = Sale_Price ~ Neighborhood + Gr_Liv_Area + Year_Built + Bldg_Type + 
#>     Latitude + Longitude, data = ames_train) %>% 
#>   step_novel(all_nominal_predictors()) %>% 
#>   step_dummy(all_nominal_predictors(), one_hot = TRUE) %>% 
#>   step_zv(all_predictors()) 
#> 
#> xgboost_spec <- 
#>   boost_tree(trees = tune(), min_n = tune(), tree_depth = tune(), learn_rate = tune(), 
#>     loss_reduction = tune(), sample_size = tune()) %>% 
#>   set_mode("regression") %>% 
#>   set_engine("xgboost") 
#> 
#> xgboost_workflow <- 
#>   workflow() %>% 
#>   add_recipe(xgboost_recipe) %>% 
#>   add_model(xgboost_spec) 
#> 
#> set.seed(8291)
#> xgboost_tune <-
#>   tune_grid(xgboost_workflow, resamples = stop("add your rsample object"), grid = stop("add number of candidate points"))
Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.1 (2022-06-23 ucrt)
#>  os       Windows 10 x64 (build 19044)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language en
#>  collate  German_Germany.utf8
#>  ctype    German_Germany.utf8
#>  tz       Europe/Berlin
#>  date     2022-08-12
#>  pandoc   2.18 @ C:/Program Files/RStudio/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package      * version    date (UTC) lib source
#>  assertthat     0.2.1      2019-03-21 [1] CRAN (R 4.2.0)
#>  backports      1.4.1      2021-12-13 [1] CRAN (R 4.2.0)
#>  broom        * 1.0.0      2022-07-01 [1] CRAN (R 4.2.1)
#>  class          7.3-20     2022-01-16 [2] CRAN (R 4.2.1)
#>  cli            3.3.0      2022-04-25 [1] CRAN (R 4.2.0)
#>  codetools      0.2-18     2020-11-04 [2] CRAN (R 4.2.1)
#>  colorspace     2.0-3      2022-02-21 [1] CRAN (R 4.2.0)
#>  DBI            1.1.3      2022-06-18 [1] CRAN (R 4.2.0)
#>  dials        * 1.0.0      2022-06-14 [1] CRAN (R 4.2.0)
#>  DiceDesign     1.9        2021-02-13 [1] CRAN (R 4.2.0)
#>  digest         0.6.29     2021-12-01 [1] CRAN (R 4.2.0)
#>  dplyr        * 1.0.9      2022-04-28 [1] CRAN (R 4.2.0)
#>  ellipsis       0.3.2      2021-04-29 [1] CRAN (R 4.2.0)
#>  evaluate       0.16       2022-08-09 [1] CRAN (R 4.2.1)
#>  fansi          1.0.3      2022-03-24 [1] CRAN (R 4.2.0)
#>  fastmap        1.1.0      2021-01-25 [1] CRAN (R 4.2.0)
#>  foreach        1.5.2      2022-02-02 [1] CRAN (R 4.2.0)
#>  fs             1.5.2      2021-12-08 [1] CRAN (R 4.2.0)
#>  furrr          0.3.0      2022-05-04 [1] CRAN (R 4.2.0)
#>  future         1.27.0     2022-07-22 [1] CRAN (R 4.2.1)
#>  future.apply   1.9.0      2022-04-25 [1] CRAN (R 4.2.0)
#>  generics       0.1.3      2022-07-05 [1] CRAN (R 4.2.1)
#>  ggplot2      * 3.3.6      2022-05-03 [1] CRAN (R 4.2.0)
#>  globals        0.16.0     2022-08-05 [1] CRAN (R 4.2.1)
#>  glue           1.6.2      2022-02-24 [1] CRAN (R 4.2.0)
#>  gower          1.0.0      2022-02-03 [1] CRAN (R 4.2.0)
#>  GPfit          1.0-8      2019-02-08 [1] CRAN (R 4.2.0)
#>  gtable         0.3.0      2019-03-25 [1] CRAN (R 4.2.0)
#>  hardhat        1.2.0      2022-06-30 [1] CRAN (R 4.2.1)
#>  highr          0.9        2021-04-16 [1] CRAN (R 4.2.0)
#>  htmltools      0.5.3      2022-07-18 [1] CRAN (R 4.2.1)
#>  infer        * 1.0.2      2022-06-10 [1] CRAN (R 4.2.0)
#>  ipred          0.9-13     2022-06-02 [1] CRAN (R 4.2.0)
#>  iterators      1.0.14     2022-02-05 [1] CRAN (R 4.2.0)
#>  knitr          1.39       2022-04-26 [1] CRAN (R 4.2.0)
#>  lattice        0.20-45    2021-09-22 [2] CRAN (R 4.2.1)
#>  lava           1.6.10     2021-09-02 [1] CRAN (R 4.2.0)
#>  lhs            1.1.5      2022-03-22 [1] CRAN (R 4.2.0)
#>  lifecycle      1.0.1      2021-09-24 [1] CRAN (R 4.2.0)
#>  listenv        0.8.0      2019-12-05 [1] CRAN (R 4.2.0)
#>  lubridate      1.8.0      2021-10-07 [1] CRAN (R 4.2.0)
#>  magrittr       2.0.3      2022-03-30 [1] CRAN (R 4.2.0)
#>  MASS           7.3-58.1   2022-08-03 [1] CRAN (R 4.2.1)
#>  Matrix         1.4-1      2022-03-23 [2] CRAN (R 4.2.1)
#>  modeldata    * 1.0.0      2022-07-01 [1] CRAN (R 4.2.1)
#>  munsell        0.5.0      2018-06-12 [1] CRAN (R 4.2.0)
#>  nnet           7.3-17     2022-01-16 [2] CRAN (R 4.2.1)
#>  parallelly     1.32.1     2022-07-21 [1] CRAN (R 4.2.1)
#>  parsnip      * 1.0.0      2022-06-16 [1] CRAN (R 4.2.0)
#>  pillar         1.8.0      2022-07-18 [1] CRAN (R 4.2.1)
#>  pkgconfig      2.0.3      2019-09-22 [1] CRAN (R 4.2.0)
#>  prodlim        2019.11.13 2019-11-17 [1] CRAN (R 4.2.0)
#>  purrr        * 0.3.4      2020-04-17 [1] CRAN (R 4.2.0)
#>  R.cache        0.16.0     2022-07-21 [1] CRAN (R 4.2.1)
#>  R.methodsS3    1.8.2      2022-06-13 [1] CRAN (R 4.2.0)
#>  R.oo           1.25.0     2022-06-12 [1] CRAN (R 4.2.0)
#>  R.utils        2.12.0     2022-06-28 [1] CRAN (R 4.2.1)
#>  R6             2.5.1      2021-08-19 [1] CRAN (R 4.2.0)
#>  Rcpp           1.0.9      2022-07-08 [1] CRAN (R 4.2.1)
#>  recipes      * 1.0.1      2022-07-07 [1] CRAN (R 4.2.1)
#>  reprex         2.0.1      2021-08-05 [1] CRAN (R 4.2.0)
#>  rlang          1.0.4      2022-07-12 [1] CRAN (R 4.2.1)
#>  rmarkdown      2.14       2022-04-25 [1] CRAN (R 4.2.0)
#>  rpart          4.1.16     2022-01-24 [2] CRAN (R 4.2.1)
#>  rsample      * 1.1.0      2022-08-08 [1] CRAN (R 4.2.1)
#>  rstudioapi     0.13       2020-11-12 [1] CRAN (R 4.2.0)
#>  scales       * 1.2.0      2022-04-13 [1] CRAN (R 4.2.0)
#>  sessioninfo    1.2.2      2021-12-06 [1] CRAN (R 4.2.0)
#>  stringi        1.7.8      2022-07-11 [1] CRAN (R 4.2.1)
#>  stringr        1.4.0      2019-02-10 [1] CRAN (R 4.2.0)
#>  styler         1.7.0      2022-03-13 [1] CRAN (R 4.2.0)
#>  survival       3.4-0      2022-08-09 [1] CRAN (R 4.2.1)
#>  tibble       * 3.1.8      2022-07-22 [1] CRAN (R 4.2.1)
#>  tidymodels   * 1.0.0      2022-07-13 [1] CRAN (R 4.2.1)
#>  tidyr        * 1.2.0      2022-02-01 [1] CRAN (R 4.2.0)
#>  tidyselect     1.1.2      2022-02-21 [1] CRAN (R 4.2.0)
#>  timeDate       4021.104   2022-07-19 [1] CRAN (R 4.2.1)
#>  tune         * 1.0.0      2022-07-07 [1] CRAN (R 4.2.1)
#>  usemodels    * 0.2.0      2022-02-18 [1] CRAN (R 4.2.1)
#>  utf8           1.2.2      2021-07-24 [1] CRAN (R 4.2.0)
#>  vctrs          0.4.1      2022-04-13 [1] CRAN (R 4.2.0)
#>  withr          2.5.0      2022-03-03 [1] CRAN (R 4.2.0)
#>  workflows    * 1.0.0      2022-07-05 [1] CRAN (R 4.2.1)
#>  workflowsets * 1.0.0      2022-07-12 [1] CRAN (R 4.2.1)
#>  xfun           0.32       2022-08-10 [1] CRAN (R 4.2.1)
#>  yaml           2.3.5      2022-02-21 [1] CRAN (R 4.2.0)
#>  yardstick    * 1.0.0      2022-06-06 [1] CRAN (R 4.2.0)
#> 
#>  [1] C:/Users/Daniel.AK-HAMBURG/AppData/Local/R/win-library/4.2
#>  [2] C:/Program Files/R/R-4.2.1/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

[idea] return a list + print method

Hello @topepo

Imagine if foo <- usemodels::use_xgboost(mpg ~ ., mtcars) returned a list of code chunks, like

foo <- list();
foo$recipe <- "xgb_recipe <- 
  recipe(formula = mpg ~ ., data = mtcars) %>% 
  step_zv(all_predictors()) "
foo$recipe
#> [1] "xgb_recipe <- \n  recipe(formula = mpg ~ ., data = mtcars) %>% \n  step_zv(all_predictors()) "
cat(foo$recipe)
#> xgb_recipe <- 
#>   recipe(formula = mpg ~ ., data = mtcars) %>% 
#>   step_zv(all_predictors())

Created on 2020-06-18 by the reprex package (v0.3.0)

It would be perfect for building custom RStudio snippets programmatically, like this one by @RobertMyles: https://www.robertmylesmcdonnell.com/content/posts/modelscript/

The cat()'s could live inside a print() S3 method...
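
A minimal sketch of that idea (the class name and constructor are hypothetical, not part of usemodels):

# Hypothetical constructor for the proposed return value
new_use_code <- function(recipe, spec, workflow, grid = NULL, tune = NULL) {
  out <- list(recipe = recipe, spec = spec, workflow = workflow,
              grid = grid, tune = tune)
  structure(out, class = "use_code")
}

# print() method that cat()s each non-NULL chunk, as suggested above
print.use_code <- function(x, ...) {
  for (chunk in x) {
    if (!is.null(chunk)) {
      cat(chunk, "\n\n")
    }
  }
  invisible(x)
}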

I would love to contribute this if it sounds like a good idea.

code for glmnet_tune does not work

tune_grid(glmnet_workflow, resamples = stop("add your rsample object"), grid = glmnet_grid)

This code was supposed to work after being copied into a source file in RStudio, but I get an error message:

"check_rset(resamples) : add your rsample object"

I don't know how to solve this, as the previous code did not generate an error message.
I am working on a MacBook Air M1 (macOS 12.4) with RStudio 2022.02.3 Build 492.

handle missing data in palmerpenguins glmnet example

The problem

When trying to use the example code created by use_glmnet in the package documentation, it fails due to missing data. It might make sense to generate code that can handle missing data, or to have another example in the package documentation that works out of the box. After adding a step_unknown() recipe step, the code runs if I also filter out the rows with missing numeric data.

reprex_reprex.R (ildi, 2021-08-15):

library(tidymodels)
## Registered S3 method overwritten by 'tune':
##   method                   from   
##   required_pkgs.model_spec parsnip
## ── Attaching packages ────────────────────────────────────── tidymodels 0.1.3 ──
## ✓ broom        0.7.9      ✓ recipes      0.1.16
## ✓ dials        0.0.9      ✓ rsample      0.1.0 
## ✓ dplyr        1.0.7      ✓ tibble       3.1.3 
## ✓ ggplot2      3.3.5      ✓ tidyr        1.1.3 
## ✓ infer        1.0.0      ✓ tune         0.1.6 
## ✓ modeldata    0.1.1      ✓ workflows    0.2.3 
## ✓ parsnip      0.1.7      ✓ workflowsets 0.1.0 
## ✓ purrr        0.3.4      ✓ yardstick    0.0.8
## ── Conflicts ───────────────────────────────────────── tidymodels_conflicts() ──
## x purrr::discard() masks scales::discard()
## x dplyr::filter()  masks stats::filter()
## x dplyr::lag()     masks stats::lag()
## x recipes::step()  masks stats::step()
## • Use tidymodels_prefer() to resolve common conflicts.
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ readr   2.0.1     ✓ forcats 0.5.1
## ✓ stringr 1.4.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x readr::col_factor() masks scales::col_factor()
## x purrr::discard()    masks scales::discard()
## x dplyr::filter()     masks stats::filter()
## x stringr::fixed()    masks recipes::fixed()
## x dplyr::lag()        masks stats::lag()
## x readr::spec()       masks yardstick::spec()
library(palmerpenguins)

set.seed(99)

penguins_wo_missing_numeric <- penguins %>%
  filter(across(where(is.numeric), ~!is.na(.x)))

penguin_split <- initial_split(penguins_wo_missing_numeric, prop = 0.8)
penguin_folds <- vfold_cv(training(penguin_split), v = 5)

usemodels::use_glmnet(
  species ~ .,
  data = training(penguin_split),
  verbose = FALSE,
  tune = TRUE,
  colors = TRUE
)
## glmnet_recipe <- 
##   recipe(formula = species ~ ., data = training(penguin_split)) %>% 
##   step_novel(all_nominal(), -all_outcomes()) %>% 
##   step_dummy(all_nominal(), -all_outcomes()) %>% 
##   step_zv(all_predictors()) %>% 
##   step_normalize(all_predictors(), -all_nominal()) 
## 
## glmnet_spec <- 
##   multinom_reg(penalty = tune(), mixture = tune()) %>% 
##   set_mode("classification") %>% 
##   set_engine("glmnet") 
## 
## glmnet_workflow <- 
##   workflow() %>% 
##   add_recipe(glmnet_recipe) %>% 
##   add_model(glmnet_spec) 
## 
## glmnet_grid <- tidyr::crossing(penalty = 10^seq(-6, -1, length.out = 20), mixture = c(0.05, 
##     0.2, 0.4, 0.6, 0.8, 1)) 
## 
## glmnet_tune <- 
##   tune_grid(glmnet_workflow, resamples = stop("add your rsample object"), grid = glmnet_grid)
glmnet_recipe <-
  recipe(formula = species ~ ., data = training(penguin_split)) %>%
  step_novel(all_nominal(), -all_outcomes()) %>%
  step_dummy(all_nominal(), -all_outcomes()) %>%
  step_zv(all_predictors()) %>%
  step_normalize(all_predictors(), -all_nominal())

glmnet_spec <-
  multinom_reg(penalty = tune(), mixture = tune()) %>%
  set_mode("classification") %>%
  set_engine("glmnet")

glmnet_workflow <-
  workflow() %>%
  add_recipe(glmnet_recipe) %>%
  add_model(glmnet_spec)

glmnet_grid <- tidyr::crossing(
  penalty = 10^seq(-6, -1, length.out = 3),
  mixture = c(0.05, 0.6)
)

tune_grid(glmnet_workflow, resamples = penguin_folds, grid = glmnet_grid)
## ! Fold1: preprocessor 1/1: There are new levels in a factor: NA
## x Fold1: preprocessor 1/1, model 1/2: Error in lognet(xd, is.sparse, ix, jx, y, w...
## x Fold1: preprocessor 1/1, model 2/2: Error in lognet(xd, is.sparse, ix, jx, y, w...
## ! Fold2: preprocessor 1/1: There are new levels in a factor: NA
## x Fold2: preprocessor 1/1, model 1/2: Error in lognet(xd, is.sparse, ix, jx, y, w...
## x Fold2: preprocessor 1/1, model 2/2: Error in lognet(xd, is.sparse, ix, jx, y, w...
## ! Fold3: preprocessor 1/1: There are new levels in a factor: NA
## x Fold3: preprocessor 1/1, model 1/2: Error in lognet(xd, is.sparse, ix, jx, y, w...
## x Fold3: preprocessor 1/1, model 2/2: Error in lognet(xd, is.sparse, ix, jx, y, w...
## ! Fold4: preprocessor 1/1: There are new levels in a factor: NA
## x Fold4: preprocessor 1/1, model 1/2: Error in lognet(xd, is.sparse, ix, jx, y, w...
## x Fold4: preprocessor 1/1, model 2/2: Error in lognet(xd, is.sparse, ix, jx, y, w...
## ! Fold5: preprocessor 1/1: There are new levels in a factor: NA
## x Fold5: preprocessor 1/1, model 1/2: Error in lognet(xd, is.sparse, ix, jx, y, w...
## x Fold5: preprocessor 1/1, model 2/2: Error in lognet(xd, is.sparse, ix, jx, y, w...
## Warning: All models failed. See the `.notes` column.
## Warning: This tuning result has notes. Example notes on model fitting include:
## preprocessor 1/1, model 1/2: Error in lognet(xd, is.sparse, ix, jx, y, weights, offset, alpha, nobs, : NA/NaN/Inf in foreign function call (arg 5)
## preprocessor 1/1, model 2/2: Error in lognet(xd, is.sparse, ix, jx, y, weights, offset, alpha, nobs, : NA/NaN/Inf in foreign function call (arg 5)
## preprocessor 1/1, model 1/2: Error in lognet(xd, is.sparse, ix, jx, y, weights, offset, alpha, nobs, : NA/NaN/Inf in foreign function call (arg 5)
## # Tuning results
## # 5-fold cross-validation 
## # A tibble: 5 × 4
##   splits           id    .metrics .notes          
##   <list>           <chr> <list>   <list>          
## 1 <split [218/55]> Fold1 <NULL>   <tibble [3 × 1]>
## 2 <split [218/55]> Fold2 <NULL>   <tibble [3 × 1]>
## 3 <split [218/55]> Fold3 <NULL>   <tibble [3 × 1]>
## 4 <split [219/54]> Fold4 <NULL>   <tibble [3 × 1]>
## 5 <split [219/54]> Fold5 <NULL>   <tibble [3 × 1]>
# with step_unknown
glmnet_recipe <-
  recipe(formula = species ~ ., data = training(penguin_split)) %>%
  step_unknown(all_nominal(), -all_outcomes()) %>%
  step_novel(all_nominal(), -all_outcomes()) %>%
  step_dummy(all_nominal(), -all_outcomes()) %>%
  step_zv(all_predictors()) %>%
  step_normalize(all_predictors(), -all_nominal())

glmnet_workflow <-
  glmnet_workflow %>%
  update_recipe(glmnet_recipe)

tune_grid(glmnet_workflow, resamples = penguin_folds, grid = glmnet_grid)
## # Tuning results
## # 5-fold cross-validation 
## # A tibble: 5 × 4
##   splits           id    .metrics          .notes          
##   <list>           <chr> <list>            <list>          
## 1 <split [218/55]> Fold1 <tibble [12 × 6]> <tibble [0 × 1]>
## 2 <split [218/55]> Fold2 <tibble [12 × 6]> <tibble [0 × 1]>
## 3 <split [218/55]> Fold3 <tibble [12 × 6]> <tibble [0 × 1]>
## 4 <split [219/54]> Fold4 <tibble [12 × 6]> <tibble [0 × 1]>
## 5 <split [219/54]> Fold5 <tibble [12 × 6]> <tibble [0 × 1]>

Note: I am not sure why reprex gives this output, sorry about that :(

Release usemodels 0.2.0

Prepare for release:

  • Check current CRAN check results
  • Polish NEWS
  • devtools::build_readme()
  • urlchecker::url_check()
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • revdepcheck::cloud_check()
  • Update cran-comments.md
  • Review pkgdown reference index for, e.g., missing topics
  • Draft blog post
  • Ping Tracy Teal on Slack

Submit to CRAN:

  • usethis::use_version('minor')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_github_release()
  • usethis::use_dev_version()
  • Finish blog post
  • Tweet
  • Add link to blog post in pkgdown news menu

Upkeep for usemodels

2023

Necessary:

  • Update copyright holder in DESCRIPTION: person(given = "Posit Software, PBC", role = c("cph", "fnd"))
  • Double check license file uses '[package] authors' as copyright holder. Run use_mit_license()
  • Update email addresses *@rstudio.com -> *@posit.co
  • Update logo (https://github.com/rstudio/hex-stickers); run use_tidy_logo()
  • usethis::use_tidy_coc()
  • usethis::use_tidy_github_actions()

Optional:

  • Review 2022 checklist to see if you completed the pkgdown updates
  • Prefer pak::pak("org/pkg") over devtools::install_github("org/pkg") in README
  • Consider running use_tidy_dependencies() and/or replace compat files with use_standalone()
  • use_standalone("r-lib/rlang", "types-check") instead of home grown argument checkers
  • Add alt-text to pictures, plots, etc; see https://posit.co/blog/knitr-fig-alt/ for examples

Kernel Methods

Hello,
Once again a feature request: kernel methods are often the second-best choice after neural networks for a variety of tasks, but they rest on a comparatively cleaner theoretical background (minimization of a quadratic function).
There is already quite a lot implemented in kernlab and e1071, and they are no strangers to caret, e.g.

https://www.thekerneltrip.com/statistics/kernlab-vs-e1071/

It seems to me the number of tuning parameters is not immense, so perhaps it is possible to include these algorithms in usemodels?
Many thanks!
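
The template list in the README above now includes use_kernlab_svm_poly and use_kernlab_svm_rbf; the corresponding parsnip spec looks roughly like this (a sketch):

# A sketch of a kernlab RBF SVM spec in parsnip, along the lines of what
# use_kernlab_svm_rbf() would generate:
svm_rbf_spec <-
  svm_rbf(cost = tune(), rbf_sigma = tune()) %>%
  set_mode("classification") %>%
  set_engine("kernlab")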

Upkeep for usemodels

Pre-history

  • usethis::use_readme_rmd()
  • usethis::use_roxygen_md()
  • usethis::use_github_links()
  • usethis::use_pkgdown_github_pages()
  • usethis::use_tidy_github_labels()
  • usethis::use_tidy_style()
  • usethis::use_tidy_description()
  • urlchecker::url_check()

2020

  • usethis::use_package_doc()
    Consider letting usethis manage your @importFrom directives here.
    usethis::use_import_from() is handy for this.
  • usethis::use_testthat(3) and upgrade to 3e, testthat 3e vignette
  • Align the names of R/ files and test/ files for workflow happiness.
    usethis::rename_files() can be helpful.

2021

  • usethis::use_tidy_dependencies()
  • usethis::use_tidy_github_actions() and update artisanal actions to use setup-r-dependencies
  • Remove check environments section from cran-comments.md
  • Bump required R version in DESCRIPTION to 3.4
  • Use lifecycle instead of artisanal deprecation messages, as described in Communicate lifecycle changes in your functions
  • Add RStudio to DESCRIPTION as funder, if appropriate

2022

Move `master` branch to `main`

The master branch of this repository will soon be renamed to main, as part of a coordinated change across several GitHub organizations (including, but not limited to: tidyverse, r-lib, tidymodels, and sol-eng). We anticipate this will happen by the end of September 2021.

That will be preceded by a release of the usethis package, which will gain some functionality around detecting and adapting to a renamed default branch. There will also be a blog post at the time of this master --> main change.

The purpose of this issue is to:

  • Help us firm up the list of targeted repositories
  • Make sure all maintainers are aware of what's coming
  • Give us an issue to close when the job is done
  • Give us a place to put advice for collaborators re: how to adapt

message id: euphoric_snowdog

Release usemodels 0.1.0

Prepare for release:

  • devtools::build_readme()
  • Check current CRAN check results
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • revdepcheck::revdep_check(num_workers = 4)
  • Update cran-comments.md
  • Polish NEWS
  • Review pkgdown reference index for, e.g., missing topics
  • Draft blog post

Submit to CRAN:

  • usethis::use_version('minor')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_github_release()
  • usethis::use_dev_version()
  • Finish blog post
  • Tweet
  • Add link to blog post in pkgdown news menu

How about adding usemodels_addin() to tie it more closely to workflowsets?

This is a note about features I would like to see added in the future.

I listened to Max Kuhn's talk at LA R Users.
https://www.youtube.com/watch?v=2OfTEakSFXQ

At around 35:00 of the talk, he creates multiple models with parsnip_addin() in order to include them in a workflow set.

usemodels outputs both recipes and models.
It would be ideal to have a similar function where we can use Shiny to make a visual selection and then have the recipe and model code output.
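
For context, combining template-generated recipes and specs into a workflow set is already possible by hand; a sketch using workflowsets (the object names are assumed to come from earlier use_glmnet()/use_xgboost() calls):

# A sketch: collect template-generated preprocessors and model specs into a
# workflow set that can then be tuned with workflow_map().
library(workflowsets)

all_models <-
  workflow_set(
    preproc = list(basic = glmnet_recipe),
    models  = list(glmnet = glmnet_spec, xgboost = xgboost_spec),
    cross   = TRUE
  )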

I can't help you technically, but if you find the above useful in the future, I'd be happy to have it on the implementation list.

P.S.
Do you have any recommended tutorials to help with the technical implementation?

Finally.
Many thanks to the developers of tidymodels.

Wrap variables in one_of() in quotes

The problem

The variables that go in the one_of() selector should perhaps be quoted. Currently, if a user copies and pastes the code, they will get: Error: object '{var_name}' not found

Reproducible example

library(dplyr)
library(ggplot2)
library(recipes)

usemodels::use_glmnet(cty ~ ., data = mpg)
#> glmnet_recipe <- 
#>   recipe(formula = cty ~ ., data = mpg) %>% 
#>   step_string2factor(one_of(manufacturer, model, trans, drv, fl, class)) %>% 
#>   step_novel(all_nominal(), -all_outcomes()) %>% 
#>   step_dummy(all_nominal(), -all_outcomes()) %>% 
#>   step_zv(all_predictors()) %>% 
#>   step_normalize(all_predictors(), -all_nominal()) 
#> 
#> glmnet_spec <- 
#>   linear_reg(penalty = tune(), mixture = tune()) %>% 
#>   set_mode("regression") %>% 
#>   set_engine("glmnet") 
#> 
#> glmnet_workflow <- 
#>   workflow() %>% 
#>   add_recipe(glmnet_recipe) %>% 
#>   add_model(glmnet_spec) 
#> 
#> glmnet_grid <- tidyr::crossing(penalty = 10^seq(-6, -1, length.out = 20), mixture = c(0.05, 
#>     0.2, 0.4, 0.6, 0.8, 1)) 
#> 
#> glmnet_tune <- 
#>   tune_grid(glmnet_workflow, resamples = stop("add your rsample object"), grid = glmnet_grid)

glmnet_recipe <- 
  recipe(formula = cty ~ ., data = mpg) %>% 
  step_string2factor(one_of(manufacturer, model, trans, drv, fl, class)) %>% 
  step_novel(all_nominal(), -all_outcomes()) %>% 
  step_dummy(all_nominal(), -all_outcomes()) %>% 
  step_zv(all_predictors()) %>% 
  step_normalize(all_predictors(), -all_nominal()) 

prep(glmnet_recipe, mpg) %>% juice()
#> Error: object 'manufacturer' not found
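
Quoting the column names passed to one_of() avoids the error; a sketch of the corrected step:

# A sketch of the fix: pass the column names to one_of() as strings
glmnet_recipe <-
  recipe(formula = cty ~ ., data = mpg) %>%
  step_string2factor(one_of("manufacturer", "model", "trans", "drv", "fl", "class")) %>%
  step_novel(all_nominal(), -all_outcomes()) %>%
  step_dummy(all_nominal(), -all_outcomes()) %>%
  step_zv(all_predictors()) %>%
  step_normalize(all_predictors(), -all_nominal())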

devtools::session_info()
#> Warning in system("timedatectl", intern = TRUE): running command 'timedatectl'
#> had status 1
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.2 (2020-06-22)
#>  os       Ubuntu 16.04.6 LTS          
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language (EN)                        
#>  collate  C.UTF-8                     
#>  ctype    C.UTF-8                     
#>  tz       Etc/UTC                     
#>  date     2020-10-12                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source        
#>  assertthat    0.2.1      2019-03-21 [1] RSPM (R 4.0.2)
#>  backports     1.1.10     2020-09-15 [1] RSPM (R 4.0.2)
#>  callr         3.4.4      2020-09-07 [1] RSPM (R 4.0.2)
#>  class         7.3-17     2020-04-26 [2] CRAN (R 4.0.2)
#>  cli           2.0.2      2020-02-28 [1] RSPM (R 4.0.2)
#>  codetools     0.2-16     2018-12-24 [2] CRAN (R 4.0.2)
#>  colorspace    1.4-1      2019-03-18 [1] RSPM (R 4.0.2)
#>  crayon        1.3.4      2017-09-16 [1] RSPM (R 4.0.2)
#>  desc          1.2.0      2018-05-01 [1] RSPM (R 4.0.2)
#>  devtools      2.3.2      2020-09-18 [1] RSPM (R 4.0.2)
#>  dials         0.0.9      2020-09-16 [1] RSPM (R 4.0.2)
#>  DiceDesign    1.8-1      2019-07-31 [1] RSPM (R 4.0.2)
#>  digest        0.6.25     2020-02-23 [1] RSPM (R 4.0.2)
#>  dplyr       * 1.0.2      2020-08-18 [1] RSPM (R 4.0.2)
#>  ellipsis      0.3.1      2020-05-15 [1] RSPM (R 4.0.2)
#>  evaluate      0.14       2019-05-28 [1] RSPM (R 4.0.2)
#>  fansi         0.4.1      2020-01-08 [1] RSPM (R 4.0.2)
#>  foreach       1.5.0      2020-03-30 [1] RSPM (R 4.0.2)
#>  fs            1.5.0      2020-07-31 [1] RSPM (R 4.0.2)
#>  generics      0.0.2      2018-11-29 [1] RSPM (R 4.0.2)
#>  ggplot2     * 3.3.2      2020-06-19 [1] RSPM (R 4.0.2)
#>  glue          1.4.2      2020-08-27 [1] RSPM (R 4.0.2)
#>  gower         0.2.2      2020-06-23 [1] RSPM (R 4.0.2)
#>  GPfit         1.0-8      2019-02-08 [1] RSPM (R 4.0.2)
#>  gtable        0.3.0      2019-03-25 [1] RSPM (R 4.0.2)
#>  highr         0.8        2019-03-20 [1] RSPM (R 4.0.2)
#>  htmltools     0.5.0      2020-06-16 [1] RSPM (R 4.0.2)
#>  ipred         0.9-9      2019-04-28 [1] RSPM (R 4.0.2)
#>  iterators     1.0.12     2019-07-26 [1] RSPM (R 4.0.2)
#>  knitr         1.30       2020-09-22 [1] RSPM (R 4.0.2)
#>  lattice       0.20-41    2020-04-02 [2] CRAN (R 4.0.2)
#>  lava          1.6.8      2020-09-26 [1] RSPM (R 4.0.2)
#>  lhs           1.1.1      2020-10-05 [1] RSPM (R 4.0.2)
#>  lifecycle     0.2.0      2020-03-06 [1] RSPM (R 4.0.2)
#>  lubridate     1.7.9      2020-06-08 [1] RSPM (R 4.0.2)
#>  magrittr      1.5        2014-11-22 [1] RSPM (R 4.0.2)
#>  MASS          7.3-51.6   2020-04-26 [2] CRAN (R 4.0.2)
#>  Matrix        1.2-18     2019-11-27 [2] CRAN (R 4.0.2)
#>  memoise       1.1.0      2017-04-21 [1] RSPM (R 4.0.2)
#>  munsell       0.5.0      2018-06-12 [1] RSPM (R 4.0.2)
#>  nnet          7.3-14     2020-04-26 [2] CRAN (R 4.0.2)
#>  parsnip       0.1.3      2020-08-04 [1] RSPM (R 4.0.2)
#>  pillar        1.4.6      2020-07-10 [1] RSPM (R 4.0.2)
#>  pkgbuild      1.1.0      2020-07-13 [1] RSPM (R 4.0.2)
#>  pkgconfig     2.0.3      2019-09-22 [1] RSPM (R 4.0.2)
#>  pkgload       1.1.0      2020-05-29 [1] RSPM (R 4.0.2)
#>  plyr          1.8.6      2020-03-03 [1] RSPM (R 4.0.2)
#>  prettyunits   1.1.1      2020-01-24 [1] RSPM (R 4.0.2)
#>  pROC          1.16.2     2020-03-19 [1] RSPM (R 4.0.2)
#>  processx      3.4.4      2020-09-03 [1] RSPM (R 4.0.2)
#>  prodlim       2019.11.13 2019-11-17 [1] RSPM (R 4.0.2)
#>  ps            1.3.4      2020-08-11 [1] RSPM (R 4.0.2)
#>  purrr         0.3.4      2020-04-17 [1] RSPM (R 4.0.2)
#>  R6            2.4.1      2019-11-12 [1] RSPM (R 4.0.2)
#>  Rcpp          1.0.5      2020-07-06 [1] RSPM (R 4.0.2)
#>  recipes     * 0.1.13     2020-06-23 [1] RSPM (R 4.0.2)
#>  remotes       2.2.0      2020-07-21 [1] RSPM (R 4.0.2)
#>  rlang         0.4.7      2020-07-09 [1] RSPM (R 4.0.2)
#>  rmarkdown     2.4        2020-09-30 [1] CRAN (R 4.0.2)
#>  rpart         4.1-15     2019-04-12 [2] CRAN (R 4.0.2)
#>  rprojroot     1.3-2      2018-01-03 [1] RSPM (R 4.0.2)
#>  scales        1.1.1      2020-05-11 [1] RSPM (R 4.0.2)
#>  sessioninfo   1.1.1      2018-11-05 [1] RSPM (R 4.0.2)
#>  stringi       1.5.3      2020-09-09 [1] RSPM (R 4.0.2)
#>  stringr       1.4.0      2019-02-10 [1] RSPM (R 4.0.2)
#>  survival      3.1-12     2020-04-10 [2] CRAN (R 4.0.2)
#>  testthat      2.3.2      2020-03-02 [1] RSPM (R 4.0.2)
#>  tibble        3.0.3      2020-07-10 [1] RSPM (R 4.0.2)
#>  tidyr         1.1.2      2020-08-27 [1] RSPM (R 4.0.2)
#>  tidyselect    1.1.0      2020-05-11 [1] RSPM (R 4.0.2)
#>  timeDate      3043.102   2018-02-21 [1] RSPM (R 4.0.2)
#>  tune          0.1.1      2020-07-08 [1] RSPM (R 4.0.2)
#>  usemodels     0.0.1      2020-09-22 [1] RSPM (R 4.0.2)
#>  usethis       1.6.3      2020-09-17 [1] RSPM (R 4.0.2)
#>  vctrs         0.3.4      2020-08-29 [1] RSPM (R 4.0.2)
#>  withr         2.3.0      2020-09-22 [1] RSPM (R 4.0.2)
#>  workflows     0.2.1      2020-10-08 [1] RSPM (R 4.0.2)
#>  xfun          0.18       2020-09-29 [1] CRAN (R 4.0.2)
#>  yaml          2.2.1      2020-02-01 [1] RSPM (R 4.0.2)
#>  yardstick     0.0.7      2020-07-13 [1] RSPM (R 4.0.2)
#> 
#> [1] /home/rstudio-user/R/x86_64-pc-linux-gnu-library/4.0
#> [2] /opt/R/4.0.2/lib/R/library

Created on 2020-10-12 by the reprex package (v0.3.0)

Cubist and C50

First things first: this is really great work.
I have not checked on this package in a bit, but if they are not included already, I would love to see support for Cubist and C5.0.
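
Both templates are listed in the README above (use_cubist and use_C5.0), so calls along these lines should generate the corresponding code (a sketch; the mpg and iris data sets are just illustrative):

# Regression template for Cubist and classification template for C5.0:
usemodels::use_cubist(cty ~ ., data = ggplot2::mpg)
usemodels::use_C5.0(Species ~ ., data = iris)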
