usemodels's Introduction

usemodels

The usemodels package is a helpful way of quickly creating code snippets to fit models using the tidymodels framework.

Given a simple formula and a data set, the use_* functions can create code that is appropriate for the data (given the model).

For example, using the palmerpenguins data with a glmnet model:

> library(usemodels)
> library(palmerpenguins)
> data(penguins)
> use_glmnet(body_mass_g ~ ., data = penguins)
glmnet_recipe <- 
  recipe(formula = body_mass_g ~ ., data = penguins) %>% 
  step_novel(all_nominal_predictors()) %>% 
  step_dummy(all_nominal_predictors()) %>% 
  step_zv(all_predictors()) %>% 
  step_normalize(all_numeric_predictors()) 

glmnet_spec <- 
  linear_reg(penalty = tune(), mixture = tune()) %>% 
  set_mode("regression") %>% 
  set_engine("glmnet") 

glmnet_workflow <- 
  workflow() %>% 
  add_recipe(glmnet_recipe) %>% 
  add_model(glmnet_spec) 

glmnet_grid <- tidyr::crossing(penalty = 10^seq(-6, -1, length.out = 20), mixture = c(0.05, 
    0.2, 0.4, 0.6, 0.8, 1)) 

glmnet_tune <- 
  tune_grid(glmnet_workflow, resamples = stop("add your rsample object"), grid = glmnet_grid) 

The recipe steps that are used (if any) depend on the type of data as well as the model. In this case, the first two steps handle the fact that some of the predictors (such as species) are factors, and glmnet requires all-numeric predictors. The last two steps are added because, for this model, the predictors should be on the same scale to be properly regularized.
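
The generated tune_grid() call contains a stop() placeholder for the resamples argument; replace it with your own rsample object before running the code. A minimal sketch (not generated by the package; note that the raw penguins data contains missing values, so additional preprocessing may be needed, as discussed in the issues below):

# A sketch: replace the stop() placeholder with a resampling object,
# e.g. cross-validation folds created with rsample.
set.seed(123)
penguin_folds <- rsample::vfold_cv(penguins, v = 5)

glmnet_tune <-
  tune_grid(glmnet_workflow, resamples = penguin_folds, grid = glmnet_grid)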

The package includes these templates:

> ls("package:usemodels", pattern = "use_")
 [1] "use_bag_tree_rpart"   "use_C5.0"             "use_cubist"          
 [4] "use_dbarts"           "use_earth"            "use_glmnet"          
 [7] "use_kernlab_svm_poly" "use_kernlab_svm_rbf"  "use_kknn"            
[10] "use_mgcv"             "use_mixOmics"         "use_nnet"            
[13] "use_ranger"           "use_rpart"            "use_xgboost"         
[16] "use_xrf"             

You can also copy code to the clipboard using the option clipboard = TRUE.
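
For example (assuming clipboard is simply another argument of the use_* functions):

# Generate the glmnet template and copy it to the clipboard (a sketch based
# on the clipboard option described above):
use_glmnet(body_mass_g ~ ., data = penguins, clipboard = TRUE)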

Installation

You can install usemodels with:

devtools::install_github("tidymodels/usemodels")
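
The released version is also on CRAN (the issues below mention version 0.2.0), so the usual installation should work as well:

# Install the CRAN release (assumes the package is currently on CRAN):
install.packages("usemodels")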

Contributing

This project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

usemodels's People

Contributors

bryceroney, emilhvitfeldt, hfrick, juliasilge, qiushiyan, topepo, vinchinzu


usemodels's Issues

Add catboost

Hello,
I do not have a snippet to provide, but it would be nice (perhaps as a Xmas present) to have usemodels also include catboost (see https://catboost.ai/). Thanks!

Add use_multiperceptron?

This is great so far. I'd request use_multiperceptron, if that can be added.

Is the eventual goal to have all of the supported tidymodels model types in here? Thanks so much.
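
The template list above already includes use_nnet; for reference, a multilayer perceptron spec in parsnip looks roughly like this (a sketch, not current usemodels output):

# A sketch of what a use_* template for a single-hidden-layer perceptron
# could emit, using parsnip's mlp() with the nnet engine:
mlp_spec <-
  mlp(hidden_units = tune(), penalty = tune(), epochs = tune()) %>%
  set_mode("classification") %>%
  set_engine("nnet")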

update recipes imports

Some methods have been moved from tune to recipes. When usemodels is documented, this warning appears:

Warning message:
replacing previous import ‘recipes::tune_args’ by ‘tune::tune_args’ when loading ‘usemodels’ 

Clipboard and/or RStudio API support

I really like the idea behind this package, and the user experience could be improved by also copying the generated code to the clipboard (probably using {clipr}) and/or inserting it into the active document using {rstudioapi}.

I'm not sure if this is an idea that has been canvassed before and rejected, but if there's support I'm happy to take a look at turning my hand to writing it.
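
A rough sketch of how that could look (dispatch_template() is a hypothetical helper; clipr::write_clip(), clipr::clipr_available(), rstudioapi::isAvailable(), and rstudioapi::insertText() are existing functions):

# Hypothetical helper: send generated code to the active document or the
# clipboard, falling back to printing it.
dispatch_template <- function(code) {
  if (rstudioapi::isAvailable()) {
    rstudioapi::insertText(text = code)   # insert at the cursor in RStudio
  } else if (clipr::clipr_available()) {
    clipr::write_clip(code)               # otherwise copy to the clipboard
  } else {
    cat(code)                             # fall back to printing
  }
  invisible(code)
}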

`use_xgboost()` uses only 6/8 possible tuning parameters

use_xgboost() only uses 6 of the 8 possible tuning parameters (i.e. mtry and stop_iter are not tune()d).
Is that a deliberate choice (if so, could/should it be documented?) or an oversight?
Or am I just missing something?

library(usemodels)
library(tidymodels, warn.conflicts = FALSE)
data(ames)

ames <-
  ames |>
  select(
    Sale_Price,
    Neighborhood,
    Gr_Liv_Area,
    Year_Built,
    Bldg_Type,
    Latitude,
    Longitude
  ) |> 
  mutate(Sale_Price = log10(Sale_Price))

ames_split <- initial_split(ames, prop = 0.80)
ames_train <- training(ames_split)
ames_test  <- testing(ames_split)

use_xgboost(
  Sale_Price ~ Neighborhood + Gr_Liv_Area + Year_Built + Bldg_Type + Latitude + Longitude, 
  data = ames_train
)
#> xgboost_recipe <- 
#>   recipe(formula = Sale_Price ~ Neighborhood + Gr_Liv_Area + Year_Built + Bldg_Type + 
#>     Latitude + Longitude, data = ames_train) %>% 
#>   step_novel(all_nominal_predictors()) %>% 
#>   step_dummy(all_nominal_predictors(), one_hot = TRUE) %>% 
#>   step_zv(all_predictors()) 
#> 
#> xgboost_spec <- 
#>   boost_tree(trees = tune(), min_n = tune(), tree_depth = tune(), learn_rate = tune(), 
#>     loss_reduction = tune(), sample_size = tune()) %>% 
#>   set_mode("regression") %>% 
#>   set_engine("xgboost") 
#> 
#> xgboost_workflow <- 
#>   workflow() %>% 
#>   add_recipe(xgboost_recipe) %>% 
#>   add_model(xgboost_spec) 
#> 
#> set.seed(8291)
#> xgboost_tune <-
#>   tune_grid(xgboost_workflow, resamples = stop("add your rsample object"), grid = stop("add number of candidate points"))
Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.1 (2022-06-23 ucrt)
#>  os       Windows 10 x64 (build 19044)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language en
#>  collate  German_Germany.utf8
#>  ctype    German_Germany.utf8
#>  tz       Europe/Berlin
#>  date     2022-08-12
#>  pandoc   2.18 @ C:/Program Files/RStudio/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package      * version    date (UTC) lib source
#>  assertthat     0.2.1      2019-03-21 [1] CRAN (R 4.2.0)
#>  backports      1.4.1      2021-12-13 [1] CRAN (R 4.2.0)
#>  broom        * 1.0.0      2022-07-01 [1] CRAN (R 4.2.1)
#>  class          7.3-20     2022-01-16 [2] CRAN (R 4.2.1)
#>  cli            3.3.0      2022-04-25 [1] CRAN (R 4.2.0)
#>  codetools      0.2-18     2020-11-04 [2] CRAN (R 4.2.1)
#>  colorspace     2.0-3      2022-02-21 [1] CRAN (R 4.2.0)
#>  DBI            1.1.3      2022-06-18 [1] CRAN (R 4.2.0)
#>  dials        * 1.0.0      2022-06-14 [1] CRAN (R 4.2.0)
#>  DiceDesign     1.9        2021-02-13 [1] CRAN (R 4.2.0)
#>  digest         0.6.29     2021-12-01 [1] CRAN (R 4.2.0)
#>  dplyr        * 1.0.9      2022-04-28 [1] CRAN (R 4.2.0)
#>  ellipsis       0.3.2      2021-04-29 [1] CRAN (R 4.2.0)
#>  evaluate       0.16       2022-08-09 [1] CRAN (R 4.2.1)
#>  fansi          1.0.3      2022-03-24 [1] CRAN (R 4.2.0)
#>  fastmap        1.1.0      2021-01-25 [1] CRAN (R 4.2.0)
#>  foreach        1.5.2      2022-02-02 [1] CRAN (R 4.2.0)
#>  fs             1.5.2      2021-12-08 [1] CRAN (R 4.2.0)
#>  furrr          0.3.0      2022-05-04 [1] CRAN (R 4.2.0)
#>  future         1.27.0     2022-07-22 [1] CRAN (R 4.2.1)
#>  future.apply   1.9.0      2022-04-25 [1] CRAN (R 4.2.0)
#>  generics       0.1.3      2022-07-05 [1] CRAN (R 4.2.1)
#>  ggplot2      * 3.3.6      2022-05-03 [1] CRAN (R 4.2.0)
#>  globals        0.16.0     2022-08-05 [1] CRAN (R 4.2.1)
#>  glue           1.6.2      2022-02-24 [1] CRAN (R 4.2.0)
#>  gower          1.0.0      2022-02-03 [1] CRAN (R 4.2.0)
#>  GPfit          1.0-8      2019-02-08 [1] CRAN (R 4.2.0)
#>  gtable         0.3.0      2019-03-25 [1] CRAN (R 4.2.0)
#>  hardhat        1.2.0      2022-06-30 [1] CRAN (R 4.2.1)
#>  highr          0.9        2021-04-16 [1] CRAN (R 4.2.0)
#>  htmltools      0.5.3      2022-07-18 [1] CRAN (R 4.2.1)
#>  infer        * 1.0.2      2022-06-10 [1] CRAN (R 4.2.0)
#>  ipred          0.9-13     2022-06-02 [1] CRAN (R 4.2.0)
#>  iterators      1.0.14     2022-02-05 [1] CRAN (R 4.2.0)
#>  knitr          1.39       2022-04-26 [1] CRAN (R 4.2.0)
#>  lattice        0.20-45    2021-09-22 [2] CRAN (R 4.2.1)
#>  lava           1.6.10     2021-09-02 [1] CRAN (R 4.2.0)
#>  lhs            1.1.5      2022-03-22 [1] CRAN (R 4.2.0)
#>  lifecycle      1.0.1      2021-09-24 [1] CRAN (R 4.2.0)
#>  listenv        0.8.0      2019-12-05 [1] CRAN (R 4.2.0)
#>  lubridate      1.8.0      2021-10-07 [1] CRAN (R 4.2.0)
#>  magrittr       2.0.3      2022-03-30 [1] CRAN (R 4.2.0)
#>  MASS           7.3-58.1   2022-08-03 [1] CRAN (R 4.2.1)
#>  Matrix         1.4-1      2022-03-23 [2] CRAN (R 4.2.1)
#>  modeldata    * 1.0.0      2022-07-01 [1] CRAN (R 4.2.1)
#>  munsell        0.5.0      2018-06-12 [1] CRAN (R 4.2.0)
#>  nnet           7.3-17     2022-01-16 [2] CRAN (R 4.2.1)
#>  parallelly     1.32.1     2022-07-21 [1] CRAN (R 4.2.1)
#>  parsnip      * 1.0.0      2022-06-16 [1] CRAN (R 4.2.0)
#>  pillar         1.8.0      2022-07-18 [1] CRAN (R 4.2.1)
#>  pkgconfig      2.0.3      2019-09-22 [1] CRAN (R 4.2.0)
#>  prodlim        2019.11.13 2019-11-17 [1] CRAN (R 4.2.0)
#>  purrr        * 0.3.4      2020-04-17 [1] CRAN (R 4.2.0)
#>  R.cache        0.16.0     2022-07-21 [1] CRAN (R 4.2.1)
#>  R.methodsS3    1.8.2      2022-06-13 [1] CRAN (R 4.2.0)
#>  R.oo           1.25.0     2022-06-12 [1] CRAN (R 4.2.0)
#>  R.utils        2.12.0     2022-06-28 [1] CRAN (R 4.2.1)
#>  R6             2.5.1      2021-08-19 [1] CRAN (R 4.2.0)
#>  Rcpp           1.0.9      2022-07-08 [1] CRAN (R 4.2.1)
#>  recipes      * 1.0.1      2022-07-07 [1] CRAN (R 4.2.1)
#>  reprex         2.0.1      2021-08-05 [1] CRAN (R 4.2.0)
#>  rlang          1.0.4      2022-07-12 [1] CRAN (R 4.2.1)
#>  rmarkdown      2.14       2022-04-25 [1] CRAN (R 4.2.0)
#>  rpart          4.1.16     2022-01-24 [2] CRAN (R 4.2.1)
#>  rsample      * 1.1.0      2022-08-08 [1] CRAN (R 4.2.1)
#>  rstudioapi     0.13       2020-11-12 [1] CRAN (R 4.2.0)
#>  scales       * 1.2.0      2022-04-13 [1] CRAN (R 4.2.0)
#>  sessioninfo    1.2.2      2021-12-06 [1] CRAN (R 4.2.0)
#>  stringi        1.7.8      2022-07-11 [1] CRAN (R 4.2.1)
#>  stringr        1.4.0      2019-02-10 [1] CRAN (R 4.2.0)
#>  styler         1.7.0      2022-03-13 [1] CRAN (R 4.2.0)
#>  survival       3.4-0      2022-08-09 [1] CRAN (R 4.2.1)
#>  tibble       * 3.1.8      2022-07-22 [1] CRAN (R 4.2.1)
#>  tidymodels   * 1.0.0      2022-07-13 [1] CRAN (R 4.2.1)
#>  tidyr        * 1.2.0      2022-02-01 [1] CRAN (R 4.2.0)
#>  tidyselect     1.1.2      2022-02-21 [1] CRAN (R 4.2.0)
#>  timeDate       4021.104   2022-07-19 [1] CRAN (R 4.2.1)
#>  tune         * 1.0.0      2022-07-07 [1] CRAN (R 4.2.1)
#>  usemodels    * 0.2.0      2022-02-18 [1] CRAN (R 4.2.1)
#>  utf8           1.2.2      2021-07-24 [1] CRAN (R 4.2.0)
#>  vctrs          0.4.1      2022-04-13 [1] CRAN (R 4.2.0)
#>  withr          2.5.0      2022-03-03 [1] CRAN (R 4.2.0)
#>  workflows    * 1.0.0      2022-07-05 [1] CRAN (R 4.2.1)
#>  workflowsets * 1.0.0      2022-07-12 [1] CRAN (R 4.2.1)
#>  xfun           0.32       2022-08-10 [1] CRAN (R 4.2.1)
#>  yaml           2.3.5      2022-02-21 [1] CRAN (R 4.2.0)
#>  yardstick    * 1.0.0      2022-06-06 [1] CRAN (R 4.2.0)
#> 
#>  [1] C:/Users/Daniel.AK-HAMBURG/AppData/Local/R/win-library/4.2
#>  [2] C:/Program Files/R/R-4.2.1/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

[idea] return a list + print method

Hello @topepo

Imagine if foo <- usemodels::use_xgboost(mpg ~ ., mtcars) returned a list of code chunks, like

foo <- list();
foo$recipe <- "xgb_recipe <- 
  recipe(formula = mpg ~ ., data = mtcars) %>% 
  step_zv(all_predictors()) "
foo$recipe
#> [1] "xgb_recipe <- \n  recipe(formula = mpg ~ ., data = mtcars) %>% \n  step_zv(all_predictors()) "
cat(foo$recipe)
#> xgb_recipe <- 
#>   recipe(formula = mpg ~ ., data = mtcars) %>% 
#>   step_zv(all_predictors())

Created on 2020-06-18 by the reprex package (v0.3.0)

It would be perfect for building custom RStudio snippets programmatically, like this one by @RobertMyles: https://www.robertmylesmcdonnell.com/content/posts/modelscript/

The cat()'s could live inside a print() S3 method...
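
A minimal sketch of that idea (the class name and constructor are hypothetical, not part of usemodels):

# Hypothetical constructor for the proposed return value
new_use_code <- function(recipe, spec, workflow, grid = NULL, tune = NULL) {
  out <- list(recipe = recipe, spec = spec, workflow = workflow,
              grid = grid, tune = tune)
  structure(out, class = "use_code")
}

# print() method that cat()s each non-NULL chunk, as suggested above
print.use_code <- function(x, ...) {
  for (chunk in x) {
    if (!is.null(chunk)) {
      cat(chunk, "\n\n")
    }
  }
  invisible(x)
}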

I would love to contribute this if it sounds like a good idea.

code for glmnet_tune does not work

tune_grid(glmnet_workflow, resamples = stop("add your rsample object"), grid = glmnet_grid)

This code was supposed to work after being copied into a source file in RStudio, but I get an error message:

"check_rset(resamples) : add your rsample object"

I don't know how to solve this, as the previous code did not generate an error message.
I am working on a MacBook Air M1 (macOS 12.4) with RStudio 2022.02.3 Build 492.

handle missing data in palmerpenguins glmnet example

The problem

When trying to use the example code created by use_glmnet in the package documentation, it fails due to missing data. It might make sense to generate code that can handle missing data, or to have another example in the package documentation that works out of the box. After adding a step_unknown() recipe step, the code runs if I also filter out the rows with missing numeric data.

reprex_reprex.R (ildi, 2021-08-15):

library(tidymodels)
## Registered S3 method overwritten by 'tune':
##   method                   from   
##   required_pkgs.model_spec parsnip
## ── Attaching packages ────────────────────────────────────── tidymodels 0.1.3 ──
## ✓ broom        0.7.9      ✓ recipes      0.1.16
## ✓ dials        0.0.9      ✓ rsample      0.1.0 
## ✓ dplyr        1.0.7      ✓ tibble       3.1.3 
## ✓ ggplot2      3.3.5      ✓ tidyr        1.1.3 
## ✓ infer        1.0.0      ✓ tune         0.1.6 
## ✓ modeldata    0.1.1      ✓ workflows    0.2.3 
## ✓ parsnip      0.1.7      ✓ workflowsets 0.1.0 
## ✓ purrr        0.3.4      ✓ yardstick    0.0.8
## ── Conflicts ───────────────────────────────────────── tidymodels_conflicts() ──
## x purrr::discard() masks scales::discard()
## x dplyr::filter()  masks stats::filter()
## x dplyr::lag()     masks stats::lag()
## x recipes::step()  masks stats::step()
## • Use tidymodels_prefer() to resolve common conflicts.
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ readr   2.0.1     ✓ forcats 0.5.1
## ✓ stringr 1.4.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x readr::col_factor() masks scales::col_factor()
## x purrr::discard()    masks scales::discard()
## x dplyr::filter()     masks stats::filter()
## x stringr::fixed()    masks recipes::fixed()
## x dplyr::lag()        masks stats::lag()
## x readr::spec()       masks yardstick::spec()
library(palmerpenguins)

set.seed(99)

penguins_wo_missing_numeric <- penguins %>%
  filter(across(where(is.numeric), ~!is.na(.x)))

penguin_split <- initial_split(penguins_wo_missing_numeric, prop = 0.8)
penguin_folds <- vfold_cv(training(penguin_split), v = 5)

usemodels::use_glmnet(
  species ~ .,
  data = training(penguin_split),
  verbose = FALSE,
  tune = TRUE,
  colors = TRUE
)
## glmnet_recipe <- 
##   recipe(formula = species ~ ., data = training(penguin_split)) %>% 
##   step_novel(all_nominal(), -all_outcomes()) %>% 
##   step_dummy(all_nominal(), -all_outcomes()) %>% 
##   step_zv(all_predictors()) %>% 
##   step_normalize(all_predictors(), -all_nominal()) 
## 
## glmnet_spec <- 
##   multinom_reg(penalty = tune(), mixture = tune()) %>% 
##   set_mode("classification") %>% 
##   set_engine("glmnet") 
## 
## glmnet_workflow <- 
##   workflow() %>% 
##   add_recipe(glmnet_recipe) %>% 
##   add_model(glmnet_spec) 
## 
## glmnet_grid <- tidyr::crossing(penalty = 10^seq(-6, -1, length.out = 20), mixture = c(0.05, 
##     0.2, 0.4, 0.6, 0.8, 1)) 
## 
## glmnet_tune <- 
##   tune_grid(glmnet_workflow, resamples = stop("add your rsample object"), grid = glmnet_grid)
glmnet_recipe <-
  recipe(formula = species ~ ., data = training(penguin_split)) %>%
  step_novel(all_nominal(), -all_outcomes()) %>%
  step_dummy(all_nominal(), -all_outcomes()) %>%
  step_zv(all_predictors()) %>%
  step_normalize(all_predictors(), -all_nominal())

glmnet_spec <-
  multinom_reg(penalty = tune(), mixture = tune()) %>%
  set_mode("classification") %>%
  set_engine("glmnet")

glmnet_workflow <-
  workflow() %>%
  add_recipe(glmnet_recipe) %>%
  add_model(glmnet_spec)

glmnet_grid <- tidyr::crossing(
  penalty = 10^seq(-6, -1, length.out = 3),
  mixture = c(0.05, 0.6)
)

tune_grid(glmnet_workflow, resamples = penguin_folds, grid = glmnet_grid)
## ! Fold1: preprocessor 1/1: There are new levels in a factor: NA
## x Fold1: preprocessor 1/1, model 1/2: Error in lognet(xd, is.sparse, ix, jx, y, w...
## x Fold1: preprocessor 1/1, model 2/2: Error in lognet(xd, is.sparse, ix, jx, y, w...
## ! Fold2: preprocessor 1/1: There are new levels in a factor: NA
## x Fold2: preprocessor 1/1, model 1/2: Error in lognet(xd, is.sparse, ix, jx, y, w...
## x Fold2: preprocessor 1/1, model 2/2: Error in lognet(xd, is.sparse, ix, jx, y, w...
## ! Fold3: preprocessor 1/1: There are new levels in a factor: NA
## x Fold3: preprocessor 1/1, model 1/2: Error in lognet(xd, is.sparse, ix, jx, y, w...
## x Fold3: preprocessor 1/1, model 2/2: Error in lognet(xd, is.sparse, ix, jx, y, w...
## ! Fold4: preprocessor 1/1: There are new levels in a factor: NA
## x Fold4: preprocessor 1/1, model 1/2: Error in lognet(xd, is.sparse, ix, jx, y, w...
## x Fold4: preprocessor 1/1, model 2/2: Error in lognet(xd, is.sparse, ix, jx, y, w...
## ! Fold5: preprocessor 1/1: There are new levels in a factor: NA
## x Fold5: preprocessor 1/1, model 1/2: Error in lognet(xd, is.sparse, ix, jx, y, w...
## x Fold5: preprocessor 1/1, model 2/2: Error in lognet(xd, is.sparse, ix, jx, y, w...
## Warning: All models failed. See the `.notes` column.
## Warning: This tuning result has notes. Example notes on model fitting include:
## preprocessor 1/1, model 1/2: Error in lognet(xd, is.sparse, ix, jx, y, weights, offset, alpha, nobs, : NA/NaN/Inf in foreign function call (arg 5)
## preprocessor 1/1, model 2/2: Error in lognet(xd, is.sparse, ix, jx, y, weights, offset, alpha, nobs, : NA/NaN/Inf in foreign function call (arg 5)
## preprocessor 1/1, model 1/2: Error in lognet(xd, is.sparse, ix, jx, y, weights, offset, alpha, nobs, : NA/NaN/Inf in foreign function call (arg 5)
## # Tuning results
## # 5-fold cross-validation 
## # A tibble: 5 × 4
##   splits           id    .metrics .notes          
##   <list>           <chr> <list>   <list>          
## 1 <split [218/55]> Fold1 <NULL>   <tibble [3 × 1]>
## 2 <split [218/55]> Fold2 <NULL>   <tibble [3 × 1]>
## 3 <split [218/55]> Fold3 <NULL>   <tibble [3 × 1]>
## 4 <split [219/54]> Fold4 <NULL>   <tibble [3 × 1]>
## 5 <split [219/54]> Fold5 <NULL>   <tibble [3 × 1]>
# with step_unknown
glmnet_recipe <-
  recipe(formula = species ~ ., data = training(penguin_split)) %>%
  step_unknown(all_nominal(), -all_outcomes()) %>%
  step_novel(all_nominal(), -all_outcomes()) %>%
  step_dummy(all_nominal(), -all_outcomes()) %>%
  step_zv(all_predictors()) %>%
  step_normalize(all_predictors(), -all_nominal())

glmnet_workflow <-
  glmnet_workflow %>%
  update_recipe(glmnet_recipe)

tune_grid(glmnet_workflow, resamples = penguin_folds, grid = glmnet_grid)
## # Tuning results
## # 5-fold cross-validation 
## # A tibble: 5 × 4
##   splits           id    .metrics          .notes          
##   <list>           <chr> <list>            <list>          
## 1 <split [218/55]> Fold1 <tibble [12 × 6]> <tibble [0 × 1]>
## 2 <split [218/55]> Fold2 <tibble [12 × 6]> <tibble [0 × 1]>
## 3 <split [218/55]> Fold3 <tibble [12 × 6]> <tibble [0 × 1]>
## 4 <split [219/54]> Fold4 <tibble [12 × 6]> <tibble [0 × 1]>
## 5 <split [219/54]> Fold5 <tibble [12 × 6]> <tibble [0 × 1]>

Note: I am not sure why reprex gives this output, sorry about that :(

Release usemodels 0.2.0

Prepare for release:

  • Check current CRAN check results
  • Polish NEWS
  • devtools::build_readme()
  • urlchecker::url_check()
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • revdepcheck::cloud_check()
  • Update cran-comments.md
  • Review pkgdown reference index for, e.g., missing topics
  • Draft blog post
  • Ping Tracy Teal on Slack

Submit to CRAN:

  • usethis::use_version('minor')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_github_release()
  • usethis::use_dev_version()
  • Finish blog post
  • Tweet
  • Add link to blog post in pkgdown news menu

Upkeep for usemodels

2023

Necessary:

  • Update copyright holder in DESCRIPTION: person(given = "Posit Software, PBC", role = c("cph", "fnd"))
  • Double check license file uses '[package] authors' as copyright holder. Run use_mit_license()
  • Update email addresses *@rstudio.com -> *@posit.co
  • Update logo (https://github.com/rstudio/hex-stickers); run use_tidy_logo()
  • usethis::use_tidy_coc()
  • usethis::use_tidy_github_actions()

Optional:

  • Review 2022 checklist to see if you completed the pkgdown updates
  • Prefer pak::pak("org/pkg") over devtools::install_github("org/pkg") in README
  • Consider running use_tidy_dependencies() and/or replace compat files with use_standalone()
  • use_standalone("r-lib/rlang", "types-check") instead of home grown argument checkers
  • Add alt-text to pictures, plots, etc; see https://posit.co/blog/knitr-fig-alt/ for examples

Kernel Methods

Hello,
Once again a feature request: kernel methods are often the second-best choice after neural networks for a variety of tasks, but they rest on a comparatively cleaner theoretical background (minimization of a quadratic function).
There is already quite a lot implemented in kernlab and e1071, and they are no strangers to caret, e.g.

https://www.thekerneltrip.com/statistics/kernlab-vs-e1071/

It seems to me the number of tuning parameters is not immense, so perhaps it is possible to include these algorithms in usemodels?
Many thanks!
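
The template list in the README above now includes use_kernlab_svm_poly and use_kernlab_svm_rbf; the corresponding parsnip spec looks roughly like this (a sketch):

# A sketch of a kernlab RBF SVM spec in parsnip, along the lines of what
# use_kernlab_svm_rbf() would generate:
svm_rbf_spec <-
  svm_rbf(cost = tune(), rbf_sigma = tune()) %>%
  set_mode("classification") %>%
  set_engine("kernlab")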

Upkeep for usemodels

Pre-history

  • usethis::use_readme_rmd()
  • usethis::use_roxygen_md()
  • usethis::use_github_links()
  • usethis::use_pkgdown_github_pages()
  • usethis::use_tidy_github_labels()
  • usethis::use_tidy_style()
  • usethis::use_tidy_description()
  • urlchecker::url_check()

2020

  • usethis::use_package_doc()
    Consider letting usethis manage your @importFrom directives here.
    usethis::use_import_from() is handy for this.
  • usethis::use_testthat(3) and upgrade to 3e, testthat 3e vignette
  • Align the names of R/ files and test/ files for workflow happiness.
    usethis::rename_files() can be helpful.

2021

  • usethis::use_tidy_dependencies()
  • usethis::use_tidy_github_actions() and update artisanal actions to use setup-r-dependencies
  • Remove check environments section from cran-comments.md
  • Bump required R version in DESCRIPTION to 3.4
  • Use lifecycle instead of artisanal deprecation messages, as described in Communicate lifecycle changes in your functions
  • Add RStudio to DESCRIPTION as funder, if appropriate

2022

Move `master` branch to `main`

The master branch of this repository will soon be renamed to main, as part of a coordinated change across several GitHub organizations (including, but not limited to: tidyverse, r-lib, tidymodels, and sol-eng). We anticipate this will happen by the end of September 2021.

That will be preceded by a release of the usethis package, which will gain some functionality around detecting and adapting to a renamed default branch. There will also be a blog post at the time of this master --> main change.

The purpose of this issue is to:

  • Help us firm up the list of targeted repositories
  • Make sure all maintainers are aware of what's coming
  • Give us an issue to close when the job is done
  • Give us a place to put advice for collaborators re: how to adapt

message id: euphoric_snowdog

Release usemodels 0.1.0

Prepare for release:

  • devtools::build_readme()
  • Check current CRAN check results
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • revdepcheck::revdep_check(num_workers = 4)
  • Update cran-comments.md
  • Polish NEWS
  • Review pkgdown reference index for, e.g., missing topics
  • Draft blog post

Submit to CRAN:

  • usethis::use_version('minor')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • usethis::use_github_release()
  • usethis::use_dev_version()
  • Finish blog post
  • Tweet
  • Add link to blog post in pkgdown news menu

How about adding usemodels_addin() to tie it more closely to workflowsets?

This is a note about features I would like to see added in the future.

I listened to Max Kuhn's talk at LA R Users.
https://www.youtube.com/watch?v=2OfTEakSFXQ

At around 35:00 of the talk, he creates multiple models with parsnip_addin() in order to include them in a workflow set.

usemodels outputs both recipes and models.
It would be ideal to have a similar function where we can use Shiny to make a visual selection and then have the recipe and model code output.
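
For context, combining template-generated recipes and specs into a workflow set is already possible by hand; a sketch using workflowsets (the object names are assumed to come from earlier use_glmnet()/use_xgboost() calls):

# A sketch: collect template-generated preprocessors and model specs into a
# workflow set that can then be tuned with workflow_map().
library(workflowsets)

all_models <-
  workflow_set(
    preproc = list(basic = glmnet_recipe),
    models  = list(glmnet = glmnet_spec, xgboost = xgboost_spec),
    cross   = TRUE
  )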

I can't help you technically, but if you find the above useful in the future, I'd be happy to have it on the implementation list.

P.S.
Do you have any recommended tutorials to help with the technical implementation?

Finally.
Many thanks to the developers of tidymodels.

Wrap variables in one_of() in quotes

The problem

The variables that go in the one_of() selector should perhaps be quoted. Currently, if a user copies and pastes the code, they will get: Error: object '{var_name}' not found

Reproducible example

library(dplyr)
library(ggplot2)
library(recipes)

usemodels::use_glmnet(cty ~ ., data = mpg)
#> glmnet_recipe <- 
#>   recipe(formula = cty ~ ., data = mpg) %>% 
#>   step_string2factor(one_of(manufacturer, model, trans, drv, fl, class)) %>% 
#>   step_novel(all_nominal(), -all_outcomes()) %>% 
#>   step_dummy(all_nominal(), -all_outcomes()) %>% 
#>   step_zv(all_predictors()) %>% 
#>   step_normalize(all_predictors(), -all_nominal()) 
#> 
#> glmnet_spec <- 
#>   linear_reg(penalty = tune(), mixture = tune()) %>% 
#>   set_mode("regression") %>% 
#>   set_engine("glmnet") 
#> 
#> glmnet_workflow <- 
#>   workflow() %>% 
#>   add_recipe(glmnet_recipe) %>% 
#>   add_model(glmnet_spec) 
#> 
#> glmnet_grid <- tidyr::crossing(penalty = 10^seq(-6, -1, length.out = 20), mixture = c(0.05, 
#>     0.2, 0.4, 0.6, 0.8, 1)) 
#> 
#> glmnet_tune <- 
#>   tune_grid(glmnet_workflow, resamples = stop("add your rsample object"), grid = glmnet_grid)

glmnet_recipe <- 
  recipe(formula = cty ~ ., data = mpg) %>% 
  step_string2factor(one_of(manufacturer, model, trans, drv, fl, class)) %>% 
  step_novel(all_nominal(), -all_outcomes()) %>% 
  step_dummy(all_nominal(), -all_outcomes()) %>% 
  step_zv(all_predictors()) %>% 
  step_normalize(all_predictors(), -all_nominal()) 

prep(glmnet_recipe, mpg) %>% juice()
#> Error: object 'manufacturer' not found
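
Quoting the column names passed to one_of() avoids the error; a sketch of the corrected step:

# A sketch of the fix: pass the column names to one_of() as strings
glmnet_recipe <-
  recipe(formula = cty ~ ., data = mpg) %>%
  step_string2factor(one_of("manufacturer", "model", "trans", "drv", "fl", "class")) %>%
  step_novel(all_nominal(), -all_outcomes()) %>%
  step_dummy(all_nominal(), -all_outcomes()) %>%
  step_zv(all_predictors()) %>%
  step_normalize(all_predictors(), -all_nominal())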

devtools::session_info()
#> Warning in system("timedatectl", intern = TRUE): running command 'timedatectl'
#> had status 1
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.0.2 (2020-06-22)
#>  os       Ubuntu 16.04.6 LTS          
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language (EN)                        
#>  collate  C.UTF-8                     
#>  ctype    C.UTF-8                     
#>  tz       Etc/UTC                     
#>  date     2020-10-12                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source        
#>  assertthat    0.2.1      2019-03-21 [1] RSPM (R 4.0.2)
#>  backports     1.1.10     2020-09-15 [1] RSPM (R 4.0.2)
#>  callr         3.4.4      2020-09-07 [1] RSPM (R 4.0.2)
#>  class         7.3-17     2020-04-26 [2] CRAN (R 4.0.2)
#>  cli           2.0.2      2020-02-28 [1] RSPM (R 4.0.2)
#>  codetools     0.2-16     2018-12-24 [2] CRAN (R 4.0.2)
#>  colorspace    1.4-1      2019-03-18 [1] RSPM (R 4.0.2)
#>  crayon        1.3.4      2017-09-16 [1] RSPM (R 4.0.2)
#>  desc          1.2.0      2018-05-01 [1] RSPM (R 4.0.2)
#>  devtools      2.3.2      2020-09-18 [1] RSPM (R 4.0.2)
#>  dials         0.0.9      2020-09-16 [1] RSPM (R 4.0.2)
#>  DiceDesign    1.8-1      2019-07-31 [1] RSPM (R 4.0.2)
#>  digest        0.6.25     2020-02-23 [1] RSPM (R 4.0.2)
#>  dplyr       * 1.0.2      2020-08-18 [1] RSPM (R 4.0.2)
#>  ellipsis      0.3.1      2020-05-15 [1] RSPM (R 4.0.2)
#>  evaluate      0.14       2019-05-28 [1] RSPM (R 4.0.2)
#>  fansi         0.4.1      2020-01-08 [1] RSPM (R 4.0.2)
#>  foreach       1.5.0      2020-03-30 [1] RSPM (R 4.0.2)
#>  fs            1.5.0      2020-07-31 [1] RSPM (R 4.0.2)
#>  generics      0.0.2      2018-11-29 [1] RSPM (R 4.0.2)
#>  ggplot2     * 3.3.2      2020-06-19 [1] RSPM (R 4.0.2)
#>  glue          1.4.2      2020-08-27 [1] RSPM (R 4.0.2)
#>  gower         0.2.2      2020-06-23 [1] RSPM (R 4.0.2)
#>  GPfit         1.0-8      2019-02-08 [1] RSPM (R 4.0.2)
#>  gtable        0.3.0      2019-03-25 [1] RSPM (R 4.0.2)
#>  highr         0.8        2019-03-20 [1] RSPM (R 4.0.2)
#>  htmltools     0.5.0      2020-06-16 [1] RSPM (R 4.0.2)
#>  ipred         0.9-9      2019-04-28 [1] RSPM (R 4.0.2)
#>  iterators     1.0.12     2019-07-26 [1] RSPM (R 4.0.2)
#>  knitr         1.30       2020-09-22 [1] RSPM (R 4.0.2)
#>  lattice       0.20-41    2020-04-02 [2] CRAN (R 4.0.2)
#>  lava          1.6.8      2020-09-26 [1] RSPM (R 4.0.2)
#>  lhs           1.1.1      2020-10-05 [1] RSPM (R 4.0.2)
#>  lifecycle     0.2.0      2020-03-06 [1] RSPM (R 4.0.2)
#>  lubridate     1.7.9      2020-06-08 [1] RSPM (R 4.0.2)
#>  magrittr      1.5        2014-11-22 [1] RSPM (R 4.0.2)
#>  MASS          7.3-51.6   2020-04-26 [2] CRAN (R 4.0.2)
#>  Matrix        1.2-18     2019-11-27 [2] CRAN (R 4.0.2)
#>  memoise       1.1.0      2017-04-21 [1] RSPM (R 4.0.2)
#>  munsell       0.5.0      2018-06-12 [1] RSPM (R 4.0.2)
#>  nnet          7.3-14     2020-04-26 [2] CRAN (R 4.0.2)
#>  parsnip       0.1.3      2020-08-04 [1] RSPM (R 4.0.2)
#>  pillar        1.4.6      2020-07-10 [1] RSPM (R 4.0.2)
#>  pkgbuild      1.1.0      2020-07-13 [1] RSPM (R 4.0.2)
#>  pkgconfig     2.0.3      2019-09-22 [1] RSPM (R 4.0.2)
#>  pkgload       1.1.0      2020-05-29 [1] RSPM (R 4.0.2)
#>  plyr          1.8.6      2020-03-03 [1] RSPM (R 4.0.2)
#>  prettyunits   1.1.1      2020-01-24 [1] RSPM (R 4.0.2)
#>  pROC          1.16.2     2020-03-19 [1] RSPM (R 4.0.2)
#>  processx      3.4.4      2020-09-03 [1] RSPM (R 4.0.2)
#>  prodlim       2019.11.13 2019-11-17 [1] RSPM (R 4.0.2)
#>  ps            1.3.4      2020-08-11 [1] RSPM (R 4.0.2)
#>  purrr         0.3.4      2020-04-17 [1] RSPM (R 4.0.2)
#>  R6            2.4.1      2019-11-12 [1] RSPM (R 4.0.2)
#>  Rcpp          1.0.5      2020-07-06 [1] RSPM (R 4.0.2)
#>  recipes     * 0.1.13     2020-06-23 [1] RSPM (R 4.0.2)
#>  remotes       2.2.0      2020-07-21 [1] RSPM (R 4.0.2)
#>  rlang         0.4.7      2020-07-09 [1] RSPM (R 4.0.2)
#>  rmarkdown     2.4        2020-09-30 [1] CRAN (R 4.0.2)
#>  rpart         4.1-15     2019-04-12 [2] CRAN (R 4.0.2)
#>  rprojroot     1.3-2      2018-01-03 [1] RSPM (R 4.0.2)
#>  scales        1.1.1      2020-05-11 [1] RSPM (R 4.0.2)
#>  sessioninfo   1.1.1      2018-11-05 [1] RSPM (R 4.0.2)
#>  stringi       1.5.3      2020-09-09 [1] RSPM (R 4.0.2)
#>  stringr       1.4.0      2019-02-10 [1] RSPM (R 4.0.2)
#>  survival      3.1-12     2020-04-10 [2] CRAN (R 4.0.2)
#>  testthat      2.3.2      2020-03-02 [1] RSPM (R 4.0.2)
#>  tibble        3.0.3      2020-07-10 [1] RSPM (R 4.0.2)
#>  tidyr         1.1.2      2020-08-27 [1] RSPM (R 4.0.2)
#>  tidyselect    1.1.0      2020-05-11 [1] RSPM (R 4.0.2)
#>  timeDate      3043.102   2018-02-21 [1] RSPM (R 4.0.2)
#>  tune          0.1.1      2020-07-08 [1] RSPM (R 4.0.2)
#>  usemodels     0.0.1      2020-09-22 [1] RSPM (R 4.0.2)
#>  usethis       1.6.3      2020-09-17 [1] RSPM (R 4.0.2)
#>  vctrs         0.3.4      2020-08-29 [1] RSPM (R 4.0.2)
#>  withr         2.3.0      2020-09-22 [1] RSPM (R 4.0.2)
#>  workflows     0.2.1      2020-10-08 [1] RSPM (R 4.0.2)
#>  xfun          0.18       2020-09-29 [1] CRAN (R 4.0.2)
#>  yaml          2.2.1      2020-02-01 [1] RSPM (R 4.0.2)
#>  yardstick     0.0.7      2020-07-13 [1] RSPM (R 4.0.2)
#> 
#> [1] /home/rstudio-user/R/x86_64-pc-linux-gnu-library/4.0
#> [2] /opt/R/4.0.2/lib/R/library

Created on 2020-10-12 by the reprex package (v0.3.0)

Cubist and C50

First things first: this is really great work.
I have not checked on this package in a bit, but if they are not included already, I would love to see support for Cubist and C5.0.
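
Both templates are listed in the README above (use_cubist and use_C5.0), so calls along these lines should generate the corresponding code (a sketch; the mpg and iris data sets are just illustrative):

# Regression template for Cubist and classification template for C5.0:
usemodels::use_cubist(cty ~ ., data = ggplot2::mpg)
usemodels::use_C5.0(Species ~ ., data = iris)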
