tidymodels / modeldatatoo
More Data Sets Useful for Modeling Examples
Home Page: https://modeldatatoo.tidymodels.org
License: Other
it should be split into two prompts instead of one
The Parkinson's disease data set from Sakar et al. (2019). The data are from the UCI ML repository, and the data manipulation code is here.
Prepare for release:
git pull
urlchecker::url_check()
devtools::build_readme()
devtools::check(remote = TRUE, manual = TRUE)
devtools::check_win_devel()
revdepcheck::cloud_check()
Update cran-comments.md
git push
Submit to CRAN:
usethis::use_version('minor')
devtools::submit_cran()
Wait for CRAN...
usethis::use_github_release()
usethis::use_dev_version(push = TRUE)
2023
Necessary:
person(given = "Posit Software, PBC", role = c("cph", "fnd"))
use_mit_license()
use_tidy_logo()
usethis::use_tidy_coc()
usethis::use_tidy_github_actions()
Optional:
Prefer pak::pak("org/pkg") over devtools::install_github("org/pkg") in README
Consider running use_tidy_dependencies() and/or replace compat files with use_standalone()
use_standalone("r-lib/rlang", "types-check") instead of home grown argument checkers
There are two different feature sets (one has many morphology features, and another is a few descriptive morphology terms).
Following up on #10.
Are we game for renaming the chicago_taxi data? Given that we will end up prefixing and suffixing the name of the data throughout analyses, it might be nice if the object's name were a bit shorter. I'd suggest taxi but am game for other ideas too. :)
should be done by the next release
In my review of #28 I failed to catch that we need to remove/alter the docs for columns that no longer exist in the main version🫢🤠
Prepare for release:
git pull
urlchecker::url_check()
devtools::build_readme()
devtools::check(remote = TRUE, manual = TRUE)
devtools::check_win_devel()
revdepcheck::cloud_check()
Update cran-comments.md
git push
Submit to CRAN:
usethis::use_version('minor')
devtools::submit_cran()
Wait for CRAN...
usethis::use_github_release()
usethis::use_dev_version(push = TRUE)
First release:
usethis::use_news_md()
usethis::use_cran_comments()
Proofread Title: and Description:
Check that all exported functions have @return and @examples
Check that Authors@R: includes a copyright holder (role 'cph')
Prepare for release:
git pull
urlchecker::url_check()
devtools::build_readme()
devtools::check(remote = TRUE, manual = TRUE)
devtools::check_win_devel()
git push
Submit to CRAN:
usethis::use_version('minor')
devtools::submit_cran()
Wait for CRAN...
usethis::use_github_release()
usethis::use_dev_version(push = TRUE)
usethis::use_news_md()
Following up on #10.
The variable names in the Chicago taxi data are:
dplyr::glimpse(
  pins::pin_read(modeldatatoo:::modeldatatoo_board, "chicago_taxi")
)
#> Rows: 10,000
#> Columns: 14
#> $ tip <fct> yes, no, yes, no, yes, yes, yes, yes, yes, no, yes, y…
#> $ trip_id <fct> 9fee331b5a4b19daa19a149cfdfbeea91eb4dff9, b1dbc452aeb…
#> $ trip_seconds <dbl> 333, 2692, 1076, 1599, 346, 1790, 885, 720, 2190, 216…
#> $ trip_miles <dbl> 1.24, 5.39, 3.01, 18.38, 1.76, 13.65, 3.71, 4.80, 18.…
#> $ fare <dbl> 6.50, 25.50, 11.23, 45.25, 7.50, 34.25, 11.25, 14.75,…
#> $ tolls <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
#> $ extras <dbl> 0.0, 0.0, 0.0, 0.0, 1.0, 4.5, 0.0, 1.5, 4.0, 4.0, 5.0…
#> $ trip_total <dbl> 8.00, 25.50, 14.10, 45.25, 11.00, 49.06, 12.62, 19.60…
#> $ payment_type <fct> Credit Card, Prcard, Mobile, Prcard, Credit Card, Cre…
#> $ company <fct> Sun Taxi, Flash Cab, City Service, Sun Taxi, Sun Taxi…
#> $ local_trip <fct> no, no, no, no, no, no, no, no, no, yes, no, NA, no, …
#> $ trip_start_dow <fct> Thu, Sat, Wed, Sat, Sun, Mon, Mon, Tue, Fri, Thu, Tue…
#> $ trip_start_month <fct> Feb, Mar, Feb, Apr, Jan, Feb, Mar, Mar, Jan, Apr, Apr…
#> $ trip_start_hour <int> 13, 12, 17, 6, 15, 17, 21, 9, 19, 12, 20, 22, 10, 10,…
Created on 2023-06-30 with reprex v2.0.2
The unit of observation in the data is a trip; would we be up for removing the trip_ prefix from those variables that have it? It might make sense in that case to rename total to total_cost.
I proposed in #14 that we document trip_start_dow, trip_start_month, and trip_start_hour in such a way that mentions these are the values at the start of the trip; would we be up for removing the start_ prefix from those variables that have it?
Altogether, these changes should make for more concise variable names.
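To make the proposal concrete, here is a minimal base R sketch of what the renaming could produce. It works only on the column names shown in the glimpse() output, not on the pinned data itself, and the objects old_names, new_names, and rename_map are illustrative names rather than anything in the package.

```r
# Column names as they appear in the glimpse() output for the Chicago taxi data.
old_names <- c(
  "tip", "trip_id", "trip_seconds", "trip_miles", "fare", "tolls",
  "extras", "trip_total", "payment_type", "company", "local_trip",
  "trip_start_dow", "trip_start_month", "trip_start_hour"
)

# Drop a leading "trip_", then a leading "start_". The ^ anchor leaves
# names such as "local_trip" untouched.
new_names <- sub("^trip_", "", old_names)
new_names <- sub("^start_", "", new_names)

# "trip_total" has become "total"; rename it to "total_cost" as suggested.
new_names[new_names == "total"] <- "total_cost"

# Side-by-side view of the proposed mapping (illustrative only).
rename_map <- data.frame(old = old_names, new = new_names)
print(rename_map)
```

Under these assumptions the new names stay unique, so they could be assigned to a data frame with names(x) <- new_names, where x stands for whatever object holds the data.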
Prepare for release:
git pull
urlchecker::url_check()
devtools::build_readme()
devtools::check(remote = TRUE, manual = TRUE)
devtools::check_win_devel()
git push
Submit to CRAN:
usethis::use_version('patch')
devtools::submit_cran()
Wait for CRAN...
usethis::use_github_release()
usethis::use_dev_version(push = TRUE)