Comments (7)
TaskRegr:meuse has the following unsupported feature types: factor
That means that the learner you are trying to apply does not support factor features. You must convert them to numeric first or enhance your learner with one-hot encoding using mlr3pipelines:
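library(mlr3pipelines)
# Prepend one-hot encoding and wrap the resulting Graph so it can be used like a Learner
learner = as_learner(po("encode") %>>% learner)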
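To see what the encoding step does to the feature types, here is a minimal sketch on a made-up toy data set (the data frame `dat`, its columns, and the task id `toy` are assumptions for illustration, not the meuse data from the issue). It applies `po("encode")` on its own and compares the feature types before and after.

```r
library(mlr3)
library(mlr3pipelines)

# Hypothetical toy data: numeric target y, numeric feature x, factor feature g
dat = data.frame(
  y = c(1.2, 3.4, 2.2, 5.1, 4.3, 2.9),
  x = c(0.1, 0.5, 0.3, 0.9, 0.7, 0.4),
  g = factor(c("a", "b", "a", "c", "b", "c"))
)
task = as_task_regr(dat, target = "y", id = "toy")

# Apply the encoding step alone; a PipeOp's $train() takes and returns a list
encoded = po("encode")$train(list(task))[[1]]

# Feature types before and after encoding
print(task$feature_types)
print(encoded$feature_types)
```

After encoding, the task should no longer contain a factor column, so a learner that only accepts numeric features can be trained on it.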
from mlr3.
@giuseppec @mllg I tried to train a regression model where the target is numeric but the covariates are a mix of categorical and numerical variables, and I get the following error:
Error: <TaskRegr:meuse> has the following unsupported feature types: factor
Can you please help me here?
@mohammadreza-sheykhmousa @giuseppec @mllg I have the same problem: how do I create a regression task with a numeric target (of course) and categorical covariates? I would be very grateful for your help.
- @giuseppec There is nothing wrong with using even linear regression for classification.
- @mohammadreza-sheykhmousa @frdanconia For a factor variable with n levels, create n-1 binary indicator variables.
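The n-1 indicator approach can be sketched in base R as follows. The data frame `df` and its column `soil` are made up for illustration. `model.matrix()` uses treatment contrasts by default, so the first factor level becomes the reference and gets no column of its own.

```r
# Hypothetical factor with 3 levels: a, b, c
df = data.frame(soil = factor(c("a", "b", "c", "a")))

# Build the design matrix and drop the intercept column,
# leaving n - 1 = 2 indicator columns (level "a" is the reference)
indicators = model.matrix(~ soil, data = df)[, -1, drop = FALSE]

print(colnames(indicators))
print(indicators)
```

Rows belonging to the reference level are all zeros; every other row has exactly one 1.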
I have the same problem: how do I create a regression task with a numeric target (of course) and categorical covariates? I would be very grateful for your help.
Creating such a task should be no problem; only some learners do not support factor features. If you can give an example where the task creation fails, please reopen.
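As a rough illustration of that distinction, the sketch below creates a regression task with a factor feature and then checks whether a learner declares support for all feature types in the task. The toy data and the helper name `supports_task` are assumptions for illustration; `regr.rpart` is used only because it ships with mlr3 and lists factor among its feature types.

```r
library(mlr3)

# Hypothetical toy data with a numeric target and a factor covariate
dat = data.frame(
  y = c(1.2, 3.4, 2.2, 5.1),
  g = factor(c("a", "b", "a", "c"))
)

# Task creation itself works with factor features
task = as_task_regr(dat, target = "y", id = "toy")

# Hypothetical helper: does the learner declare support for
# every feature type that occurs in the task?
supports_task = function(learner, task) {
  all(task$feature_types$type %in% learner$feature_types)
}

print(supports_task(lrn("regr.rpart"), task))
```

Running the same check with the learner that raises the error should show which feature types it does not accept.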
Hi @mllg, yes, you're right! I solved the problem as follows:
gr = pipeline_robustify(tsk_rgr, lrn) %>>% po("learner", lrn)
ede = resample(tsk_rgr, GraphLearner$new(gr), rsmp("holdout"))
tsk_regr1 = ede$task$clone()
Thanks to pipeline_robustify.
Unfortunately, I have the same problem with mgcv::gam, even though the original model itself can process factors. The mlr3pipelines::po option did not help me. Below is an example.
library(dplyr)
library(mlr3)
library(mlr3extralearners)
library(mlr3pipelines)
library(mgcv)
# Example from here:
# https://mlr3extralearners.mlr-org.com/reference/mlr_learners_classif.gam.html
# ... get data
t <- mlr3::tsk("spam")
t_data <- t$data()
# ... (re-)create the task
t_re <- as_task_classif(
  id = "spam",
  target = "type",
  positive = "spam",
  x = t_data
)
# ... init mgcv gam learner
l <- mlr3::lrn("classif.gam",
  formula = type ~ s(george) + s(charDollar) + s(edu) + ti(george, edu))
# ... train and get gam
l$train(t_re)
l$model # ... creates some output -> example of mlr3-page successfully reproduced
# NOW, the error due to the created factor
set.seed(123)
t_data_fac <- t_data %>%
  dplyr::mutate(fac = sample(x = c(1:3), size = nrow(t_data), replace = TRUE) %>% as.factor())
# create task with additional factor variable
t_fac = as_task_classif(
  id = "spam_fac",
  target = "type",
  positive = "spam",
  x = t_data_fac
)
# ... init mgcv gam learner with factor
l_fac <- mlr3::lrn("classif.gam",
  formula = type ~ s(george) + s(charDollar) + s(edu) + ti(george, edu) + fac)
# ... train and get gam
l_fac$train(t_fac)
# Error: <TaskClassif:spam_fac> has the following unsupported feature types: factor
# Here is the page of mlr3-implementation of mgcv::gam
# https://mlr3extralearners.mlr-org.com/reference/mlr_learners_classif.gam.html
# Perhaps in Feature Types: “logical”, “integer”, “numeric” --> "factor" is missing?
# ... but normally, factors are no issues for mgcv::gam
l_gam <- mgcv::gam(formula = type ~ s(george) + s(charDollar) + s(edu) + ti(george, edu) + fac,
  data = t_data_fac, family = "binomial")
l_gam %>% summary()
# Parametric coefficients:
#              Estimate Std. Error z value Pr(>|z|)
# (Intercept) 2.441e+03  2.479e+05   0.010    0.992
# fac2        4.675e-02  9.814e-02   0.476    0.634
# fac3        5.879e-02  9.689e-02   0.607    0.544
# Now, trying mlr3pipelines solution, but it does not work
l_fac_enc <- mlr3pipelines::po("encode") %>>%
  mlr3::lrn("classif.gam", formula = type ~ s(george) + s(charDollar) + s(edu) + ti(george, edu) + fac)
l_fac_enc$train(input = t_fac)
# Error in eval(predvars, data, env) : object 'fac' not found
# This happened PipeOp classif.gam's $train()
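A possible explanation for this last error, offered as a sketch rather than a confirmed diagnosis: `po("encode")` replaces the factor column with indicator columns, so a formula that still refers to `fac` cannot find it. The snippet below rebuilds the spam data with the `fac` column from the example above and lists the feature names after encoding. Converting the data to a plain data frame is a choice made here for simplicity, not part of the original example.

```r
library(mlr3)
library(mlr3pipelines)

# Rebuild the data with the extra factor column, as in the example above
set.seed(123)
dat = as.data.frame(tsk("spam")$data())
dat$fac = factor(sample(1:3, nrow(dat), replace = TRUE))
task = as_task_classif(dat, target = "type", positive = "spam", id = "spam_fac")

# Apply the encoding step alone and inspect the columns derived from fac
encoded = po("encode")$train(list(task))[[1]]
print(grep("^fac", encoded$feature_names, value = TRUE))

# The original column name is gone after encoding
print("fac" %in% encoded$feature_names)
```

If this reading is right, a formula passed to classif.gam behind `po("encode")` would presumably need to reference the encoded column names instead of `fac`.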