Comments (3)
I would love to work on both!
About the improved example: I currently cannot open it from the website. If I click the link on this page, nothing happens. Is that because the docs are not in sync with the latest commits? EDIT: it's #716.
As a reference, I guess the source code for the example you mention is here.
EDIT2: I was thinking about creating a custom function transformer to generate holiday features, following the example Olivier wrote here. What do you think, Vincent?
from skrub.
Hey @baggiponte, thanks for the suggestion!
- The weekend binary variable would bring value and is straightforward to implement!
- The holidays package would add a new runtime dependency, and we prefer avoiding that.
Instead, we could improve one of our existing examples (and preferably increase its classification or regression score) using the holidays package as a documentation dependency. This way, we would demonstrate its usefulness while showcasing its usage with skrub.
Would you be interested in implementing one of those (or both)?
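For reference, a minimal sketch of what the weekend binary variable could look like using plain pandas. The function name `add_weekend_flag` and the sample dates are hypothetical and only for illustration; this is not how skrub's `DatetimeEncoder` implements anything.

```python
import pandas as pd


def add_weekend_flag(dates: pd.Series) -> pd.Series:
    """Return 1 for Saturday/Sunday and 0 for any other day."""
    # pandas encodes the day of week as Monday=0 ... Sunday=6
    return (dates.dt.dayofweek >= 5).astype(int)


# Hypothetical sample: a Friday, a Saturday, a Sunday and a Monday
dates = pd.Series(
    pd.to_datetime(["2024-01-05", "2024-01-06", "2024-01-07", "2024-01-08"])
)
is_weekend = add_weekend_flag(dates)
```

Here `is_weekend` holds one 0/1 value per input date and can be appended as an extra column next to the other datetime features.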
- Yes indeed, the documentation is being fixed, but you can access the source code file you mentioned.
  This file is actually generated via nbconvert: we sync a notebook to a Python file, so that changes in the notebook are reflected in the file. We then manually process it to make it "sphinx compatible", i.e. we replace `#%` with `#######` for cell breaks. Therefore, I suggest you reproduce example 03 in your own notebook, make the appropriate changes, and put these changes into the source file, trying to respect the sphinx syntax. I'll help you debug it if needed.
- Seems like a good idea to use a light `FunctionTransformer`, go ahead!
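A rough sketch of such a light `FunctionTransformer`, assuming a dataframe with a single date column. The helper `add_calendar_features`, the column name `date` and the hand-written `HOLIDAYS` list are all hypothetical placeholders; in the actual example the holiday dates would presumably come from the holidays package instead.

```python
import pandas as pd
from sklearn.preprocessing import FunctionTransformer

# Hypothetical, hand-written holiday list used only for illustration;
# the real example could build it from the `holidays` package.
HOLIDAYS = pd.to_datetime(["2024-01-01", "2024-12-25"])


def add_calendar_features(df: pd.DataFrame, date_col: str = "date") -> pd.DataFrame:
    """Return a copy of df with weekend and holiday indicator columns."""
    out = df.copy()
    dates = pd.to_datetime(out[date_col])
    # Monday=0 ... Sunday=6, so 5 and 6 are the weekend
    out["is_weekend"] = (dates.dt.dayofweek >= 5).astype(int)
    # Drop the time of day before comparing against the holiday dates
    out["is_holiday"] = dates.dt.normalize().isin(HOLIDAYS).astype(int)
    return out


# Stateless transformer that can be used as a step in a scikit-learn pipeline
calendar_features = FunctionTransformer(
    add_calendar_features, kw_args={"date_col": "date"}
)

df = pd.DataFrame({"date": ["2024-01-01", "2024-01-06", "2024-01-09"]})
features = calendar_features.fit_transform(df)
```

Because the function is stateless, nothing is learned in `fit`, and no new runtime dependency is needed in the library itself: only the example would import the package that provides the holiday dates.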