Comments (10)
I've made a new release: https://pypi.org/project/dirty-cat/
python -m pip install --upgrade dirty-cat
should fix the problem.
Tell us if it works!
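To confirm that the upgrade actually took effect in the environment you are running, you can print the installed version. This is only a minimal sketch using the standard library; the helper name `installed_version` is made up for illustration and is not part of dirty-cat.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent.

    Hypothetical helper for illustration; not part of dirty-cat.
    """
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the version of dirty-cat seen by this interpreter, or None.
print(installed_version("dirty-cat"))
```

If this prints `None` or an older version than expected, the upgrade probably went to a different Python environment than the one running your code.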
from skrub.
Are you using it on your own dataset? If so, maybe it is not in the expected datatype (a NumPy array or a pandas DataFrame)?
from skrub.
Yes, I'm using another dataset (https://www.kaggle.com/osmi/mental-health-in-tech-survey).
The variable I'm passing is a numpy.ndarray.
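Here is my code:
import pandas as pd
from dirty_cat import SimilarityEncoder

dataset_C = pd.read_csv('mental-health-in-tech-survey.csv')
dataset_C.info()
values = dataset_C[['Gender', 'Country', 'state']]
# Unique gender labels, sorted; .unique() returns a numpy.ndarray
sorted_values = values['Gender'].sort_values().unique()
sorted_values
type(sorted_values)
similarity_encoder = SimilarityEncoder(similarity='ngram')
# The encoder expects 2D input, hence the reshape to a single column
transformed_values = similarity_encoder.fit_transform(sorted_values.reshape(-1, 1))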
And the log:
log_dirty_cat_test.txt
from skrub.
I just had the same issue. Have you been able to fix it? I think the problem is from this line:
X = self._check_X(X)
That _check_X method returns a tuple, according to this line from scikit-learn's repo, hence the error.
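To illustrate the kind of mismatch described here, the sketch below uses hypothetical stand-ins rather than the real scikit-learn or dirty-cat code: one helper returns the validated data itself, the other returns a tuple, and a caller written against the first contract misbehaves when handed the second. The names `check_X_old`, `check_X_new`, `count_samples`, and `count_samples_fixed` are invented for this example, and the exact traceback from the attached log is not reproduced.

```python
# Hypothetical stand-ins; NOT the actual scikit-learn or dirty-cat code.

def check_X_old(X):
    """Older-style helper: validates the input and returns the rows."""
    return [list(row) for row in X]


def check_X_new(X):
    """Newer-style helper: returns a tuple (columns, n_samples, n_features)."""
    rows = [list(row) for row in X]
    n_samples = len(rows)
    n_features = len(rows[0]) if rows else 0
    columns = [[row[j] for row in rows] for j in range(n_features)]
    return columns, n_samples, n_features


def count_samples(X, check):
    """Caller written against the old contract: treats the result as rows."""
    X = check(X)
    return len(X)


def count_samples_fixed(X, check):
    """Caller that accepts either contract by unpacking a tuple result."""
    result = check(X)
    if isinstance(result, tuple):
        _, n_samples, _ = result
        return n_samples
    return len(result)


data = [["male"], ["female"], ["Male"], ["M"]]

print(count_samples(data, check_X_old))        # number of rows
print(count_samples(data, check_X_new))        # length of the tuple, not the data
print(count_samples_fixed(data, check_X_new))  # number of rows again
```

In this toy case the old-style caller silently gets the tuple's length instead of the number of rows; in real code the same mismatch can surface as an exception as soon as the tuple is used like an array.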
from skrub.
Unfortunately I didn't have time to fix it and went for another solution to my problem.
But I'm still very interested in this encoder. If I have any news I'll post it here. And please, if you have something too, let me know.
from skrub.
This looks to be fixed in the source (not yet released) version.
from skrub.
Yeah! It's working!
Thanks!
@anhtholee Is it working for you too?
I think we can close this issue, right?
from skrub.
@AC-Meira Yes it worked for me too.
from skrub.