Comments (9)
I just pushed a fix to the main branch which should hopefully solve your issue!
from concept.
Apologies for the late reply! It seems that there were no outliers found, which happens very rarely. I'll make sure that it gets fixed!
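For readers who hit the same message, here is a minimal sketch of how this kind of error can arise in pandas when no outlier label is present. It is not the code from the concept package: the `sizes` table and its values are made up for illustration, and the use of `errors="ignore"` is only one possible way to guard against the missing label, not necessarily the fix that was pushed.

```python
import pandas as pd

# Hypothetical per-cluster table: only labels 0 and 1 exist,
# so there is no outlier row with label -1.
sizes = pd.DataFrame({"count": [12, 7]}, index=[0, 1])

message = None
try:
    # Dropping a label that is not in the index raises a KeyError.
    sizes.drop([-1])
except KeyError as err:
    message = str(err)

print(message)

# One possible guard: ignore labels that are missing from the index.
kept = sizes.drop([-1], errors="ignore")
print(kept)
```

With `errors="ignore"`, the table is returned unchanged when the label -1 is absent.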
from concept.
Hi @MaartenGr.
I tried to use concept in Google Colab. I installed it with pip, and the concept version is 0.2.1. I still get the error KeyError: '[-1] not found in axis' when I use a particular dataset. Any ideas on what might be the issue?
Thank you for your time and help.
from concept.
@bakachan19 If you install it through the main branch, it should have the fix for the error you are getting.
from concept.
Oh, I see. Thanks a lot @MaartenGr.
from concept.
Hi @MaartenGr.
I apologize for bothering you again.
I installed the concept package from the main branch and made sure that the scikit-learn version is compatible.
I do not get the previous error anymore, but I do get several different ones depending on the value of min_concept_size. I am using the default concept model configuration and only change min_concept_size.
- with min_concept_size = 5, I get the following:
56 try:
---> 57 return bound(*args, **kwds)
58 except TypeError:
59 # A TypeError occurs if the object does have such a method in its
ValueError: attempt to get argmax of an empty sequence
- with min_concept_size = 10, I get this one:
/usr/local/lib/python3.9/dist-packages/concept/_model.py in <dictcomp>(.0)
353
354
--> 355 selected_exemplars = {cluster: mmr(self.cluster_embeddings[cluster],
356 exemplar_embeddings[cluster],
357 representative_images[cluster]["Indices"],
IndexError: list index out of range
Thank you for your time and help.
from concept.
@bakachan19 Strange, I am not entirely sure what is happening. Could you share your full code and the versions of the packages in your environment? I will look into this. In the meantime, there is an option to use images with BERTopic that should provide similar, albeit not identical, functionality.
from concept.
@MaartenGr I did manage to make it work with a different UMAP configuration: by changing n_neighbors from 15 to a smaller number like 5, I was able to run the code with min_concept_size = 10. I think this is because my data is unusual: with some configurations it does not find any clusters, or maybe it clusters everything together...
For the environment setup I use google colab with the following installation steps:
pip install scikit-learn==0.24.2
pip install git+https://github.com/MaartenGr/Concept.git
and then I just used the code provided in the tutorial:
from concept import ConceptModel
from umap import UMAP
umap_model = UMAP(n_neighbors=5, n_components=5, min_dist=0.0, metric='cosine', random_state=5, low_memory=False)
concept_model = ConceptModel(min_concept_size=10, umap_model=umap_model)
concepts = concept_model.fit_transform(images_name, docs=all_nouns)
Thank you for your time!
Have a great day.
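The suspicion above, that some configurations find no clusters or put everything into a single one, can be checked directly on the cluster labels. The following is a rough sketch using only the standard library. It assumes the labels follow the convention in which -1 marks outliers (as the earlier '[-1] not found in axis' error suggests). The helper `summarize_labels` and the example label lists are hypothetical and not part of the concept API.

```python
from collections import Counter


def summarize_labels(labels):
    """Summarize cluster labels, assuming -1 marks outliers."""
    counts = Counter(int(label) for label in labels)
    # Remove the outlier bucket so only real clusters remain in `counts`.
    n_outliers = counts.pop(-1, 0)
    return {
        "n_clusters": len(counts),
        "n_outliers": n_outliers,
        "largest_cluster": max(counts.values(), default=0),
    }


# Made-up labels for illustration only.
print(summarize_labels([-1, -1, -1, -1]))     # nothing was clustered
print(summarize_labels([0, 0, 0, 0, 0]))      # everything in one cluster
print(summarize_labels([0, 0, 1, 1, -1, 2]))  # a mix of clusters and outliers
```

If `n_clusters` is 0, or if one cluster holds nearly all items, trying a smaller n_neighbors or a different min_concept_size, as described above, may be worth a try.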
from concept.
Glad to hear that you solved the issue and thanks for sharing your solution. This will definitely help others having the same issue.
from concept.