Comments (9)

MaartenGr commented on September 16, 2024

I just pushed a fix to the main branch which should hopefully solve your issue!

from concept.

MaartenGr commented on September 16, 2024

Apologies for the late reply! It seems that there were no outliers found, which happens very rarely. I'll make sure that it gets fixed!

bakachan19 commented on September 16, 2024

Hi @MaartenGr.
I tried using concept in Google Colab. I installed it with pip, and the concept version is 0.2.1. I still get the error KeyError: '[-1] not found in axis' when I use a particular dataset. Any ideas on what might be the issue?

Thank you for your time and help.

MaartenGr commented on September 16, 2024

@bakachan19 If you install it through the main branch, it should have the fix for the error you are getting.

bakachan19 commented on September 16, 2024

Oh, I see. Thanks a lot @MaartenGr.

bakachan19 commented on September 16, 2024

Hi @MaartenGr.
I apologize for bothering you again.
I installed the concept package from the main branch and made sure that the scikit-learn version is compatible.
I no longer get the previous error, but I do get several different ones depending on the minimum concept size. I am using the default concept model configuration and only change min_concept_size.

  • with min_concept_size = 5, I get the following:
     56     try:
---> 57         return bound(*args, **kwds)
     58     except TypeError:
     59         # A TypeError occurs if the object does have such a method in its

ValueError: attempt to get argmax of an empty sequence
  • with min_concept_size = 10, I get this one:
/usr/local/lib/python3.9/dist-packages/concept/_model.py in <dictcomp>(.0)
    353      
    354 
--> 355         selected_exemplars = {cluster: mmr(self.cluster_embeddings[cluster],
    356                                            exemplar_embeddings[cluster],
    357                                            representative_images[cluster]["Indices"],

IndexError: list index out of range

Thank you for your time and help.

MaartenGr commented on September 16, 2024

@bakachan19 Strange, I am not entirely sure what is happening. Could you share your full code and the versions of the packages in your environment? I will look into this. In the meantime, there is an option to use images with BERTopic that should provide similar, albeit not identical, functionality.

bakachan19 commented on September 16, 2024

@MaartenGr I managed to make it work with a different UMAP configuration: by changing n_neighbors from 15 to a smaller number like 5, I was able to run the code with min_concept_size = 10. I think my data is unusual, and with some configurations it either does not find any clusters or clusters everything together...
For the environment setup, I use Google Colab with the following installation steps:

pip install scikit-learn==0.24.2
pip install git+https://github.com/MaartenGr/Concept.git

and then I just used the code provided in the tutorial:

from concept import ConceptModel
from umap import UMAP

concept_model = ConceptModel(min_concept_size = 10, umap_model = UMAP(n_neighbors=5, n_components=5, min_dist=0.0, metric='cosine', random_state = 5, low_memory = False))

concepts = concept_model.fit_transform(images_name, docs=all_nouns)
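The guess above, that some UMAP settings lead to no clusters at all or to everything landing in one cluster, can be checked directly on the cluster labels. The following is only a minimal sketch: `summarize_labels` is a hypothetical helper, not part of the concept package, and it assumes the labels follow the HDBSCAN convention in which -1 marks outliers.

```python
from collections import Counter


def summarize_labels(labels):
    """Summarize cluster labels, assuming -1 marks outliers (HDBSCAN convention).

    Hypothetical diagnostic helper; it is not part of the concept package.
    """
    counts = Counter(int(label) for label in labels)
    n_outliers = counts.pop(-1, 0)  # remove the outlier bucket if it exists
    n_clusters = len(counts)        # remaining keys are real clusters

    if n_clusters == 0:
        status = "no clusters"      # every point was treated as an outlier
    elif n_clusters == 1 and n_outliers == 0:
        status = "single cluster"   # everything was clustered together
    else:
        status = "ok"

    return {"n_clusters": n_clusters, "n_outliers": n_outliers, "status": status}


if __name__ == "__main__":
    print(summarize_labels([-1, -1, -1, -1]))   # all outliers
    print(summarize_labels([0, 0, 0, 0]))       # one cluster, no outliers
    print(summarize_labels([0, 0, 1, 1, -1]))   # two clusters plus an outlier
```

Running this on whatever label array your clustering step exposes, for a few values of n_neighbors and min_concept_size, shows quickly whether a given configuration is degenerate for your data.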

Thank you for your time!
Have a great day.

MaartenGr commented on September 16, 2024

Glad to hear that you solved the issue and thanks for sharing your solution. This will definitely help others having the same issue.
