
Comments (7)

Arsalan-Vosough commented on May 20, 2024

Hi,

I used fit_predict and it worked.

Thanks for your quick response.

from k-means-constrained.

joshlk commented on May 20, 2024

Hi, great to hear that you're using it 😀.

Can you please provide a minimal working example? Thanks, Josh


Arsalan-Vosough commented on May 20, 2024

Longitude Latitude
0 0.143799 0.549696
1 0.748523 0.666809
2 0.893091 0.485969
3 0.633522 0.273117
4 0.691772 0.763385
5 0.671481 0.112269
6 0.250957 0.781550
7 0.199018 0.798926
8 0.680017 0.201779
9 0.270592 0.461235
10 0.648789 0.140139
11 0.417517 0.114667
12 0.733276 0.254028
13 0.283617 0.515177
14 0.256486 0.788757
15 0.369168 0.380070
16 0.265186 0.596243
17 0.356121 0.442192
18 0.651694 0.876345
19 0.166674 0.829551
20 0.623306 0.034364
21 0.250798 0.911847
22 0.448605 0.517670
23 0.529576 0.000000
24 0.622372 0.215839
25 0.492679 0.621276
26 0.349826 0.242467
27 0.561980 0.855117
28 0.543573 1.000000
29 0.000000 0.572787
30 0.285501 0.358724
31 0.398475 0.106590
32 1.000000 0.452500
33 0.367203 0.419650
34 0.672594 0.257735
35 0.590781 0.022893
36 0.459228 0.146675
37 0.480092 0.666456
38 0.451271 0.225341
39 0.767639 0.395854
40 0.702797 0.589130

This is my data, normalized with a min-max scaler. I used this function for clustering:

def k_means_cons(k,minVal,maxVal,data):

    clf = KMeansConstrained(
        n_clusters=k,
        size_min=minVal,
        size_max=maxVal,
        random_state=0,max_iter = 300)
    clf.fit(data)

    clf.cluster_centers_

    Label = clf.predict(data)
    return Label

Label = k_means_cons(10,3,5,normalized)

and it returns:

array([0, 9, 7, 1, 9, 8, 5, 5, 1, 4, 8, 3, 1, 4, 5, 4, 0, 4, 6, 5, 8, 5,
2, 8, 1, 2, 3, 6, 6, 0, 4, 3, 7, 4, 1, 8, 3, 2, 3, 7, 9])

As you can see, the cluster with label 4 has 6 elements, which exceeds size_max = 5.
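The count can be checked directly from the label array posted above. This is a small sketch using only the standard library; `size_violations` is a hypothetical helper name, not part of k-means-constrained.

```python
from collections import Counter

# Labels copied from the output posted above (41 points, k = 10).
labels = [0, 9, 7, 1, 9, 8, 5, 5, 1, 4, 8, 3, 1, 4, 5, 4, 0, 4, 6, 5, 8, 5,
          2, 8, 1, 2, 3, 6, 6, 0, 4, 3, 7, 4, 1, 8, 3, 2, 3, 7, 9]


def size_violations(labels, n_clusters, size_min, size_max):
    """Return {label: size} for every cluster whose size is out of bounds."""
    sizes = Counter(labels)  # missing labels count as 0
    return {c: sizes[c] for c in range(n_clusters)
            if not size_min <= sizes[c] <= size_max}


violations = size_violations(labels, n_clusters=10, size_min=3, size_max=5)
print(violations)  # {4: 6}
```

Only label 4 is out of bounds here; every other cluster has between 3 and 5 members.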


joshlk commented on May 20, 2024

Thanks. What exact normalisation did you use?

Which sklearn and ortools versions are you using?


Arsalan-Vosough commented on May 20, 2024

minmax_scale = preprocessing.MinMaxScaler(feature_range=(0,1))
scaled_feature = minmax_scale.fit_transform(data)

sklearn version is 0.23.2 and ortools version is 8.1.8487


Arsalan-Vosough commented on May 20, 2024

I think I made it too complex. Briefly: if you run the code below, it sometimes gives you a cluster with more than size_max elements.

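import random

import pandas as pd
from k_means_constrained import KMeansConstrained
from sklearn import preprocessing


def generatedb(numberPatient):
    """Sample random (Longitude, Latitude) points inside a four-sided region."""
    patient = []
    # Note: "<=" means the loop stops once there are numberPatient + 1 points.
    while len(patient) <= numberPatient:
        x = random.uniform(51.078418, 51.701563)
        y = random.uniform(35.514715, 35.901148)
        # Keep the point only if it lies between the four bounding lines.
        if (-0.7184706 * x + 72.576 < y < -0.722386 * x + 72.866
                and 0.692935 * x + 0.0551 < y < 0.549044 * x + 7.5495):
            patient.append((x, y))
    return pd.DataFrame(patient, columns=['Longitude', 'Latitude'])


def k_means_cons(k, minVal, maxVal, data):
    clf = KMeansConstrained(
        n_clusters=k,
        size_min=minVal,
        size_max=maxVal,
        random_state=0,
        max_iter=300)
    clf.fit(data)
    # Labels come from a separate predict call on the same data.
    Label = clf.predict(data)
    return Label


data0 = generatedb(38)
# Min-max normalisation, as described earlier in the thread.
normalized = preprocessing.MinMaxScaler(feature_range=(0, 1)).fit_transform(data0)
Label = k_means_cons(10, 3, 5, normalized)
Label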

array([7, 9, 6, 0, 1, 0, 1, 3, 0, 5, 2, 5, 7, 2, 4, 0, 3, 5, 4, 8, 7, 0,
6, 6, 7, 4, 9, 1, 9, 4, 1, 8, 3, 9, 5, 7, 2, 1, 2, 4, 2, 8, 5, 7])


joshlk commented on May 20, 2024

Hi,

I've determined what the issue is, and it's my fault: the example on the front page of this project is wrong. Thank you for raising the issue.

You need to use the method fit_predict instead of fit followed by predict. This is because predict assigns each point to the nearest centre without obeying the min and max constraints, whereas fit_predict does obey them. You can also access the assigned labels through the labels_ attribute after a fit. As I said, the front page of this project uses fit and then predict, so I did not communicate this properly.
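To illustrate why nearest-centre assignment can break a size limit, here is a minimal sketch in plain Python. It is not the library's actual code: `nearest_centre_labels` and the toy data are hypothetical, made up only to show the effect.

```python
import math
from collections import Counter


def nearest_centre_labels(points, centres):
    """Assign each point to its closest centre, ignoring any size limits."""
    return [min(range(len(centres)), key=lambda c: math.dist(p, centres[c]))
            for p in points]


# Hypothetical toy data: five points crowd around the first centre.
centres = [(0.0, 0.0), (1.0, 1.0)]
points = [(0.0, 0.1), (0.1, 0.0), (0.1, 0.1),
          (0.2, 0.1), (0.1, 0.2), (0.9, 1.0)]

labels = nearest_centre_labels(points, centres)
sizes = Counter(labels)
print(labels)       # [0, 0, 0, 0, 0, 1]
print(dict(sizes))  # {0: 5, 1: 1}
```

With, say, size_max = 3, cluster 0 would already be over the limit, because nothing in this assignment step looks at how full a cluster is.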

As it stands, I would say the predict function does not meet expectations, so I have changed it in the latest version so that it does obey the min and max constraints. If you update to the latest version (v0.5.0), which is on PyPI, it should now work.
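For anyone staying on an older version, the workaround described above could look roughly like this. It is a sketch that reuses the function name and parameters from the earlier posts and assumes the k-means-constrained package is installed; the import sits inside the function only so that the snippet can be loaded without the package until it is actually called.

```python
def k_means_cons(k, min_val, max_val, data):
    """Cluster `data` into k groups whose sizes respect the given bounds."""
    # Assumption: k-means-constrained is installed (pip install k-means-constrained).
    from k_means_constrained import KMeansConstrained

    clf = KMeansConstrained(
        n_clusters=k,
        size_min=min_val,
        size_max=max_val,
        random_state=0,
        max_iter=300,
    )
    # fit_predict returns the constrained assignment made during fitting,
    # rather than a separate nearest-centre prediction.
    return clf.fit_predict(data)
```

Calling `k_means_cons(10, 3, 5, normalized)` on the normalized data from this thread should then return labels whose cluster sizes stay within the bounds.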

Thanks again for reporting this,
Josh
