Comments (18)

talgalili avatar talgalili commented on June 8, 2024

Hey @geniusjenny

Thanks for the bug report!

Could you please try running the code from the rake tutorial:
https://import-balance.org/docs/tutorials/quickstart_rake/
and see whether you can reproduce its results?

What would help me is a fully self-contained reproducible example that I could run in my env to reproduce the error - that would allow me to more easily iterate to get a solution.

Thanks upfront!

from balance.

geniusjenny avatar geniusjenny commented on June 8, 2024

Thanks for the replies!
For the sample code it runs smoothly with no error.
image (7)

talgalili avatar talgalili commented on June 8, 2024

Thanks for checking @geniusjenny
Any way you could play around and try to find a way to reproduce the issue?
I suggest you look at the output of
sample.df.info()
and check the data types; the hint might be there.
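As a rough illustration of the dtype check suggested above, the sketch below puts the column dtypes of two frames side by side and keeps only the columns that differ. The frames `sample_df` and `target_df` and their contents are hypothetical stand-ins, not the reporter's data; with the balance library one would presumably pass `sample.df` and `target.df` instead.

```python
import pandas as pd

# Hypothetical toy frames standing in for the real sample and target data.
sample_df = pd.DataFrame({
    "id": ["1", "2", "3"],
    "income": [10.0, 20.0, 30.0],
    "age_group": ["18-24", "25-34", "35-44"],
})
target_df = pd.DataFrame({
    "id": ["4", "5", "6"],
    "income": [15, 25, 35],
    "age_group": ["18-24", "25-34", "25-34"],
})

# Put both dtype listings next to each other, one row per column.
dtypes = pd.DataFrame({
    "sample": sample_df.dtypes.astype(str),
    "target": target_df.dtypes.astype(str),
})

# Keep only the columns whose dtype differs between the two frames.
mismatched = dtypes[dtypes["sample"] != dtypes["target"]]
print(mismatched)
```

Here only `income` is listed, because it is a float column in one frame and an integer column in the other.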

Once you find a way to reproduce the issue, I'll be able to work on it.
WDYT?

geniusjenny avatar geniusjenny commented on June 8, 2024

Hi talgalili, I tried to reproduce the issue with the tutorial data but couldn't. I tried using two numerical features ['income', 'happiness'], similar to what I have in my dataset, and the code runs smoothly.
I have attached my sample data here so you can reproduce the issue. Sorry that I couldn't be more helpful.

Thank you so much.
sample_test2.csv
target_test2.csv
code:

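import pandas as pd
from balance import Sample

# Load the attached sample and target data; the first CSV column becomes the index.
s2 = pd.read_csv('sample_test2.csv', index_col=0)
t2 = pd.read_csv('target_test2.csv', index_col=0)
sample = Sample.from_frame(s2)
target = Sample.from_frame(t2)
sample_with_target = sample.set_target(target)
# Raking is the method that raises the error.
adjusted_ads_weight1 = sample_with_target.adjust(method="rake")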

talgalili avatar talgalili commented on June 8, 2024

Thanks @geniusjenny

Just to double-check, could you please paste the full output from running the code above?
Please also include the output of:
sample.df.info()
target.df.info()

Thanks!

geniusjenny avatar geniusjenny commented on June 8, 2024

Sure!
Full output:
image (8)
image (4)
image (5)

df.info:
image (9)

talgalili avatar talgalili commented on June 8, 2024

geniusjenny avatar geniusjenny commented on June 8, 2024

Hi talgalili,
I just tried binning the numerical variables into categorical variables, but the code still returns the same error, while method='cbps' and method='ipw' run smoothly.
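For readers unfamiliar with the binning step, here is a minimal sketch using `pd.cut`. The bin edges, the labels, and the frames `s` and `t` are assumptions made up for illustration; the thread does not show how the reporter actually binned the data.

```python
import pandas as pd

# Hypothetical numeric data standing in for the sample and target frames.
s = pd.DataFrame({"income": [12.0, 48.0, 95.0]})
t = pd.DataFrame({"income": [10.0, 20.0, 45.0]})

# Use the same explicit edges and labels for both frames so the bins line up.
edges = [0, 25, 50, 100]
labels = ["low", "mid", "high"]
s["income_bin"] = pd.cut(s["income"], bins=edges, labels=labels)
t["income_bin"] = pd.cut(t["income"], bins=edges, labels=labels)

# Count how many rows fall into each bin, per frame.
print(s["income_bin"].value_counts(sort=False))
print(t["income_bin"].value_counts(sort=False))
```

Passing shared edges, rather than letting each frame derive its own quantiles, keeps the bin labels comparable between sample and target.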

Here are the code and df.info:
image (10)
ERROR:
image (11)

talgalili avatar talgalili commented on June 8, 2024

Thanks @geniusjenny
Interesting!
Could you please change the type of the bucketed variables from 'categorical' to 'object', and let me know if this resolves the error you get?
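One way to make that conversion in pandas is sketched below. The column name `income_bin` and the toy frame are hypothetical; the idea is only to select every `category` column and cast it to `object`.

```python
import pandas as pd

# Hypothetical frame with one bucketed (categorical) column and one numeric column.
df = pd.DataFrame({
    "income_bin": pd.Categorical(["low", "mid", "low"]),
    "happiness": [3.5, 4.0, 2.5],
})

# Find all columns stored with the pandas 'category' dtype.
cat_cols = df.select_dtypes(include="category").columns

# Cast just those columns to plain Python objects, leaving the rest untouched.
df = df.astype({col: object for col in cat_cols})
print(df.dtypes)
```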

geniusjenny avatar geniusjenny commented on June 8, 2024

I also tried that. Still getting the same error.

image

geniusjenny avatar geniusjenny commented on June 8, 2024

I think I may have found the issue.
Some of the bins that appear in the sample never appear in the target, which causes this error.
Once I add the sample to the target, the bug disappears.
I suggest the code take this edge case into consideration as well!

t2=pd.concat([s2,t2])
t2.reset_index(inplace=True)
t2['id']=t2.index.astype('str')
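Before resorting to the workaround above, it may help to see which levels are affected. The following is a minimal sketch, not part of the balance library: the helper `levels_missing_from_target` and the toy frames are hypothetical, and it simply compares the sets of values per column.

```python
import pandas as pd

def levels_missing_from_target(sample_df, target_df, columns):
    """Return {column: sorted levels present in sample_df but absent from target_df}."""
    missing = {}
    for col in columns:
        sample_levels = set(sample_df[col].dropna().astype(str))
        target_levels = set(target_df[col].dropna().astype(str))
        diff = sample_levels - target_levels
        if diff:
            missing[col] = sorted(diff)
    return missing

# Hypothetical binned data: the 'high' income bin occurs only in the sample.
s = pd.DataFrame({"income_bin": ["low", "mid", "high"],
                  "happiness_bin": ["a", "b", "a"]})
t = pd.DataFrame({"income_bin": ["low", "mid", "mid"],
                  "happiness_bin": ["a", "b", "b"]})

missing = levels_missing_from_target(s, t, ["income_bin", "happiness_bin"])
print(missing)  # {'income_bin': ['high']}
```

An empty dictionary would mean every level seen in the sample also occurs in the target for the listed columns.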
image

talgalili avatar talgalili commented on June 8, 2024

Great catch - thanks a bunch @geniusjenny !

O.k., I'll leave this issue open, and we'll add a proper exception in the future.
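Until such an exception exists in the library, a user-side guard could look roughly like the sketch below. It is an assumption about what a useful check might be, not the library's implementation: the function name `check_target_covers_sample` and the error message are hypothetical. It looks up the target's share for each level seen in the sample and raises if any share is zero.

```python
import pandas as pd

def check_target_covers_sample(sample_df, target_df, columns):
    """Raise ValueError if a level seen in sample_df has zero share in target_df."""
    for col in columns:
        # Share of each level in the target.
        target_share = target_df[col].value_counts(normalize=True)
        # Look up that share for every level seen in the sample (0 if absent).
        sample_levels = list(sample_df[col].dropna().unique())
        share = target_share.reindex(sample_levels, fill_value=0.0)
        empty = share[share == 0].index.tolist()
        if empty:
            raise ValueError(
                f"Column {col!r}: levels {empty} appear in the sample "
                "but not in the target."
            )

# Hypothetical data: the 'high' bin is missing from the target.
s = pd.DataFrame({"income_bin": ["low", "mid", "high"]})
t = pd.DataFrame({"income_bin": ["low", "mid", "mid"]})

try:
    check_target_covers_sample(s, t, ["income_bin"])
    message = None
except ValueError as err:
    message = str(err)
print(message)
```

Calling the guard before `adjust(method="rake")` would turn an opaque failure into a message that names the offending column and levels.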

Thanks again.

geniusjenny avatar geniusjenny commented on June 8, 2024

Thank you!
