haowei01 / pytorch-examples

Train models in PyTorch: Learning to Rank, Collaborative Filtering, Heterogeneous Treatment Effects, Uplift Modeling, etc.

Python 99.40% Shell 0.60%
learning-to-rank lambdarank pytorch-implementation pytorch-ranking ranknet ndcg inverse-propensity-score positional-bias uplift-modeling heterogeneous-treatment-effects

pytorch-examples's Issues

Relevance sort direction

Hi! Thanks for this excellent repo, it's very informative.

I've noticed that in your implementation of LambdaRank (specifically, on line 178) you sort the rank data frame by relevance in ascending order (the default for pandas' sort_values function).
Every other implementation of LambdaRank I've looked at (allRank, tensorflow-LTR) seems to order relevance levels in descending order, and I'd have to agree with them, since you want higher-relevance items towards the front of the array.

Sorry if I've misunderstood. Thanks.
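
To make the question concrete, here is a minimal sketch of the two sort directions in pandas. The frame and its column names are hypothetical and are not taken from the repo; the sketch only shows what the default does versus an explicit descending sort.

```python
import pandas as pd

# Hypothetical ranking frame; "doc" and "relevance" are assumed column names.
rank_df = pd.DataFrame({
    "doc": ["a", "b", "c", "d"],
    "relevance": [1, 3, 0, 2],
})

# Default is ascending=True, so the least relevant document comes first.
ascending = rank_df.sort_values("relevance")

# Explicit descending order puts the most relevant document first.
descending = rank_df.sort_values("relevance", ascending=False)

print(ascending["doc"].tolist())   # ['c', 'a', 'd', 'b']
print(descending["doc"].tolist())  # ['b', 'd', 'a', 'c']
```

Whether ascending order is actually a bug depends on how the sorted frame is consumed afterwards, so this only illustrates the behavior the issue describes.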

Data Preprocessing might add more flavour!

Thank you for publishing this great repo; I am definitely learning a lot from it!
One thing I noticed while investigating the dataset (Personalize Expedia Hotel Searches - ICDM 2013) is that some columns contain a large number of null values, so removing those columns or imputing the missing values might improve the results.

Anyway, I will try myself as well!
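
A rough sketch of what that preprocessing could look like in pandas follows. The toy frame, its column names, the 0.5 threshold, and the choice of median imputation are all assumptions made for illustration; none of them come from the repo or from the Expedia data.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the real data; column names are hypothetical.
df = pd.DataFrame({
    "mostly_null": [np.nan, np.nan, np.nan, 4.0],
    "some_null": [1.0, np.nan, 3.0, 5.0],
    "complete": [10, 20, 30, 40],
})

# Fraction of missing values in each column.
null_frac = df.isnull().mean()

# Assumed threshold: drop columns that are more than half empty.
MAX_NULL_FRAC = 0.5
kept = df.loc[:, null_frac <= MAX_NULL_FRAC]

# Fill the remaining gaps with each column's median (one possible choice).
imputed = kept.fillna(kept.median())

print(null_frac.to_dict())
print(imputed)
```

Whether dropping or imputing helps would need to be checked against a validation metric such as NDCG, since a missing value can itself carry signal.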

Code

import os

import pandas as pd

# DATA_DIR, RANDOM_COL and get_time are defined elsewhere in the module.
# This method belongs to the data-loading class.
def __init__(self):
    cur_file = os.path.abspath(__file__)
    self.data_dir = os.path.join(os.path.dirname(cur_file), DATA_DIR)
    print('{} loading data from DATA dir {}'.format(get_time(), self.data_dir))

    # Cache the parsed training CSV as a pickle so later runs load faster.
    pkl_file = os.path.join(self.data_dir, 'train.pkl')
    if os.path.isfile(pkl_file):
        train_df = pd.read_pickle(pkl_file)
    else:
        train_df = pd.read_csv(os.path.join(self.data_dir, 'train.zip'))
        train_df.to_pickle(pkl_file)

    # Split rows by RANDOM_COL: 1 is treated as unbiased, 0 as biased clicks.
    self.random_df = train_df[train_df[RANDOM_COL] == 1]
    self.biased_click_df = train_df[train_df[RANDOM_COL] == 0]
    print('{} unbiased training data rows {}'.format(get_time(), self.random_df.shape[0]))
    print('{} biased training data rows {}'.format(get_time(), self.biased_click_df.shape[0]))

    # Same pickle cache for the test set.
    test_pkl_file = os.path.join(self.data_dir, 'test.pkl')
    if os.path.isfile(test_pkl_file):
        self.test_df = pd.read_pickle(test_pkl_file)
    else:
        self.test_df = pd.read_csv(os.path.join(self.data_dir, 'test.zip'))
        self.test_df.to_pickle(test_pkl_file)
    print('{} test file size {}'.format(get_time(), self.test_df.shape[0]))

Data Analysis on the dataset

Something Wrong with decay_diff ?

Hi, thank you for providing such excellent work! I have some questions about the calculation of the "decay_diff" term (LambdaRank.py, line 197).

I noticed that you are using "sort_order" as the discount factor.
sort_order holds the indices that would sort Y, which means sort_order[0] is the index of the document that is most relevant to the query. But this is not consistent with the gain_diff calculation!
gain_diff(i, j) is the gain difference between doc i and doc j, but decay_diff is not the decay difference between doc i and doc j.
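
The distinction the issue draws can be sketched with NumPy. This is not the repo's code: the scores are made up, and the variable names only mirror the ones mentioned above. It shows that the indices that sort an array differ from each document's rank position, and how a per-document discount would line up with a per-document gain.

```python
import numpy as np

# Hypothetical scores for four documents of one query.
scores = np.array([0.2, 0.9, 0.5, 0.1])

# sort_order[k] is the index of the document placed at position k.
sort_order = np.argsort(-scores)                  # [1, 2, 0, 3]

# rank[i] is the 1-based position of document i (the inverse permutation).
rank = np.empty_like(sort_order)
rank[sort_order] = np.arange(1, len(scores) + 1)  # [3, 1, 2, 4]

# Position discount per document, indexed by document just like the gains.
decay = 1.0 / np.log2(1.0 + rank)

# decay_diff[i, j] is the discount difference between documents i and j.
decay_diff = decay[:, None] - decay[None, :]

print(sort_order.tolist(), rank.tolist())
```

Under this reading, plugging sort_order directly into the discount would pair document i with the wrong position, whereas rank keeps the discount indexed the same way as gain_diff.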
