
bemb's People

Contributors

kanodiaayush, tianyudu


bemb's Issues

bug fix, `reshape_observable` method in `bemb.py`

The reshape_observable method does not correctly recognize item-session-specific variables when they are named with the prefix itemsession_; the prefix price_ works.

The method does not recognize user-item-specific or user-item-session-specific observable names either.
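The intended behavior can be sketched as a prefix-to-variation lookup. This is a hypothetical illustration of the dispatch logic (the names `PREFIX_TO_VARIATION` and `variation_of` are not part of `bemb.py`); the point of the fix is that every documented prefix, including the aliases, maps to a variation type:

```python
# Hypothetical sketch of prefix-based dispatch for observable names; the
# actual reshape_observable in bemb.py may differ.
PREFIX_TO_VARIATION = {
    'user_': 'user',
    'item_': 'item',
    'session_': 'session',
    'price_': 'itemsession',        # legacy alias, currently recognized
    'itemsession_': 'itemsession',  # should be recognized as well
    'useritem_': 'useritem',
    'useritemsession_': 'useritemsession',
}

def variation_of(name: str) -> str:
    """Return the variation type encoded in an observable's name prefix."""
    for prefix, variation in PREFIX_TO_VARIATION.items():
        if name.startswith(prefix):
            return variation
    raise ValueError(f'unknown observable name: {name}')
```

With a table like this, `itemsession_obs` and `price_obs` resolve to the same variation, and the user-item / user-item-session prefixes are covered too.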

Improvements on H_zero_mask Option in obs2prior

H_zero_mask Option in obs2prior

Link to notebook (use branch "mask-H-obs2prior")

This tutorial explains how to impose some structure on hierarchical priors. Specifically, it shows how to fix some entries of the H and W matrices to 0 so that they are not learned. Here are some comments:

  • I think this is a great extension of the obs2prior tutorial. I will use it in the Learning Tools project and, once that is done, we could add a link to the Learning Tools notebook to show a practical example.
  • We could give some context regarding when imposing structure on hierarchical priors is relevant.
    • If I understood correctly, this is mostly relevant when the researcher fears there is not enough data for a flexible hierarchical prior and has some intuition regarding which dim - to be confirmed
  • We could also give some hints regarding how to choose the structure of the hierarchy.
  • Related to the two points above, we could adapt the numerical example to highlight the benefits of imposing a structure.
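The mechanics of the option can be sketched in plain Python (the package itself works with PyTorch tensors, and the actual implementation lives on the "mask-H-obs2prior" branch): `H_zero_mask` marks the entries of H that are fixed to 0 and should never be learned, and the mask is applied before H is used.

```python
# Illustrative sketch only: H is a (num_latents x num_obs) matrix mapping
# observables to prior means; masked entries are forced to zero.
H = [[0.5, -1.2, 0.3],
     [0.8, 0.1, -0.4]]
H_zero_mask = [[False, True, False],
               [False, False, True]]

def apply_mask(H, mask):
    """Zero out every entry of H where the mask is True."""
    return [[0.0 if m else h for h, m in zip(row, mrow)]
            for row, mrow in zip(H, mask)]
```

In the tensor version, multiplying H elementwise by the negated mask on every forward pass also zeroes the gradients of the masked entries, so they stay fixed at 0 during training.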

Originally posted by @charlespebereau in gsbDBI/torch-choice#5 (comment)

Utility formula: we need to be more specific about the model and review the math notation (which is currently incorrect).

  • Utility formula: we need to be more specific about the model and review the math notation (which is currently incorrect).
    • For details, see page 376 of Athey et al. (2021).
    • The model assumes unit demand for each category, independent choices across categories, and an error term distributed according to the Gumbel distribution (logit).
    • There needs to be a discussion of how the outside option is modelled. How does the model decide that the user buys nothing from a given category? Can we change the value of the outside option in each category, or is it normalized to 0 for each category?
    • Regarding notation: (i) we need to index the variables by _{uis} and (ii) decompose the utility into a deterministic part and the error term: $\mathcal{U}_{uis} = U_{uis} + \varepsilon_{uis}$.
      Then $P(i|u,s)$ is a function of $U_{uis}$ instead of $\mathcal{U}_{uis}$.
    • I suggest we write the utility function that the package can accommodate in its most general form (i.e., the sum of all the terms that can be included) and then discuss each term one by one.
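The decomposition above implies the familiar logit form: with i.i.d. Gumbel errors, $P(i|u,s)$ is the softmax of the deterministic utilities $U_{uis}$ over the available items. A minimal numerical sketch in plain Python (not package code):

```python
import math

def choice_probabilities(U):
    """Logit choice probabilities: softmax of deterministic utilities U_{uis}.

    With U = deterministic utility and a Gumbel error term, P(i|u,s)
    depends only on the deterministic parts.
    """
    m = max(U)                           # subtract the max for numerical stability
    expU = [math.exp(u - m) for u in U]
    Z = sum(expU)
    return [e / Z for e in expU]
```

For example, two items with equal deterministic utility each get probability 0.5, regardless of the (unobserved) error realizations.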

Refreshing the API

Towards BEMB v1.0.

Corresponding branch: api-update

We are planning to refine and expand the current API of BEMBFlex.

pred_item and multi-class prediction.

  • Note: this is a non-trivial extension. For each observation (a consumer picks an item), we compute a scalar utility for every available item, but there is only a single scalar utility computed from the chosen item (item_index[i]). This only allows us to do binary classification, so we might have to drop this feature.
  • Currently the model supports predicting a binary batch.label or a multi-class batch.item_index. We plan to support arbitrary multi-class classification.
    • In particular, you don't need to change anything if pred_item=True: the model knows the number of classes is exactly the num_items parameter. In this case, your ChoiceDataset object does not need a label attribute, since the model uses item_index as the ground truth for training.
    • In contrast, if pred_item=False, you need to supply num_classes to the BEMBFlex.__init__() method and a label attribute in the ChoiceDataset object. The label attribute should be a LongTensor with values in {0, 1, ..., num_classes - 1}.
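The two modes can be summarized in a small sketch. `resolve_targets` is a hypothetical helper written for illustration, not part of BEMBFlex; it mirrors the planned rule for which attribute serves as the ground truth and how many classes there are:

```python
# Hypothetical sketch of the planned pred_item semantics.
def resolve_targets(pred_item, num_items, num_classes=None,
                    item_index=None, label=None):
    """Return (targets, number_of_classes) under the planned API rules."""
    if pred_item:
        # item_index is the ground truth; classes = items.
        assert item_index is not None, 'pred_item=True requires item_index'
        return item_index, num_items
    # pred_item=False: an explicit label and num_classes are required.
    assert num_classes is not None, 'pred_item=False requires num_classes'
    assert label is not None, 'pred_item=False requires a label attribute'
    assert all(0 <= y < num_classes for y in label), 'labels out of range'
    return label, num_classes
```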

Post-Estimation

  • Thanks to feedback from our valued users, we are planning to reorganize our post-estimation prediction methods for a better user experience.
    • We will implement a method called predict_proba(), sharing its name with the inference method of scikit-learn models.
    • This method will have the @torch.no_grad() decorator, so you can use it however you want without worrying about gradient tracking.
    • With pred_item = True, the batch needs an item_index attribute only if it's involved in the utility computation (e.g., within-category computation).
    • With pred_item = False, the batch does not need to have a label attribute.
    • The preliminary API of predict_proba() is used as follows:
batch = ChoiceDataset(...)
bemb = BEMBFlex(..., pred_item=True, ...)
proba = bemb.predict_proba(batch)  # shape = (len(batch), num_items)

batch = ChoiceDataset(...)
# note that batch doesn't need to have a label attribute.
bemb = BEMBFlex(..., pred_item=False, num_classes=..., ...)
proba = bemb.predict_proba(batch)  # shape = (len(batch), num_classes)

Renaming Variables.

  • We received feedback that the naming of price-variation is ambiguous; we propose to change it to itemsession-variation instead, since that is precisely the definition of such variables.

A typo in LitBEMBFlex __init__

I am creating this issue on behalf of @kasince2k.

Original post

Hi - it looks like in line 35 of bemb_flex_lightning.py file, there's a small typo (num_needs instead of num_seeds).

Refer to the link here for the file.

Improvements on Tutorial for BEMB with Simulated Data and the obs2prior Option

Tutorial for BEMB with Simulated Data and the obs2prior Option

In this post I review the notebook on obs2prior. Note that I reviewed the H_zero_mask option, which is an extension of this notebook, in the post above. Here are some comments:

  • We could add some context regarding when using observables helps and when it doesn't.
  • Relatedly, it would be useful to add a summary of what the tutorial will do before we start the simulations: what are we trying to predict, what do we know about the underlying preferences, what variables do we observe, why we expect obs2prior to help in this context, etc.
  • "The observable of a particular user is a one-hot vector with width num_items and one on the position of the item this user loves (as mentioned previously)."
    • I thought that this was precisely what we don't observe and try to recover.
    • Tianyu's comment: it is a trivial example to show how to implement the model and to show that the model understands that this variable is very important.
  • For internal purposes: often, the user observables relate to demographics (age, gender, income, etc.). Why didn't we choose to simulate this type of situation?
    • e.g., a linear relationship between age and which item the buyer loves.
  • "Fitting the Model"
    • It would be useful to talk more about what the package does and what input is necessary. For instance, do we need to set a prior for H and W, does the package set them by default, and does it try different ones and select the best one?

Originally posted by @charlespebereau in gsbDBI/torch-choice#5 (comment)

Improvements on BEMB Tutorial

BEMB Tutorial - link

There is a lot of information on this page and I'm not sure what the best way to present it is - it depends on what the reader is looking for. I'll make some suggestions but will mostly highlight what information I didn't find or understand. Hopefully this will give you ideas and we can also discuss this at some point.

  • I'm not sure we need an example in the introductory paragraph. Also, the one provided is both more and less general than what the package does: the cdf F is more general than the Gumbel distribution, but theta*alpha is less general than the utility functions the package can accommodate.

Response: I changed it to a more general form by saying "the model predicts the probability for user $u$ to purchase item $i$ as an increasing function of $U_{ui}$. Our package supports more general forms of utility $U_{ui}$ than the inner product of two latent vectors."

  • Utility formula: you could be more specific about what utility representations the package allows for. I think (maybe I'm wrong) that
    • you only have logistic models (the noise epsilon is constrained to follow a Gumbel distribution);
    • the utility function is additively separable in the observables and allows for interactions between latents. It should be stressed that this is actually a very general form because (i) one can always build sophisticated observables, for instance by taking the log or a polynomial transformation of original observables, and (ii) one can impose that the learnable coefficients depend on i, u, s, or any combination of them.

Response: I have added this to our documentation.
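Point (i) above can be made concrete with a tiny sketch. `expand_observable` is a hypothetical helper (not part of the package) showing how log and polynomial transformations of an original observable keep the linear-in-parameters utility very general:

```python
import math

def expand_observable(x):
    """Build transformed observables from a raw scalar observable x > 0.

    A utility that is linear in [x, log(x), x**2] can capture nonlinear
    effects of x while remaining additively separable in the observables.
    """
    return [x, math.log(x), x ** 2]
```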

  • Utility formula: we need to be more specific about the model and review the math notation (which is currently incorrect).
    • For details, see page 376 of Athey et al. (2021).
    • The model assumes unit demand for each category, independent choices across categories, and an error term distributed according to the Gumbel distribution (logit).
    • There needs to be a discussion of how the outside option is modelled. How does the model decide that the user buys nothing from a given category? Can we change the value of the outside option in each category, or is it normalized to 0 for each category?
    • Regarding notation: (i) we need to index the variables by _{uis} and (ii) decompose the utility into a deterministic part and the error term: $\mathcal{U}_{uis} = U_{uis} + \varepsilon_{uis}$.
      Then $P(i|u,s)$ is a function of $U_{uis}$ instead of $\mathcal{U}_{uis}$.
    • I suggest we write the utility function that the package can accommodate in its most general form (i.e., the sum of all the terms that can be included) and then discuss each term one by one.
  • Subsection "Specifying the Dimensions of Coefficients with the coef_dim_dict dictionary".
    • I didn't understand what point 4 refers to. I am not sure it was specified in the "Utility formula" subsection that there can be matrix-factorization coefficients for observables.

Response: I have added substantive materials explaining the fourth possibility here.

  • There is a section "Specifying Variance of Coefficient Prior Distributions with prior_variance", but I think there is no section about setting the mean of the coefficients.

Response: unless obs2prior is turned on, we set the prior distributions to have zero mean.

  • Regarding obs2prior:
    • there should be a link to the dedicated tutorial in this subsection

    Response: added.

    • it is not clear whether with obs2prior we impose that the variance is the identity matrix or if we can change that.

    Response: with obs2prior, the prior variance is whatever value is imposed by the prior_variance above. I made this clearer in the documentation by specifying that "the prior_variance term controls the variance of the prior distribution and the obs2prior term controls the expectation of the prior distribution."

    • it is not clear whether we can impose some form to H or not

    Response: yes, we've added support for H_zero_mask, which allows the user to force some entries of $H$ to be zero. I have added a link to this tutorial in the documentation.

  • "If category_to_item is not provided (or None is provided), the probability of purchasing item i by user u in session s is ..."
    • Maybe we could slightly rephrase to say that by default there is only one category containing all the items, but the package can impose subcategories. In any case, the model is unit demand per category: the user buys at most one item per category.

    Response: I have updated this by first writing down $P(i|u,s)$ without category_to_item specified and then showing the user the possibility of normalizing across items in the same category only.
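The within-category normalization discussed above can be sketched as follows. This is an illustrative plain-Python version (the function name and data layout are assumptions, not the package API): probabilities are normalized over the items in one category only, matching unit demand per category.

```python
import math

def within_category_proba(U, category_to_item, category):
    """Softmax of deterministic utilities over one category's items only.

    U: dict mapping item id -> deterministic utility.
    category_to_item: dict mapping category -> list of item ids.
    """
    items = category_to_item[category]
    expU = {i: math.exp(U[i]) for i in items}
    Z = sum(expU.values())
    return {i: e / Z for i, e in expU.items()}
```

Without a category_to_item mapping, the same computation runs over all items at once, which is the single-category default described above.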

Originally posted by @charlespebereau in gsbDBI/torch-choice#5 (comment)

Issue with `pytorch-lightning` version 1.7.5 and `torchmetrics` version 0.9.3 on Sherlock

I am reporting the issue identified by Rachel @rachelyangzhou; please see below for the error message.
The package works with pytorch-lightning version 1.4.5.

Issue 1

>>> import bemb
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/users/zhoury/.local/lib/python3.9/site-packages/bemb/__init__.py", line 2, in <module>
    import bemb.model
  File "/home/users/zhoury/.local/lib/python3.9/site-packages/bemb/model/__init__.py", line 2, in <module>
    from .bemb_flex_lightning import *
  File "/home/users/zhoury/.local/lib/python3.9/site-packages/bemb/model/bemb_flex_lightning.py", line 12, in <module>
    import pytorch_lightning as pl
  File "/home/users/zhoury/.local/lib/python3.9/site-packages/pytorch_lightning/__init__.py", line 34, in <module>
    from pytorch_lightning.callbacks import Callback  # noqa: E402
  File "/home/users/zhoury/.local/lib/python3.9/site-packages/pytorch_lightning/callbacks/__init__.py", line 26, in <module>
    from pytorch_lightning.callbacks.pruning import ModelPruning
  File "/home/users/zhoury/.local/lib/python3.9/site-packages/pytorch_lightning/callbacks/pruning.py", line 30, in <module>
    from pytorch_lightning.core.module import LightningModule
  File "/home/users/zhoury/.local/lib/python3.9/site-packages/pytorch_lightning/core/__init__.py", line 15, in <module>
    from pytorch_lightning.core.datamodule import LightningDataModule
  File "/home/users/zhoury/.local/lib/python3.9/site-packages/pytorch_lightning/core/datamodule.py", line 21, in <module>
    from pytorch_lightning.core.mixins import HyperparametersMixin
  File "/home/users/zhoury/.local/lib/python3.9/site-packages/pytorch_lightning/core/mixins/__init__.py", line 15, in <module>
    from pytorch_lightning.core.mixins.device_dtype_mixin import DeviceDtypeModuleMixin  # noqa: F401
  File "/home/users/zhoury/.local/lib/python3.9/site-packages/pytorch_lightning/core/mixins/device_dtype_mixin.py", line 19, in <module>
    from typing_extensions import Self
ImportError: cannot import name 'Self' from 'typing_extensions' (/share/software/user/open/py-pytorch/1.8.1_py39/lib/python3.9/site-packages/typing_extensions.py)

Issue 2

There is also an issue with the latest torchmetrics version 0.9.3; rolling back to version 0.6.0 solves the issue.

Please see below for the error message.

>>> import bemb
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/users/zhoury/.local/lib/python3.9/site-packages/bemb/__init__.py", line 2, in <module>
    import bemb.model
  File "/home/users/zhoury/.local/lib/python3.9/site-packages/bemb/model/__init__.py", line 2, in <module>
    from .bemb_flex_lightning import *
  File "/home/users/zhoury/.local/lib/python3.9/site-packages/bemb/model/bemb_flex_lightning.py", line 12, in <module>
    import pytorch_lightning as pl
  File "/home/users/zhoury/.local/lib/python3.9/site-packages/pytorch_lightning/__init__.py", line 20, in <module>
    from pytorch_lightning import metrics  # noqa: E402
  File "/home/users/zhoury/.local/lib/python3.9/site-packages/pytorch_lightning/metrics/__init__.py", line 15, in <module>
    from pytorch_lightning.metrics.classification import (  # noqa: F401
  File "/home/users/zhoury/.local/lib/python3.9/site-packages/pytorch_lightning/metrics/classification/__init__.py", line 14, in <module>
    from pytorch_lightning.metrics.classification.accuracy import Accuracy  # noqa: F401
  File "/home/users/zhoury/.local/lib/python3.9/site-packages/pytorch_lightning/metrics/classification/accuracy.py", line 18, in <module>
    from pytorch_lightning.metrics.utils import deprecated_metrics, void
  File "/home/users/zhoury/.local/lib/python3.9/site-packages/pytorch_lightning/metrics/utils.py", line 22, in <module>
    from torchmetrics.utilities.data import get_num_classes as _get_num_classes
ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data' (/home/users/zhoury/.local/lib/python3.9/site-packages/torchmetrics/utilities/data.py)

Improvement of the utility parsing and utility formula format

We are planning two improvements to the utility formula; the corresponding changes were made to the utility-parser helper function in the bemb.py file.

  1. Previously we used price_obs to denote (item, session)-specific observables; we now also allow itemsession_obs for a more intuitive utility formula. The updated utility parser accepts both prefixes simultaneously. For example, writing price_obs * beta_user is the same as writing itemsession_obs * beta_user in the utility_formula.

  2. Coefficients with the same value for all users, items, and sessions are identified by the _constant suffix; the parser now parses all terms without reserved prefixes or suffixes as constant coefficients. This improvement makes the utility formula look more aligned with the mathematical formulation of utility.
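The second rule can be illustrated with a toy classifier over coefficient names. This is a hedged sketch of the parsing rule only (the real parser lives in bemb.py and handles more cases): any coefficient without a reserved suffix defaults to a constant coefficient.

```python
# Toy sketch: reserved suffixes mark coefficient variation; anything else
# is now treated as a constant coefficient.
RESERVED_SUFFIXES = ('_user', '_item', '_session', '_constant')

def coefficient_variation(coef_name):
    """Classify a coefficient name by its suffix, defaulting to 'constant'."""
    for suffix in RESERVED_SUFFIXES:
        if coef_name.endswith(suffix):
            return suffix.lstrip('_')
    return 'constant'  # same value for all users, items, and sessions
```

Under this rule, `alpha` in a utility formula behaves exactly like `alpha_constant`, so formulas can be written closer to their mathematical form.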

@charlespebereau @kanodiaayush

I have created a branch called formula-parser-update and pull request here for this functionality.
