pymc-labs / pymc-marketing

Bayesian marketing toolbox in PyMC. Media Mix (MMM), customer lifetime value (CLV), buy-till-you-die (BTYD) models and more.

Home Page: https://www.pymc-marketing.io/

License: Apache License 2.0

Makefile 0.07% Python 99.87% Dockerfile 0.06%
clv data-science marketing mmm python btyd customer-lifetime-value media-mix-modeling buy-till-you-die

pymc-marketing's People

Contributors

abdalazizrashid, alexandorra, ameynen, cetagostini, cluhmann, coltallen, drbenvincent, ferrine, garve, giuliacaglia, juanitorduz, konkinit, larryshamalama, lucianopaz, maresb, markussagen, michaelraczycki, mustaphau, nialloulton, oriolabril, pre-commit-ci[bot], ricardov94, sangamswadik, takechanman1228, tomicapretto, twiecki, ulfaslak, vincent-grosbois, wd60622, xhulianothe1

pymc-marketing's Issues

Marginalization over dichotomous variable

I know that this is ongoing work raised in issue 21 of AePPL, but is there a quick way to marginalize over a discrete parameter? Akin to the example provided in the issue linked above:

$$p(Y=y | X=0) * p(X=0) + p(Y=y | X=1) * p(X=1)$$

for some continuous $Y$ and dichotomous $X$. In this case, $X$ would be the variable indicating whether a customer has churned, and the difference between a contractual and a non-contractual likelihood would be that one is the marginalized version of the other. I have not revisited the math in several weeks, but I'm fairly certain of this, and the ability to marginalize likelihoods this way may provide better model building blocks than defining a distribution class for each quadrant. Just an idea so far...
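As a rough illustration of the sum above, here is a minimal pure-Python sketch that marginalizes a binary indicator on the log scale. It is not AePPL or PyMC code: the helper names (`normal_logpdf`, `logsumexp2`, `marginal_logp`) and the two Normal conditionals are assumptions chosen only to make the example concrete.

```python
import math


def normal_logpdf(y, mu, sigma):
    """Log-density of a Normal(mu, sigma) evaluated at y."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def logsumexp2(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def marginal_logp(y, p_x1, logp_given_x0, logp_given_x1):
    """log[ p(y | X=0) p(X=0) + p(y | X=1) p(X=1) ], with 0 < p_x1 < 1."""
    return logsumexp2(
        math.log1p(-p_x1) + logp_given_x0(y),
        math.log(p_x1) + logp_given_x1(y),
    )


# Hypothetical example: X=1 shifts the mean of Y from 0 to 3.
logp = marginal_logp(
    1.0,
    p_x1=0.3,
    logp_given_x0=lambda y: normal_logpdf(y, 0.0, 1.0),
    logp_given_x1=lambda y: normal_logpdf(y, 3.0, 1.0),
)
```

Working on the log scale keeps the sum stable when one of the two branches has a very small density.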

Add `setup.py`

Add `setup.py` so that we can install the Python package via `python -m pip install -e .`

MMM `adstock_max_lag` and `control_data` are model specific

If we intend our MMM class to be a suitable generic base for many media mix models, the arguments to `__init__` should be as generic as possible. `adstock_max_lag` is specific to one particular way of applying a convolution operation on a vector, and `control_data` might be handled differently by different MMM subclasses (e.g. one could handle continuous controls differently from categorical controls). We should remove as many model-specific arguments as possible from the `__init__` signature, and leave the rest as kwargs that are forwarded into `_build_model`.
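One way this could look is sketched below. The class names `BaseMMM` and `GeometricAdstockMMM` and the arguments `target_column` and `channel_columns` are hypothetical, and `_build_model` returns a plain dict instead of a PyMC model so that the sketch stays self-contained.

```python
class BaseMMM:
    """Hypothetical generic base: only model-agnostic arguments are named."""

    def __init__(self, data, target_column, date_column, channel_columns, **model_kwargs):
        self.data = data
        self.target_column = target_column
        self.date_column = date_column
        self.channel_columns = list(channel_columns)
        # Everything model-specific is forwarded untouched to the subclass.
        self.model = self._build_model(**model_kwargs)

    def _build_model(self, **model_kwargs):
        raise NotImplementedError


class GeometricAdstockMMM(BaseMMM):
    """Hypothetical subclass that owns its model-specific arguments."""

    def _build_model(self, adstock_max_lag=4, control_data=None):
        # A real subclass would build a PyMC model here; this sketch only
        # records the configuration it received.
        return {"adstock_max_lag": adstock_max_lag, "control_data": control_data}


mmm = GeometricAdstockMMM(
    data=[],
    target_column="y",
    date_column="date",
    channel_columns=["tv", "radio"],
    adstock_max_lag=8,
)
```

Because each subclass declares its own keyword arguments in `_build_model`, a misspelled or unsupported argument still fails with a `TypeError` rather than being silently ignored.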

add black[jupyter] to the environment

Now that we have notebooks, it would be useful to add `black[jupyter]` to the environment. I don't know how the commit hooks are set up, but it may be worth adding there too.

Add ROAS plot

Add a method to compute the return on ad spend (ROAS) for selected channels and plot it. The plot could follow the style of figure 3 from Jin et al. (2017), which I'll copy down here just as a reference.

[image: figure 3 from Jin et al. (2017)]
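A minimal sketch of the computation behind such a plot, assuming per-draw channel contributions are already available from the fitted model. The function names and the numbers are made up for illustration; ROAS is taken here as attributed sales divided by spend.

```python
from statistics import mean


def roas_draws(contribution_draws, total_spend):
    """ROAS per posterior draw: attributed sales divided by total spend."""
    if total_spend <= 0:
        raise ValueError("total_spend must be positive")
    return [c / total_spend for c in contribution_draws]


def nearest_rank(draws, q):
    """Crude quantile: the sorted draw closest to position q in [0, 1]."""
    ordered = sorted(draws)
    return ordered[round(q * (len(ordered) - 1))]


# Made-up posterior draws of the total sales attributed to one channel.
tv_contribution = [1200.0, 1350.0, 1100.0, 1280.0, 1420.0, 1190.0]
tv_roas = roas_draws(tv_contribution, total_spend=500.0)
summary = {
    "mean": mean(tv_roas),
    "low": nearest_rank(tv_roas, 0.05),
    "high": nearest_rank(tv_roas, 0.95),
}
```

The per-channel `summary` values are what a point-and-interval plot would display, one row per channel.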

Add study case with MMM + CLV

Create a synthetic dataset with a continuous non-contractual process and build a story around:

  1. Using MMM to infer the cost of acquisition (CAC) across different channels
  2. Using CLV to infer the differential lifetime value of customers coming from different channels
  3. Making a business decision that takes the two sources of information into account
    a. Preferring to invest in a channel with higher CAC because of higher CLV
    b. Binary decision to not invest further in a channel if CLV is lower than CAC
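The decision step in item 3 could be sketched as follows, using point estimates only. The function name `channel_decisions` and all numbers are made up; a real case study would propagate posterior uncertainty from both models.

```python
def channel_decisions(cac, clv):
    """Rank channels by expected net value (CLV - CAC) and flag unprofitable ones."""
    rows = []
    for channel in cac:
        net = clv[channel] - cac[channel]
        rows.append({"channel": channel, "net_value": net, "invest": net > 0})
    return sorted(rows, key=lambda row: row["net_value"], reverse=True)


# Made-up estimates: "search" costs more per customer than "social",
# but its customers are worth much more.
cac = {"search": 40.0, "social": 25.0, "tv": 90.0}
clv = {"search": 120.0, "social": 20.0, "tv": 150.0}
decisions = channel_decisions(cac, clv)
```

Here `search` ranks first despite its higher CAC (case a), while `social` is flagged as not worth further investment because its CLV is below its CAC (case b).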

Requires: #24, #19, #39

We might use this case-study for the initial announcement of the package

Take the CLV grid to the next level

The idea is to follow up on #25 and extend the basic models in interesting ways, e.g., by adding hierarchical effects, time-varying effects, and so on. This should give us a more refined idea not only of what building blocks we need, but also of how flexible they should be.

This will, potentially, also be the biggest selling point of the package, as we will be doing things that are not really done out there (or at least not published in neat papers / packages), in large part by taking advantage of working with a fully-fledged PPL (PyMC!)

[image: CLV grid]

Unlike #25, these squares are not yet fixed, and any cool idea you have can be used.

  • Continuous Non-contractual + Hierarchical structure (#39) (up for grabs)
  • Continuous Contractual + ??? (up for grabs)
  • Discrete Non-contractual + ??? (up for grabs)
  • Discrete Contractual + cohort / temporal effects (suggested to @drbenvincent), see #35

Possibly the grid won't be over the 4 types of models, but perhaps over extensions:

  • Cohort + Temporal effects on lifetime
  • Cohort + Temporal effects on value
  • Complex interactions between Lifetime and Value components
    • E.g., subscription fee affects churn-rate and value, @juanitorduz brought something that resonates with this
  • THE NEXT BANG IN MARKETING MODELS ???

Explore variants of the shifted Beta Geometric model

From: Fader, P. S., & Hardie, B. G. (2007). How to project customer retention. Journal of Interactive Marketing, 21(1), 76-90.

They mention this other model derived from the beta-binomial, which is conceptually equivalent:

Their model is based on assumptions similar to those behind the sBG model: (a) Each person responds to a direct-mail solicitation with constant probability p, and (b) p varies across the population according to a beta distribution. While BM base their framework on the beta-binomial model, it could have been derived as an sBG model (e.g., the mailing on which the prospect responds to the offer is characterized by the shifted-geometric distribution). As such, it is possible to identify clear relationships between some of the results in this article [e.g., r_t and S(t)] and some quantities of interest in a list-falloff setting.
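For reference, the two sBG quantities mentioned in the quote, the survivor function S(t) and the retention rate r_t, have closed forms under a Beta(alpha, beta) prior on the per-period churn probability. The sketch below computes both; the function names and the parameter values are illustrative assumptions, not estimates from the paper.

```python
import math


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def sbg_survival(t, alpha, beta):
    """S(t) = B(alpha, beta + t) / B(alpha, beta): share of a cohort still active after t periods."""
    return math.exp(log_beta(alpha, beta + t) - log_beta(alpha, beta))


def sbg_retention(t, alpha, beta):
    """r_t = S(t) / S(t - 1) = (beta + t - 1) / (alpha + beta + t - 1)."""
    return (beta + t - 1) / (alpha + beta + t - 1)


alpha, beta = 1.0, 2.0  # illustrative values only
survival = [sbg_survival(t, alpha, beta) for t in range(1, 6)]
retention = [sbg_retention(t, alpha, beta) for t in range(1, 6)]
```

The retention rate rises with t even though each customer's churn probability is constant, which is the heterogeneity effect the paper describes.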

Then extensions with cohort covariates:

The BM framework was extended by Rao and Steckel (1995) to incorporate (time-invariant) descriptor variables such as age, income, and sex. This is accomplished using the beta-logistic model (Heckman & Willis, 1977),

Incorporating the effects of time-varying covariates (e.g., marketing-mix effects, seasonality) is more complicated. The key is to bring in all of these factors at the right level; that is, at the level of the latent parameter of interest (in this case, θ) instead of just “jamming” different covariate effects into a regression-like model (see Schweidel, Fader, & Bradlow, 2006, for a discussion of how to do this in a continuous-time contractual setting.)

And extensions with time effects:

Both the sBG model and its continuous-time analog (i.e., the EG model) are based on the assumption that the commonly observed phenomenon of increasing retention rates is due entirely to heterogeneity; individual-customer-level retention rates are assumed to be constant. If we wish to allow for the possibility of time dynamics at the level of the individual customer, we can no longer characterize the duration of an individual’s relationship with the firm using either the shifted-geometric or exponential distributions, both of which have the “memoryless” property (i.e., the probability of survival to s + t, given survival to t, is the same as the initial probability of survival to s). In a continuous-time setting, we can accommodate this effect by assuming that individual lifetimes can be characterized by the Weibull distribution, which allows for an individual’s risk of canceling a contract to increase or decrease as the length of the relationship with the firm increases. In a discrete-time contractual setting, this leads to the beta-discrete-Weibull (BdW) model (Fader & Hardie, 2006), which is a generalization of the sBG model, while in a continuous-time contractual setting, this leads to a generalization of the EG model, the Weibull-gamma (WG) model (Hardie et al., 1998; Morrison & Schmittlein, 1980).
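To make the BdW generalization concrete, the sketch below assumes an individual-level discrete-Weibull survivor function (1 - θ)^(t^c) with θ ~ Beta(alpha, beta), which integrates to B(alpha, beta + t^c) / B(alpha, beta). The parameterization and the function names are assumptions made for illustration; setting c = 1 recovers the sBG model.

```python
import math


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def bdw_survival(t, alpha, beta, c):
    """Cohort survival S(t) = B(alpha, beta + t**c) / B(alpha, beta)."""
    return math.exp(log_beta(alpha, beta + t ** c) - log_beta(alpha, beta))


def bdw_retention(t, alpha, beta, c):
    """Retention rate r_t = S(t) / S(t - 1) for t >= 1."""
    return bdw_survival(t, alpha, beta, c) / bdw_survival(t - 1, alpha, beta, c)


alpha, beta = 1.0, 2.0  # illustrative values only
sbg_like = [bdw_retention(t, alpha, beta, c=1.0) for t in range(1, 6)]
rising_risk = [bdw_survival(t, alpha, beta, c=1.5) for t in range(1, 6)]
```

With c > 1 the individual-level churn risk grows with tenure, so cohort survival falls faster than under the sBG model; with c < 1 the opposite holds.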

Adstock transformation without `for` loop

We would like to write a more efficient implementation of the adstock transformations so that we do not use a for loop. An attempt with scan was (unsuccessfully) implemented in #15

Requirements:

  • It should accept an l_max parameter to truncate the length of the effect.
  • It should be vectorised.
  • It should be faster than the current implementation.
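As a starting point, a geometric adstock can be written as a truncated convolution, which avoids an explicit Python loop. This is a plain NumPy sketch under assumed names (`geometric_adstock`, `alpha`, `normalize`); a version that runs inside a PyMC model would need the equivalent tensor operations, and no speed-up over the current implementation has been measured here.

```python
import numpy as np


def geometric_adstock(x, alpha, l_max=12, normalize=False):
    """Adstock y[t] = sum over lag in [0, l_max) of alpha**lag * x[t - lag]."""
    x = np.asarray(x, dtype=float)
    weights = alpha ** np.arange(l_max)  # alpha^0, alpha^1, ..., alpha^(l_max - 1)
    if normalize:
        weights = weights / weights.sum()
    # The full convolution has len(x) + l_max - 1 entries; the first len(x)
    # entries line up with the input time index.
    return np.convolve(x, weights)[: x.size]


impulse = geometric_adstock([1.0, 0.0, 0.0, 0.0], alpha=0.5, l_max=3)
```

A single unit of spend at t = 0 decays as 1, 0.5, 0.25 and is then cut off by `l_max`.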

New default sampler for `ContNonContract` when there are no observations

Currently, the intention is to primarily use ContNonContract to perform inference on observational data. However, without the observed= keyword, our samplers will misbehave, because a value has the form [t_x, x], where x is an integer and t_x is the time of the x-th observation.

A moment method would be beneficial, but careful thought must be put into the sampler.

MMM Example Notebook: Time Varying Coefficients

To demonstrate the flexibility and potential of the framework (namely, Bayesian statistics and PyMC), we would like to have an example notebook that illustrates how to extend the base model introduced in #41 (based on simulated data) by allowing time-varying coefficients via Gaussian processes.

Implement pre-built `BG/NBD`/`BetaGeoFitter` model

It would be good to add a BetaGeoFitter function that returns a ContNonContract with some default priors. A signature that resembles the one provided in the lifetimes package would be a good idea. Something along the lines of the following snippet.

import pymc as pm

def BetaGeoFitter(name, a, b, r, alpha, T, T0, *, observed, shape=None, **kwargs):
    # Priors on the dropout probability and the purchase rate
    p = pm.Beta(f"{name}_beta", a, b, shape=shape)
    lam = pm.Gamma(f"{name}_gamma", r, 1 / alpha, shape=shape)
    return ContNonContract(name, lam, p, T, T0, observed=observed, shape=shape, **kwargs)

We should also add some useful summary stats / plots. If they are not specific to BG/NBD, all the better!

Decide how to handle `date_column` dtype and transformations

At the moment, the date_column is left untouched in the data. This might be ok, but it would be great to support datetime dtypes as inputs. Those can be easily handled in plots of time series, and they are more natural to reason about as opposed to floating point arrays referenced to some onset event and using some unknown resolution in some unknown time zone. The difficulty of supporting datetime dtypes is that we need to:

  1. Ensure that the date_column is a date time compatible dtype
  2. If it isn't, convert it to a datetime, perhaps using some string format?
  3. Add a transformation to go from datetimes to floating point arrays and back. This is necessary if we want to have time series with seasonal components or other mathematical dependencies on time.
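The three steps could be sketched with the standard library alone, as below. The function names, the `%Y-%m-%d` default format, the choice of days as the unit, and the use of timezone-naive datetimes are all assumptions for illustration.

```python
from datetime import datetime, timedelta


def to_datetimes(values, fmt="%Y-%m-%d"):
    """Steps 1 and 2: pass datetimes through, parse strings with an explicit format."""
    return [v if isinstance(v, datetime) else datetime.strptime(v, fmt) for v in values]


def to_float_days(dates, origin):
    """Step 3, forward: days elapsed since `origin`, as floats."""
    return [(d - origin).total_seconds() / 86400.0 for d in dates]


def from_float_days(days, origin):
    """Step 3, backward: recover datetimes from float offsets in days."""
    return [origin + timedelta(days=x) for x in days]


dates = to_datetimes(["2022-01-03", "2022-01-10", "2022-01-17"])
origin = dates[0]
t = to_float_days(dates, origin)
restored = from_float_days(t, origin)
```

Keeping `origin` and the unit alongside the float array is what makes the transformation reversible, so plots can still be labelled with real dates.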

Add study case with alternative to gamma-gamma model

If we don't summarize individual transaction values, there should be much more flexibility in how to model a user's latent "spend", e.g., with a time-series component or GLM predictors.

Would be nice to add a study case of such, perhaps motivating new summary/plotting/prediction functionality of the library.

Should unit tests live outside of the package?

We are shipping the tests folder inside of the pymmmc package. I don't think this is necessary or desirable. I propose that we move the tests folder to the root directory instead.

Current organization:
root/
└─> pymmmc/
    └─> tests/

Proposed:
root/
├─> pymmmc/
└─> tests/

Add delayed adstock function

Add delayed adstock function from Jin, Yuxue, et al. "Bayesian methods for media mix modeling with carryover and shape effects." (2017).
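A minimal sketch of the delayed adstock, assuming the weight function w(l) = alpha^((l - theta)^2) for lags l = 0, ..., L - 1, normalised to sum to one. It is written with plain Python loops for readability rather than speed, and the function names are illustrative.

```python
def delayed_adstock_weights(alpha, theta, l_max):
    """Normalised weights w(l) = alpha ** ((l - theta) ** 2) for l = 0, ..., l_max - 1."""
    raw = [alpha ** ((lag - theta) ** 2) for lag in range(l_max)]
    total = sum(raw)
    return [w / total for w in raw]


def delayed_adstock(x, alpha, theta, l_max):
    """Weighted average of current and past spend; the peak effect occurs theta periods later."""
    weights = delayed_adstock_weights(alpha, theta, l_max)
    out = []
    for t in range(len(x)):
        # Spend before the start of the series is treated as zero.
        out.append(sum(w * x[t - lag] for lag, w in enumerate(weights) if t - lag >= 0))
    return out


weights = delayed_adstock_weights(alpha=0.5, theta=2, l_max=5)
response = delayed_adstock([1.0, 0.0, 0.0, 0.0, 0.0], alpha=0.5, theta=2, l_max=5)
```

With `theta=0` the weights reduce to a monotonically decaying shape, while `theta > 0` delays the peak of the media effect.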

Add notebooks to fill the basic CLV grid

The idea is to write a notebook with pure PyMC model(s) for each of these CLV scenarios. We can start with the Lifetime part (not-value yet), but ideally we will include value as well by the end. This might be a constant with time-decay penalty in the simplest cases.

Hopefully this will give us a good picture of the building blocks that are necessary for a minimum viable package, and can also serve as the base documentation. Over time we would replace the custom PyMC code with imports from the CLV sub-package.

[image: basic CLV grid]

  • Continuous Non-contractual #16
  • Continuous Contractual #36
  • Discrete Non-contractual (up for grabs)
  • Discrete Contractual #32
