
facebookincubator / GeoLift


GeoLift is an end-to-end geo-experimental methodology based on Synthetic Control Methods, used to measure the true incremental effect (Lift) of an ad campaign.

Home Page: https://facebookincubator.github.io/GeoLift/

License: MIT License

Languages: R 94.88%, JavaScript 4.21%, CSS 0.90%

geolift's People

Contributors

arturoesquerra, gufengzhou, jussann, laresbernardo, michael-khalil, nicolasmatrices-v2, raphaeltamaki, subramen


geolift's Issues

increase budget of a campaign and covariates

Bug description

  • What type of covariates can I use in GeoLift? Can you give an example of how their values should be set? (See the sketch below.)

  • If I just want to measure the lift of a campaign when I increase its budget, how should I set up the MarketSelection?
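For reference on the first question, covariates appear to enter at the data-reading step via the X argument of GeoDataRead, as additional numeric columns per location and date. A minimal sketch; the data frame and covariate column names below are hypothetical placeholders, not values from this issue:

# Assumes the raw data frame has extra numeric columns (e.g. spend on other
# channels, temperature) alongside date, location, and Y.
geo_data <- GeoDataRead(data = my_raw_data,                    # hypothetical data frame
                        date_id = "date",
                        location_id = "location",
                        Y_id = "Y",
                        X = c("other_spend", "temperature"),   # hypothetical covariate columns
                        format = "yyyy-mm-dd")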


Feature Request: Handling of multiple test cells

A simple test is a basic A/B: a single or few test geos vs. a single or few control geos. However, we often want to test variations, such as 50% more spend and 100% more spend, each vs. control, or Radio vs. TV vs. both vs. control.

In these cases, to help with planning, the tool could ask for the number of test conditions needed and provide either a) test and control sets with deduped test geos (this means multiple control measurements, as they may all be different for each test set), or b) test sets that are equivalent enough to be compared, plus a single (synthetic) control which can be used as a baseline for all conditions.

I'm assuming this is a natural extension of the current tooling, but I could be (very) wrong.

Understanding the GeoLift model

Hi there,

I was able to run my first model with GeoLift but I have a few questions for you.

  1. When I set up the model, can I use it for both of these kinds of experiments?
  • An increase in budget for a campaign/channel that is already running
  • Testing a new channel/campaign
  2. When I use the horizon variable, this error shows up:
    Error in GeoLiftPower.search(data = GeoTestData_PreTest, treatment_periods = c(15), :
    unused argument (horizon = 50)

Can you explain to me why?
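That error is R's "unused argument" message, which means the GeoLiftPower.search in the installed version of the package has no horizon argument. A generic way to check what your installed version actually accepts (plain R, not GeoLift-specific advice):

library(GeoLift)

# List the formal arguments of the function as installed, and the package version.
names(formals(GeoLiftPower.search))
packageVersion("GeoLift")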

Error running NumberLocations function

Hi GeoLift team,

I'm having a problem running the GeoLift Walkthrough with the NumberLocations function:

library(GeoLift)

data(GeoLift_PreTest)

GeoTestData_PreTest <- GeoDataRead(data = GeoLift_PreTest,
date_id = "date",
location_id = "location",
Y_id = "Y",
format = "yyyy-mm-dd")
head(GeoTestData_PreTest)

GeoPlot(GeoTestData_PreTest,
Y_id = "Y",
time_id = "time",
location_id = "location")

resultsNum <- NumberLocations(data = GeoTestData_PreTest,
Y_id = "Y",
location_id = "location",
time_id = "time",
n_sim = 500,
treatment_periods = 15,
plot = TRUE,
power = 0.8,
alpha = 0.1,
fixed_effects = TRUE,
ProgressBar = TRUE)

[screenshot of the error attached]

Using GeoLift to detect a metric drop given an intervention

Hey FB team,

I'm Leo from Babylon Health. Thanks for building this package 🙏 . We have been doing geo-experiments with CausalImpact/diff-in-diff and we are now excited to try the FB solution.

Question

Is the hypothesis test one-tailed or two-tailed? Can I use it to detect a drop in conversions given an intervention (stopping an "always on" historical marketing campaign)?

Use case details

  • Granularity of geo: zipcode
  • Metric: conversions
  • The campaign has been running for almost a year (it can be considered the steady state)
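For planning a test around a drop, the market-selection calls quoted elsewhere in these issues take a side_of_test argument and a vector of simulated effect sizes, which can be negative. A minimal sketch under that assumption (geo_data stands for the panel returned by GeoDataRead; values are illustrative, not a confirmed recipe):

# Plan for a conversion drop after switching the campaign off: simulate
# negative effect sizes and use a one-sided test.
markets <- GeoLiftMarketSelection(data = geo_data,
                                  treatment_periods = c(15),
                                  N = c(2, 3),
                                  effect_size = seq(0, -0.25, -0.05),
                                  side_of_test = "one_sided",
                                  lookback_window = 1,
                                  alpha = 0.1)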

Thanks in advance for your help

varying level of treatment over time

Hi all,
thanks for the great package.
The company I work for is B2B, so during weekends and bank holidays business users are less active and media reach among business users is lower.

Consider the scenario where the treatment is effectively "paused" for part of the treatment period (the "pause" might be a true pause in media delivery, or a significant reach drop due to the characteristics of the media and target under experimentation), for example over a weekend, a bank holiday, or a long weekend.
Do you have any recommendations for handling these scenarios?

Speed up calculations using graphics card with GeoLift

Is your feature request related to a problem? Please describe.

GeoLiftMarketSelection takes a long time to run, especially when increasing the lookback_window and treatment_period values. Is it possible to speed up these calculations using a GPU?

Describe the solution you'd like

Ideally, I would like to be able to specify a device on which to run the calculations.

Describe alternatives you've considered

I have tried setting parallel_setup = "parallel", but this did not significantly improve the runtime.

Pulling out control / treatment output data

Is your feature request related to a problem? Please describe.

I am not sure how I can pull out the control / treatment data produced by the model. The treatment series I could easily recreate by aggregating the treated geos, but the control will not be a straight sum. I was trying to use the weights from the summary() function but I couldn't make sense of them.

Describe the solution you'd like

Longer term: it would be great to have a function that just returns the data for treatment / control.
Short term: I would appreciate any suggestion on how I could manually pull the data.
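For the short-term ask, one manual route is sketched below. It assumes you can extract a table of per-location synthetic-control weights from the fitted object (geo_data stands for the long-format GeoLift panel; weights_df and its columns are hypothetical placeholders, not a documented GeoLift output): the treatment series is a plain sum over the treated geos, while the synthetic control is a weighted sum over the donor geos.

library(dplyr)

treatment_locs <- c("chicago", "portland")          # illustrative treated geos

# Observed treatment series: a plain sum over the treated locations.
treatment_series <- geo_data %>%                    # long panel: location, time, Y
  filter(location %in% treatment_locs) %>%
  group_by(time) %>%
  summarise(Y_treatment = sum(Y), .groups = "drop")

# Synthetic control series: a weighted sum over the donor locations.
control_series <- geo_data %>%
  inner_join(weights_df, by = "location") %>%       # weights_df: location, weight (hypothetical)
  group_by(time) %>%
  summarise(Y_synthetic = sum(weight * Y), .groups = "drop")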

Some potential bugs

Hi @NicolasMatrices-v2 @ArturoEsquerra,

Thank you so much to you and your team for writing this awesome package! It gives us another option for geo testing. I'm very interested in it and have spent some time learning it. Recently, I've found two potential issues in this package. Please correct me if I'm wrong.

1st issue description

Source code of GeoLiftMarketSelection(), lines 31 to 32:

N <- append(length(include_markets), N[length(include_markets) <= N])

The code above may include the length(include_markets) element twice. For instance, with N = c(1,2,3,4,5) and length(include_markets) = 2, N becomes c(2,2,3,4,5). This can slow down the computation of GeoLiftMarketSelection(), since the duplicate adds another round of the for loop.

I think we can use N <- append(length(include_markets), N[length(include_markets) < N]) instead.
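A quick demonstration of the reported duplication and the proposed fix:

N <- c(1, 2, 3, 4, 5)
include_markets <- c("chicago", "portland")   # any two markets, so length is 2

# Current code: 2 satisfies 2 <= 2, so it is kept in N and also prepended.
append(length(include_markets), N[length(include_markets) <= N])
#> [1] 2 2 3 4 5

# Proposed fix: the strict inequality drops the duplicate.
append(length(include_markets), N[length(include_markets) < N])
#> [1] 2 3 4 5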

2nd issue description

Source code of GeoLiftMarketSelection(), lines 50 to 55:

  if (min(effect_size < 0 & max(effect_size) > 0)) {
    message(paste0("Error: The specified simulated effect sizes are not all of the same ", 
      " sign. \nTry again with a vector of all positive or negative effects", 
      " sizes that includes zero."))
    return(NULL)
  }

The condition above does not seem meaningful to me. For min(effect_size < 0 & max(effect_size) > 0) to be TRUE, every element would need both effect_size < 0 and max(effect_size) > 0 to be TRUE; but if all the effect sizes are negative, their maximum cannot be greater than 0, so the check can never fire.
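A quick check of the behaviour, plus what the intended condition presumably was (the corrected line is a guess, not the maintainers' fix):

effect_size <- c(-0.1, 0, 0.1)   # mixed signs: the guard should fire here

# As written: min() of an element-wise logical vector, which is 0 (FALSE)
# unless every element is negative AND the max is positive -- impossible.
min(effect_size < 0 & max(effect_size) > 0)
#> [1] 0

# Presumed intent: flag vectors that mix positive and negative effect sizes.
min(effect_size) < 0 && max(effect_size) > 0
#> [1] TRUE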

Session information

[screenshot of the sessionInfo() output attached]

Looking forward to your reply.

Regards,
Yuli

feature request: non-inferiority test hypothesis

Hi guys,

I'm opening this issue to ask whether it would be possible to add a non-inferiority / equivalence option at the inference step.

Use Case:

  • a company has been running a [marketing campaign/channel] for the last N periods
  • hypothesis: switching the campaign/channel off would not impact the target metric negatively (within a non-inferiority margin)

Context:
assuming that we are in a frequentist setting, the test can currently only be set up as a superiority or inferiority test.
Hence, if we don't find significant results we can't state that the test areas are 'non-inferior' or 'equivalent' to the counterfactual; we can only say that there is not enough evidence to claim 'superiority' or 'inferiority'.
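To make the request concrete, a generic frequentist sketch of the decision rule being asked for (plain R, not a GeoLift function; all numbers are illustrative):

# Declare non-inferiority if the one-sided lower confidence bound of the lift
# after switching the channel off stays above the pre-registered margin.
lift_hat <- -0.01    # estimated relative lift
se_hat   <- 0.02     # its standard error
margin   <- 0.05     # non-inferiority margin
alpha    <- 0.05

lower_bound  <- lift_hat - qnorm(1 - alpha) * se_hat
non_inferior <- lower_bound > -margin
non_inferior
#> [1] TRUE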

Error when using tidyr 1.2.0

Hi team. Just tested with updated tidyr version (1.2.0) and GeoLiftPowerFinder() crashes with the following error:

[screenshot of the error attached]

I downgraded to 1.1.4 and it ran OK.

packageurl <- "http://cran.us.r-project.org/src/contrib/Archive/tidyr/tidyr_1.1.4.tar.gz"
install.packages(packageurl, repos=NULL, type="source")
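An equivalent way to pin the working version, assuming the remotes package is available (the archive-URL route above works too):

# Install the last tidyr release reported to work with GeoLiftPowerFinder().
remotes::install_version("tidyr", version = "1.1.4")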

step-wise exclusion not working as expected

Hi there,

I'm trying to understand why re-fitting GeoLiftMarketSelection doesn't yield the expected results: namely, I exclude the non-top performing locations (as determined by the first round of fitting), and expect the second step to run much faster and recover the same locations. Instead I get the following error: Error in 1:nrow(BestMarkets_aux) : argument of length 0.

Full script and output shown below:

Script

library(GeoLift)

print(sessionInfo())

# Set up data
df_geo = GeoDataRead(data=GeoLift_PreTest, date_id='date', location_id='location', Y_id = 'Y', X = c(), summary = TRUE, format = "yyyy-mm-dd")

# "baseline" parameters
treatment_periods = c(10, 20)
N_test = c(1)
effect_size = seq(0, 0.05, 0.01)
lookback_window = 2
locations = unique(df_geo$location)
n_locations = length(locations)
holdout = c(0.9, 1.0)
alpha = 0.05
model = "None"
fixed_effects = TRUE
Correlations = TRUE
parallel = TRUE
side_of_test = 'two_sided'

# (1) FIND BEST MARKETS
mkt_best = GeoLiftMarketSelection(data=df_geo, treatment_periods=treatment_periods, N=N_test, effect_size=effect_size, lookback_window=lookback_window, alpha=alpha, Correlations=Correlations, fixed_effects=fixed_effects, print=FALSE)
mkt_best = mkt_best$BestMarket

# (2) EXCLUDE ALL MARKETS EXCEPT THE TOP 2
locations_best2 = unique(mkt_best[mkt_best$AvgATT>0,]$location)[1:2]
exclude = setdiff(locations, locations_best2)

# (3) RE-FIT WITH EXCLUSION
mkt_exclude = GeoLiftMarketSelection(data=df_geo, treatment_periods=treatment_periods, N=N_test, effect_size=effect_size, lookback_window=lookback_window, alpha=alpha, Correlations=Correlations, fixed_effects=fixed_effects, exclude_markets = exclude, print=FALSE)

Output

R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin21.3.0 (64-bit)
Running under: macOS Monterey 12.2.1

Matrix products: default
BLAS:   /usr/local/Cellar/openblas/0.3.20/lib/libopenblasp-r0.3.20.dylib
LAPACK: /usr/local/Cellar/r/4.1.2_1/lib/R/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
[1] GeoLift_2.3.5

loaded via a namespace (and not attached):
 [1] augsynth_0.2.0         ggrepel_0.9.1          Rcpp_1.0.8.3          
 [4] mvtnorm_1.1-3          lattice_0.20-45        tidyr_1.1.4           
 [7] listenv_0.8.0          zoo_1.8-9              assertthat_0.2.1      
[10] digest_0.6.29          foreach_1.5.2          utf8_1.2.2            
[13] parallelly_1.30.0      lfe_2.8-7.1            R6_2.5.1              
[16] plyr_1.8.6             Boom_0.9.7             ggplot2_3.3.5         
[19] pillar_1.7.0           panelView_1.1.9        rlang_1.0.2           
[22] Matrix_1.3-4           stringr_1.4.0          BoomSpikeSlab_1.2.4   
[25] directlabels_2021.1.13 munsell_0.5.0          proxy_0.4-26          
[28] compiler_4.1.2         pkgconfig_2.0.3        globals_0.14.0        
[31] CausalImpact_1.2.7     tidyselect_1.1.2       gridExtra_2.3         
[34] tibble_3.1.6           quadprog_1.5-8         dtw_1.22-3            
[37] codetools_0.2-18       fansi_1.0.2            future_1.24.0         
[40] crayon_1.5.0           dplyr_1.0.8            MASS_7.3-54           
[43] grid_4.1.2             gsynth_1.2.1           xtable_1.8-4          
[46] gtable_0.3.0           lifecycle_1.0.1        magrittr_2.0.2        
[49] scales_1.1.1           cli_3.2.0              stringi_1.7.6         
[52] renv_0.15.4            reshape2_1.4.4         doRNG_1.8.2           
[55] doParallel_1.0.17      ellipsis_0.3.2         xts_0.12.1            
[58] generics_0.1.2         vctrs_0.3.8            sandwich_3.0-1        
[61] Formula_1.2-4          iterators_1.0.14       tools_4.1.2           
[64] glue_1.6.2             purrr_0.3.4            rngtools_1.5.2        
[67] MarketMatching_1.2.0   abind_1.4-5            parallel_4.1.2        
[70] colorspace_2.0-3       LowRankQP_1.0.4        bsts_0.9.7            
##################################
#####       Summary       #####
##################################

* Raw Number of Locations: 40
* Time Periods: 90
* Final Number of Locations (Complete): 40
Setting up cluster.
Importing functions into cluster.
Attempting to load the environment ‘package:dplyr’

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union


Deterministic setup with 1 locations in treatment.
There were 50 or more warnings (use warnings() to see the first 50)
Setting up cluster.
Importing functions into cluster.

Deterministic setup with 1 locations in treatment.
Error in 1:nrow(BestMarkets_aux) : argument of length 0
Calls: GeoLiftMarketSelection
Execution halted

Error while running `GeoLiftPowerFinder`

Hey guys,

I've been trying out the package today and I am encountering an error when running the GeoLiftPowerFinder function

Context

  • weekly data
  • location_id are postcodes
  • result_search screenshot

[result_search screenshot attached]

Code

resultsNum <- NumberLocations(data = GeoTestData_PreTest,
                              Y_id = "Y",
                              location_id = "location",
                              time_id = "time",
                              n_sim = 500,
                              treatment_periods = 6, #6 weeks
                              plot = TRUE,
                              power = 0.8,
                              alpha = 0.1,
                              fixed_effects = TRUE,
                              ProgressBar = TRUE)

#Zipcode test Selection
resultsSearch <- GeoLiftPower.search(data = GeoTestData_PreTest,
                                     treatment_periods = c(6),
                                     N = c(2),
                                     horizon = -50, #six weeks
                                     Y_id = "Y",
                                     location_id = "location",
                                     time_id = "time",
                                     top_results = 2,
                                     alpha = 0.1,
                                     type = "pValue",
                                     fixed_effects = TRUE,
                                     ProgressBar = TRUE)

head(resultsSearch,5)

#MDE
resultsFind <- GeoLiftPowerFinder(data = GeoTestData_PreTest,
                                  treatment_periods = c(6),
                                  N = c(2),
                                  Y_id = "Y",
                                  location_id = "location",
                                  time_id = "time",
                                  effect_size = seq(0.1),
                                  top_results = 1,
                                  alpha = 0.1,
                                  model = "Ridge",
                                  fixed_effects = TRUE,
                                  ProgressBar = TRUE,
                                  plot_best = TRUE)

Error encountered

changing the effect_size parameter

[two screenshots attached]

Add alpha to GLMS body.

Is your feature request related to a problem? Please describe.

The argument alpha in GeoLiftMarketSelection() is not used in the function body. Do you plan to use it as the threshold for the significance column, or will it not be used in the future?

Describe the solution you'd like

Add alpha within the GLMS body.

Additional context

Thanks @yulisong

pvalueCalc appears to ignore model when covariates are supplied

Bug description

When covariates are supplied, GeoLiftMarketSelection() appears to return the same market selection results regardless of whether model="none" or model="Ridge".

I think I traced the source of this problem back to pvalueCalc() which appears to do the heavy-lifting under the hood:
In pre_test_power.R, GeoLiftMarketSelection() calls run_simulations() which calls pvalueCalc(). In lines 286-304, this pvalueCalc code

else if (length(X) > 0) {
    fmla <- as.formula(paste(
      "Y_inc ~ D |",
      sapply(list(X),
        paste,
        collapse = "+"
      )
    ))

    ascm_obj <- augsynth::augsynth(fmla,
      unit = location,
      time = time,
      data = data_aux,
      t_int = treatment_start_time,
      progfunc = "GSYN",
      scm = T,
      fixedeff = fixed_effects
    )
  }

appears to set progfunc = "GSYN" when covariates are supplied, regardless of what the user originally supplied in the GeoLiftMarketSelection() call. Why, for example, won't it permit progfunc = "Ridge" if supplied by the user?

Power is 100% regardless of effect size

I've tried varying the effect size in the example script below (1% and 0.01%), but the power is always returned as 100%. This does not seem possible. Please advise on what might be causing the issue.

Rscript:

library(GeoLift)

print(sessionInfo())

# Set up data
df_geo = GeoDataRead(data=GeoLift_PreTest, date_id='date', location_id='location', Y_id = 'Y', X = c(), summary = TRUE, format = "yyyy-mm-dd")

# Optimal control estimate
treatment_periods = c(10)
N_test = c(1)
lookback_window = 1
cpic = 100
alpha = 0.05
model = "None"
fixed_effects = TRUE
Correlations = TRUE
parallel = TRUE
side_of_test = 'one_sided'
budget = NULL

cn_power = c('location', 'EffectSize', 'Power')
# Effect size of 1%
sel1 = GeoLiftMarketSelection(data=df_geo, treatment_periods=treatment_periods, N=N_test, effect_size=0.01, lookback_window=lookback_window, cpic=cpic, alpha=alpha, Correlations=Correlations, fixed_effects=fixed_effects)
print(sel1$BestMarkets[, cn_power])

# Effect size of 0.01%
sel2 = GeoLiftMarketSelection(data=df_geo, treatment_periods=treatment_periods, N=N_test, effect_size=0.0001, lookback_window=lookback_window, cpic=cpic, alpha=alpha, Correlations=Correlations, fixed_effects=fixed_effects)
print(sel2$BestMarkets[, cn_power])

print(t(rbind(sel1$BestMarkets, sel2$BestMarkets)))

output:

R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin21.3.0 (64-bit)
Running under: macOS Monterey 12.2.1

Matrix products: default
BLAS:   /usr/local/Cellar/openblas/0.3.20/lib/libopenblasp-r0.3.20.dylib
LAPACK: /usr/local/Cellar/r/4.1.2_1/lib/R/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
[1] GeoLift_2.3.5

loaded via a namespace (and not attached):
 [1] augsynth_0.2.0         ggrepel_0.9.1          Rcpp_1.0.8.3          
 [4] mvtnorm_1.1-3          lattice_0.20-45        tidyr_1.1.4           
 [7] listenv_0.8.0          zoo_1.8-9              assertthat_0.2.1      
[10] digest_0.6.29          foreach_1.5.2          utf8_1.2.2            
[13] parallelly_1.30.0      lfe_2.8-7.1            R6_2.5.1              
[16] plyr_1.8.6             Boom_0.9.7             ggplot2_3.3.5         
[19] pillar_1.7.0           panelView_1.1.9        rlang_1.0.2           
[22] Matrix_1.3-4           stringr_1.4.0          BoomSpikeSlab_1.2.4   
[25] directlabels_2021.1.13 munsell_0.5.0          proxy_0.4-26          
[28] compiler_4.1.2         pkgconfig_2.0.3        globals_0.14.0        
[31] CausalImpact_1.2.7     tidyselect_1.1.2       gridExtra_2.3         
[34] tibble_3.1.6           quadprog_1.5-8         dtw_1.22-3            
[37] codetools_0.2-18       fansi_1.0.2            future_1.24.0         
[40] crayon_1.5.0           dplyr_1.0.8            MASS_7.3-54           
[43] grid_4.1.2             gsynth_1.2.1           xtable_1.8-4          
[46] gtable_0.3.0           lifecycle_1.0.1        magrittr_2.0.2        
[49] scales_1.1.1           cli_3.2.0              stringi_1.7.6         
[52] renv_0.15.4            reshape2_1.4.4         doRNG_1.8.2           
[55] doParallel_1.0.17      ellipsis_0.3.2         xts_0.12.1            
[58] generics_0.1.2         vctrs_0.3.8            sandwich_3.0-1        
[61] Formula_1.2-4          iterators_1.0.14       tools_4.1.2           
[64] glue_1.6.2             purrr_0.3.4            rngtools_1.5.2        
[67] MarketMatching_1.2.0   abind_1.4-5            parallel_4.1.2        
[70] colorspace_2.0-3       LowRankQP_1.0.4        bsts_0.9.7            
##################################
#####       Summary       #####
##################################

* Raw Number of Locations: 40
* Time Periods: 90
* Final Number of Locations (Complete): 40
Setting up cluster.
Importing functions into cluster.
Attempting to load the environment ‘package:dplyr’

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union


Deterministic setup with 1 locations in treatment.
  ID      location duration EffectSize Power AvgScaledL2Imbalance Investment
1  1       oakland       10       0.01     1            0.3859408     138654
2  2 san francisco       10       0.01     1            0.3705911      91328
     AvgATT Average_MDE ProportionTotal_Y abs_lift_in_zero   Holdout rank
1  2006.245   0.1672176        0.07663388            0.157 0.9233661    1
2 -1802.300  -0.1634528        0.05985532            0.173 0.9401447    2
  correlation
1   0.9542857
2   0.9346028
There were 50 or more warnings (use warnings() to see the first 50)
       location EffectSize Power
1       oakland       0.01     1
2 san francisco       0.01     1
Setting up cluster.
Importing functions into cluster.

Deterministic setup with 1 locations in treatment.
  ID      location duration EffectSize Power AvgScaledL2Imbalance Investment
1  1       oakland       10      1e-04     1            0.3859408    1386.54
2  2 san francisco       10      1e-04     1            0.3705911     913.28
     AvgATT Average_MDE ProportionTotal_Y abs_lift_in_zero   Holdout rank
1  1868.978   0.1557766        0.07663388            0.156 0.9233661    1
2 -1892.715  -0.1716526        0.05985532            0.172 0.9401447    2
  correlation
1   0.9542857
2   0.9346028
There were 50 or more warnings (use warnings() to see the first 50)
       location EffectSize Power
1       oakland      1e-04     1
2 san francisco      1e-04     1
                     [,1]         [,2]            [,3]         [,4]           
ID                   "1"          "2"             "1"          "2"            
location             "oakland"    "san francisco" "oakland"    "san francisco"
duration             "10"         "10"            "10"         "10"           
EffectSize           "1e-02"      "1e-02"         "1e-04"      "1e-04"        
Power                "1"          "1"             "1"          "1"            
AvgScaledL2Imbalance "0.3859408"  "0.3705911"     "0.3859408"  "0.3705911"    
Investment           "138654.00"  " 91328.00"     "  1386.54"  "   913.28"    
AvgATT               " 2006.245"  "-1802.300"     " 1868.978"  "-1892.715"    
Average_MDE          " 0.1672176" "-0.1634528"    " 0.1557766" "-0.1716526"   
ProportionTotal_Y    "0.07663388" "0.05985532"    "0.07663388" "0.05985532"   
abs_lift_in_zero     "0.157"      "0.173"         "0.156"      "0.172"        
Holdout              "0.9233661"  "0.9401447"     "0.9233661"  "0.9401447"    
rank                 "1"          "2"             "1"          "2"            
correlation          "0.9542857"  "0.9346028"     "0.9542857"  "0.9346028" 

Ability to exclude locations from control

Is your feature request related to a problem? Please describe.

We're trying to use experimental data to validate GeoLift.

We've run a geo-test in which we've assigned half of a country's locations to control and the rest to treatment. Although we have already analysed and arrived at a conclusion using conventional methods, we're now trying to validate GeoLift with this data i.e. the objective is to see the causal effect that GeoLift estimates and how that differs from what we've found using other methods.

While we've been able to tell GeoLift which locations NOT to consider as potential treatments (via the exclude_markets parameter), the issue we're having is that the locations that (1) were actually treated and (2) are not selected as treatments might end up as part of the synthetic control. This is problematic - these locations were treated in reality so if they're not part of the treatment subset suggested by GeoLift, they shouldn't be part of the synthetic control either.

There doesn't seem to be a way to exclude certain locations from the control group. The include_markets and exclude_markets both seem to target the treatment group only.

Describe the solution you'd like

We've tried using the include_markets parameter in combination with the N parameter. Our hope was that we would be able to tell GeoLift to consider (as potential treatments) all locations that were actually treated in reality, but force it to provide a subset of the actual treatment group via the N parameter. We were hoping that this, in turn, would make GeoLift not consider the other treatment locations when building the synthetic control.

However, when trying this, GeoLift raises the following error: Error: More forced markets than total test ones. Consider increasing the values of N.

Describe alternatives you've considered

We can't think of any other way of achieving this.

Additional context

I think it should be clear but I'll be happy to provide any further details if you need them.

2 Locations instead of 400 and deterministic error

  • I started setting up a GeoLift analysis for one client and, instead of 372 locations, I have just 2 locations.
  • Then, when I try to find the best markets, this is the error:

> - Deterministic setup with 2 locations in treatment.
> Error in matrix(simulation_results, nrow = length(names(simulation_results))) : 
>   'data' must be of a vector type, was 'NULL'
> In addition: Warning message:
> In mclapply(argsList, FUN, mc.preschedule = preschedule, mc.set.seed = set.seed,  :
>   scheduled cores 1, 2 did not deliver results, all values of the jobs will be affected

[screenshot attached]

Can anybody help me?

Speed up the GeoLiftMarketSelection function

Hi @NicolasMatrices-v2 @ArturoEsquerra, sorry to add to your workload again.

As @sohammarin suggested, I've also run into the problem that the function is relatively slow. I profiled the GeoLiftMarketSelection() function with the profvis package and found that a huge amount of time is spent in the GetCorrel() function. I checked further and found that GetCorrel() does some unnecessary work to calculate the correlation score. It would be much faster to call the cor() function from the base package directly instead of MarketMatching::best_matches(). I think this is the easiest change to make and the most obvious way to improve the computational efficiency.
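To illustrate the suggestion, a minimal sketch of computing the location correlations directly with base R (geo_data stands for the long-format GeoLift panel with location, time, and Y columns; this shows the idea, not a drop-in replacement for GetCorrel()):

# Reshape the panel to one column per location, then let cor() produce the
# full location-by-location correlation matrix in a single call.
wide <- reshape(geo_data[, c("time", "location", "Y")],
                idvar = "time", timevar = "location", direction = "wide")
cor_matrix <- cor(wide[, -1], use = "pairwise.complete.obs")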

Regards,
Yuli

Setting the CPIC Best Practice

Hi guys

I was thinking about what the best practice would be for setting the cpic parameter when you are "launching" a channel for the first time.

I wanted to publicly share my thinking with the GeoLift community and get feedback both from the team and the practitioners.

Context
A company wants to launch the Facebook Ads Channel from scratch with the objective of increasing the number of customers (customer acquisition focus).

Question: what would be the optimal value for cpic?

Answer: CPIC = LTV. You want to keep acquiring customers until the marginal cost of the next customer equals the marginal utility (LTV).

Operational implication: you need to use a bid cap strategy in Facebook Business Manager, optimizing for the Customer Created conversion and setting the cap slightly lower to cover some slippage that may happen in the auctions, especially with low-volume campaigns.

@ArturoEsquerra @NicolasMatrices-v2 @michael-khalil WDYT?

Plot Error

Hi there,
When I try running the plot function to see what the results of the GeoLift() model would look like with the latest possible test period, as well as the test's power curve across all simulations, I get this error.

Can anybody help me?

plot(MarketSelections, market_ID = 3, print_summary = FALSE)
Warning message:
Removed 26 rows containing missing values (geom_smooth).

# of geos in treatment group and effect size

Bug description

  • How do I decide how many geos I should have in my control and treatment groups? If I have 20 regions I could put 2, 10, or 15 regions in the treatment group; how could I decide?

  • How do I decide which effect sizes to simulate? I don't know what the lift of my campaign will be, and I could be wrong to use (0, 0.25, 0.50). (See the sketch below.)
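For reference, the market-selection calls quoted elsewhere in these issues scan a vector of candidate treatment-group sizes (N) and a grid of simulated effect sizes rather than committing to a single guess. A minimal sketch (geo_data is the panel returned by GeoDataRead; values are illustrative):

# Let the power analysis compare designs: several treatment-group sizes and a
# grid of plausible lifts, instead of a single hand-picked value.
markets <- GeoLiftMarketSelection(data = geo_data,
                                  treatment_periods = c(15),
                                  N = c(2, 5, 10),                  # candidate numbers of test geos
                                  effect_size = seq(0, 0.25, 0.05), # 0% to 25% in 5% steps
                                  lookback_window = 1,
                                  alpha = 0.1)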



Test Geos pre selected

We will have a geo-targeted campaign in place in 6 cities. In developing a measurement plan we thought of using GeoLift as a tool. However, since the test markets are already chosen, I have two questions:

  1. Can the test markets be pre-selected and the control selected using GeoLift?
  2. Can there be more than one test market? If yes, how do we operationalize it?

Thanks.

Issue: Selecting Geos

Hi there,
I'm having some issues selecting the number of geos to include in the test group.

This is my script:

resultsNum <- NumberLocations(data = GeoTestData_PreTest,
                              Y_id = "Y",
                              location_id = "location",
                              time_id = "time",
                              n_sim = 500,
                              treatment_periods = 15, # test length will be 15 days; if you are not sure, enter a list of possible values
                              plot = TRUE,
                              power = 0.8,
                              alpha = 0.1,
                              fixed_effects = TRUE,
                              ProgressBar = TRUE)

Each time I try running this script I am faced with this error:

task 1 failed - "Problem while computing `trt_time = replace_na(trt_time, Inf)`.
Caused by error in `stop_vctrs()`:
! Can't convert from `replace` <double> to `data` <integer> due to loss of precision.
• Locations: 1"

I'm not able to understand why it's happening
Can anybody help me?
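This class of error comes from tidyr refusing to fill an integer column with Inf (a double). One possible workaround, under the assumption that trt_time is derived from the integer time column, is to coerce the time index to double before running the power functions (a guess, not a confirmed fix):

# Coerce the time index to double so replace_na(trt_time, Inf) no longer has
# to squeeze Inf into an integer column.
GeoTestData_PreTest$time <- as.numeric(GeoTestData_PreTest$time)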

Error in summary.connection(connection) : invalid connection

Hi, I've been trying to run the GeoLift example code, but I'm having issues (parallel computing) with the following:

resultsSearch <- GeoLiftPower.search(data = GeoTestData_PreTest,
treatment_periods = c(15),
N = c(2,3,4),
horizon = 50,
Y_id = "Y",
location_id = "location",
time_id = "time",
top_results = 20,
alpha = 0.1,
type = "pValue",
fixed_effects = TRUE,
ProgressBar = TRUE)

Results in the following error:
Setting up cluster.
Importing functions into cluster.
Deterministic setup with 2 locations in treatment.
Error in summary.connection(connection) : invalid connection

I have tried it on a number of different machines but get the same error. Any ideas / solutions?

Errors message when running GeoLiftMarketSelection function

I ran exactly the same GeoLiftMarketSelection code from your demo and got the following two error messages.

MarketSelections <- GeoLiftMarketSelection(data = GeoTestData_PreTest,
treatment_periods = c(10,15),
N = c(2,3,4,5),
Y_id = "Y",
location_id = "location",
time_id = "time",
effect_size = seq(0, 0.5, 0.05),
lookback_window = 1,
include_markets = c("chicago"),
exclude_markets = c("honolulu"),
holdout = c(0.5, 1),
cpic = 7.50,
budget = 100000,
alpha = 0.1,
Correlations = TRUE,
fixed_effects = TRUE)

Error (1): when Correlations = TRUE is included, the error message is:

Error in GeoLiftMarketSelection(data = GeoTestData_PreTest, treatment_periods = c(10, :
unused argument (Correlations = TRUE)

Error (2): if Correlations = TRUE is commented out, I get the following error message:

Error in rbind(temp_Markets, BestMarkets_aux[row, ]) :
object 'temp_Markets' not found

Please advise how to fix these errors.
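Error (1) is R's "unused argument" message, which most likely means the installed GeoLift predates the Correlations argument used in the demo. A quick way to check and update (the GitHub install line mirrors the one used in the installation issue further down):

# Check the installed version, then reinstall the current GitHub release.
packageVersion("GeoLift")
remotes::install_github("facebookincubator/GeoLift")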

Thanks,

include/exclude markets issue

I am trying to use the include/exclude markets arguments, but this leads to different errors. Please advise on any solutions or reasons for the issue. The script below shows what I am running through Rscript:

# Exclude markets issue
library(GeoLift)

print(sessionInfo())

# Set up data
df_geo = GeoDataRead(data=GeoLift_PreTest, date_id='date', location_id='location', Y_id = 'Y', X = c(), summary = TRUE, format = "yyyy-mm-dd")

# Optimal control estimate
treatment_periods = c(10, 15)
N_test = c(2)
effect_size = seq(0, 0.05, 0.005)
lookback_window = 1
locations = unique(df_geo$location)
n_locations = length(locations)
include_markets = locations[1]
exclude_markets = locations[11:n_locations]
holdout = c(0.9, 1.0)
cpic = 100
alpha = 0.05
model = "None"
fixed_effects = TRUE
Correlations = TRUE
parallel = TRUE
side_of_test = 'one_sided'
budget = NULL

# Include markets only
sel1a = GeoLiftMarketSelection(data=df_geo, treatment_periods=treatment_periods, N=N_test, effect_size=effect_size, lookback_window=lookback_window, cpic=cpic, alpha=alpha, Correlations=Correlations, fixed_effects=fixed_effects, include_markets=include_markets)
print(t(sel1a$BestMarkets))

# Lower effect size
sel1b = GeoLiftMarketSelection(data=df_geo, treatment_periods=treatment_periods, N=N_test, effect_size=seq(0, 0.01, 0.005), lookback_window=lookback_window, cpic=cpic, alpha=alpha, Correlations=Correlations, fixed_effects=fixed_effects, include_markets=include_markets)
print(t(sel1b$BestMarkets))


# Exclude markets only
sel2 = GeoLiftMarketSelection(data=df_geo, treatment_periods=treatment_periods, N=N_test, effect_size=effect_size, lookback_window=lookback_window, cpic=cpic, alpha=alpha, Correlations=Correlations, fixed_effects=fixed_effects, exclude_markets=exclude_markets)
print(t(sel2$BestMarkets))

This is the output from my console:

R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin21.3.0 (64-bit)
Running under: macOS Monterey 12.2.1

Matrix products: default
BLAS:   /usr/local/Cellar/openblas/0.3.20/lib/libopenblasp-r0.3.20.dylib
LAPACK: /usr/local/Cellar/r/4.1.2_1/lib/R/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
[1] GeoLift_2.3.5

loaded via a namespace (and not attached):
 [1] augsynth_0.2.0         ggrepel_0.9.1          Rcpp_1.0.8.3          
 [4] mvtnorm_1.1-3          lattice_0.20-45        tidyr_1.1.4           
 [7] listenv_0.8.0          zoo_1.8-9              assertthat_0.2.1      
[10] digest_0.6.29          foreach_1.5.2          utf8_1.2.2            
[13] parallelly_1.30.0      lfe_2.8-7.1            R6_2.5.1              
[16] plyr_1.8.6             Boom_0.9.7             ggplot2_3.3.5         
[19] pillar_1.7.0           panelView_1.1.9        rlang_1.0.2           
[22] Matrix_1.3-4           stringr_1.4.0          BoomSpikeSlab_1.2.4   
[25] directlabels_2021.1.13 munsell_0.5.0          proxy_0.4-26          
[28] compiler_4.1.2         pkgconfig_2.0.3        globals_0.14.0        
[31] CausalImpact_1.2.7     tidyselect_1.1.2       gridExtra_2.3         
[34] tibble_3.1.6           quadprog_1.5-8         dtw_1.22-3            
[37] codetools_0.2-18       fansi_1.0.2            future_1.24.0         
[40] crayon_1.5.0           dplyr_1.0.8            MASS_7.3-54           
[43] grid_4.1.2             gsynth_1.2.1           xtable_1.8-4          
[46] gtable_0.3.0           lifecycle_1.0.1        magrittr_2.0.2        
[49] scales_1.1.1           cli_3.2.0              stringi_1.7.6         
[52] renv_0.15.4            reshape2_1.4.4         doRNG_1.8.2           
[55] doParallel_1.0.17      ellipsis_0.3.2         xts_0.12.1            
[58] generics_0.1.2         vctrs_0.3.8            sandwich_3.0-1        
[61] Formula_1.2-4          iterators_1.0.14       tools_4.1.2           
[64] glue_1.6.2             purrr_0.3.4            rngtools_1.5.2        
[67] MarketMatching_1.2.0   abind_1.4-5            parallel_4.1.2        
[70] colorspace_2.0-3       LowRankQP_1.0.4        bsts_0.9.7            
##################################
#####       Summary       #####
##################################

* Raw Number of Locations: 40
* Time Periods: 90
* Final Number of Locations (Complete): 40
Setting up cluster.
Importing functions into cluster.
Attempting to load the environment ‘package:dplyr’

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union


Deterministic setup with 2 locations in treatment.
  ID      location duration EffectSize Power AvgScaledL2Imbalance Investment
1  1 atlanta, reno       15       0.05     1            0.4729477     556510
    AvgATT Average_MDE ProportionTotal_Y abs_lift_in_zero   Holdout rank
1 170.7623   0.0458446         0.0434787            0.004 0.9565213    1
  correlation
1   0.9595179
Warning messages:
1: In max(ifelse(resultsFindAux$EffectSize < 0, resultsFindAux$EffectSize,  :
  no non-missing arguments to max; returning -Inf
2: In min(ifelse(resultsFindAux$EffectSize > 0, resultsFindAux$EffectSize,  :
  no non-missing arguments to min; returning Inf
3: In max(ifelse(resultsFindAux$EffectSize < 0, resultsFindAux$EffectSize,  :
  no non-missing arguments to max; returning -Inf
4: In min(ifelse(resultsFindAux$EffectSize > 0, resultsFindAux$EffectSize,  :
  no non-missing arguments to min; returning Inf
5: In max(ifelse(resultsFindAux$EffectSize < 0, resultsFindAux$EffectSize,  :
  no non-missing arguments to max; returning -Inf
6: In min(ifelse(resultsFindAux$EffectSize > 0, resultsFindAux$EffectSize,  :
  no non-missing arguments to min; returning Inf
                     [,1]           
ID                   "1"            
location             "atlanta, reno"
duration             "15"           
EffectSize           "0.05"         
Power                "1"            
AvgScaledL2Imbalance "0.4729477"    
Investment           "556510"       
AvgATT               "170.7623"     
Average_MDE          "0.0458446"    
ProportionTotal_Y    "0.0434787"    
abs_lift_in_zero     "0.004"        
Holdout              "0.9565213"    
rank                 "1"            
correlation          "0.9595179"    
Setting up cluster.
Importing functions into cluster.

Deterministic setup with 2 locations in treatment.
Error in `dplyr::filter()`:
! Problem while computing `..1 = location %in%
  resultsM$Locs[[row]]`.
Caused by error in `resultsM$Locs[[row]]`:
! subscript out of bounds
Backtrace:
  1. ├─GeoLift::GeoLiftMarketSelection(...)
  2. │ └─... %>% dplyr::summarize(total = sum(Total_Y))
  3. ├─dplyr::summarize(., total = sum(Total_Y))
  4. ├─dplyr::filter(., location %in% resultsM$Locs[[row]])
  5. ├─dplyr:::filter.data.frame(., location %in% resultsM$Locs[[row]])
  6. │ └─dplyr:::filter_rows(.data, ..., caller_env = caller_env())
  7. │   └─dplyr:::filter_eval(dots, mask = mask, error_call = error_call)
  8. │     ├─base::withCallingHandlers(...)
  9. │     └─mask$eval_all_filter(dots, env_filter)
 10. ├─location %in% resultsM$Locs[[row]]
 11. └─base::.handleSimpleError(`<fn>`, "subscript out of bounds", base::quote(resultsM$Locs[[row]]))
 12.   └─dplyr h(simpleError(msg, call))
 13.     └─rlang::abort(bullets, call = error_call, parent = skip_internal_condition(e))
Warning messages:
1: In max(ifelse(resultsFindAux$EffectSize < 0, resultsFindAux$EffectSize,  :
  no non-missing arguments to max; returning -Inf
2: In min(ifelse(resultsFindAux$EffectSize > 0, resultsFindAux$EffectSize,  :
  no non-missing arguments to min; returning Inf
3: In max(ifelse(resultsFindAux$EffectSize < 0, resultsFindAux$EffectSize,  :
  no non-missing arguments to max; returning -Inf
4: In min(ifelse(resultsFindAux$EffectSize > 0, resultsFindAux$EffectSize,  :
  no non-missing arguments to min; returning Inf
5: In max(ifelse(resultsFindAux$EffectSize < 0, resultsFindAux$EffectSize,  :
  no non-missing arguments to max; returning -Inf
6: In min(ifelse(resultsFindAux$EffectSize > 0, resultsFindAux$EffectSize,  :
  no non-missing arguments to min; returning Inf
7: In max(ifelse(resultsFindAux$EffectSize < 0, resultsFindAux$EffectSize,  :
  no non-missing arguments to max; returning -Inf
8: In min(ifelse(resultsFindAux$EffectSize > 0, resultsFindAux$EffectSize,  :
  no non-missing arguments to min; returning Inf
Execution halted

How to decide the CPIC value?

Bug description

I am running the GeoLift model on my data. My initial data inputs are date, location, and sales. How do I know what the CPIC value should be?

Understanding the output

Bug description

I have been playing around with GeoLift. When CPIC is 500 and the budget is 10,000, we get Result A. When CPIC is 500 and the budget is 1, we get Result B. When CPIC is 500 and the budget is 1,000, we get Result C. How do we interpret these results?

Session information

GeoLift Results Summary

##################################

Test Statistics (Result A)

##################################

  • Average ATT: 0.096
  • Percent Lift: 5%
  • Incremental Y: 4
  • P-value: 0.03

##################################

Balance Statistics

##################################

  • L2 Imbalance: 8.589
  • Scaled L2 Imbalance: 0.8488
  • Percent improvement from naive model: 15.12%
  • Average Estimated Bias: NA

##################################

GeoLift Simulation

Simulating: 5% Lift

##################################

GeoLift Results Summary

##################################

Test Statistics Result-B

##################################

  • Average ATT: 0.096
  • Percent Lift: 5%
  • Incremental Y: 4
  • P-value: 0.03

##################################

Balance Statistics

##################################

  • L2 Imbalance: 8.589
  • Scaled L2 Imbalance: 0.8488
  • Percent improvement from naive model: 15.12%
  • Average Estimated Bias: NA

##################################

Test Statistics (Result-C)

##################################

  • Average ATT: 0.077
  • Percent Lift: 6.3%
  • Incremental Y: 2
  • P-value: 0.07

##################################

Balance Statistics

##################################

  • L2 Imbalance: 7.813
  • Scaled L2 Imbalance: 0.8486
  • Percent improvement from naive model: 15.14%
  • Average Estimated Bias: NA


Expected behavior

The Lift is the same when the budget is 10,000 and when it is 1. It changes when the budget is 1,000.


Get confidence intervals

Is your feature request related to a problem? Please describe.

Aside from the p-value, it would be great to also get confidence intervals for the measured lift (for example, upper/lower bounds at 95% confidence) as part of the summary.

Describe alternatives you've considered

I could calculate it manually, but it would be much nicer to just have a function for this. In the meantime, would you happen to know a formula in R that could do the calculation for me?
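In the meantime, a generic normal-approximation interval is easy to compute by hand, assuming a point estimate and a standard error can be extracted from the fitted object (the two input values below are illustrative placeholders, not GeoLift outputs):

att_hat <- 0.096   # point estimate of the lift / ATT
se_hat  <- 0.040   # its standard error
alpha   <- 0.05    # for a 95% interval

ci <- att_hat + c(-1, 1) * qnorm(1 - alpha / 2) * se_hat
ci
#> [1] 0.0176 0.1744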

Investment and Lift is daily or total or per region in MarketSelection

Bug description

When I see that the Investment for market_id 2 equals 10k, does that mean 10k overall, 10k per day, or 10k per region?

And what about the ATT?


Cost per Incremental in a Deprivation Testing

I've already read all your posts about deprivation tests with GeoLift.

What I cannot understand is: how can I analyze the CPIC? If I turn off 3-4 regions or exclude them from my existing campaign, I don't know how much budget I would have spent on those 3-4 regions.

How can I find out what my CPIC would have been?

Other features in GeoLift MarketSelection and ROAS


Expected behavior

1- I've seen normalize, dtw, model, and some other parameters in GeoLiftMarketSelection; how do they work?

2- Can you also explain why I should use CPIC to say "that's the average ROAS I need in this experiment" if I work with sales?
I ask because the formulas are quite different:
The formula for ROAS is Revenue / Ad Spend
The formula for CPIC is Ad Spend / Incremental Conversions
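A small worked example of how the two quantities relate when each incremental conversion is worth a fixed revenue (the numbers are purely illustrative):

spend <- 10000
incremental_conversions <- 400
revenue_per_conversion <- 50      # assumed average value of one conversion

cpic <- spend / incremental_conversions                               # 25
roas <- (incremental_conversions * revenue_per_conversion) / spend    # 2

# The two are reciprocals up to the value of a conversion:
roas == revenue_per_conversion / cpic
#> [1] TRUE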

[screenshot attached]


Some questions regarding function design

Hi @NicolasMatrices-v2 @ArturoEsquerra,

Hope you are well.

Sorry to bother you again. I'm trying to understand how you designed this package. I believe this can help me to have a better understanding of your great work. Currently, I've got several questions, and I would be grateful if you could give me some further details.

  1. I'm trying to understand the argument run_stochastic_process=TRUE in stochastic_market_selector(). I've already read the help doc but it's not very clear for me. Could you please explain its usage a little bit further?

  2. The argument alpha in GeoLiftMarketSelection() is not used in the function body. Do you plan to use it as the threshold for the significance column, or will it not be used in the future?

  3. Currently, I'm trying to understand the argument lookback_window. I know that it's used to change the learning period and test period. However, I'm not sure when we need to use it. It would be great if you could give me some hints on why it was introduced in GeoLiftMarketSelection().

  4. At the start of GeoLiftMarketSelection(), we try to find the best matching markets using MarketMatching::best_matches(). After that, some of the best matching markets are used as the control group. I feel a little confused here. From my intuition, my short working experience, and the MarketMatching package, I assumed the best matching markets would be used as the control group and the test group separately. Would you mind telling me how you think about the selection of the control/test groups?

  5. When I run the demo code, I find that the same control group is generated twice. In my humble opinion, we should run the same number of tests for each unique control group. I was wondering if you could tell me how you decided this.

     [,1]         [,2]     
[1,] "atlanta"    "chicago"
[2,] "chicago"    "atlanta"
[3,] "cincinnati" "chicago"
[4,] "portland"   "chicago"
  6. Lines 183 to 184 in the source code of GeoLiftMarketSelection() show that a row is abandoned if the absolute value of the budget is not greater than the absolute value of the investment. I'm very interested in why absolute values are used here.

I realize I've asked too many questions and appreciate any help you could provide.

I look forward to hearing from you.

Best regards,
Yuli

Build Failure

When I install the GeoLift package in a Docker image, the docker build completes, but when I actually run the container and call my API, GeoLift is not recognized. This isn't happening for other packages installed with remotes/devtools. Any ideas why?

Installation issue on mac

Hi everyone, I tried to install it on macOS Monterey and it returned this error. Please see below:

remotes::install_github("facebookincubator/GeoLift")
Downloading GitHub repo facebookincubator/GeoLift@HEAD
Installing 2 packages: Boom, bsts
trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.1/Boom_0.9.7.tgz'
Content type 'application/x-gzip' length 82018759 bytes (78.2 MB)
============
downloaded 19.0 MB

Error in download.file(url, destfile, method, mode = "wb", ...) : 
  download from 'https://cran.rstudio.com/bin/macosx/contrib/4.1/Boom_0.9.7.tgz' failed
In addition: Warning messages:
1: In download.file(url, destfile, method, mode = "wb", ...) :
  downloaded length 19902342 != reported length 82018759
2: In download.file(url, destfile, method, mode = "wb", ...) :
  URL 'https://cran.rstudio.com/bin/macosx/contrib/4.1/Boom_0.9.7.tgz': Timeout of 60 seconds was reached
Warning in download.packages(pkgs, destdir = tmpd, available = available,  :
  download of package ‘Boom’ failed
trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.1/bsts_0.9.7.tgz'
Content type 'application/x-gzip' length 22736562 bytes (21.7 MB)
==============================================
downloaded 20.3 MB

Error in download.file(url, destfile, method, mode = "wb", ...) : 
  download from 'https://cran.rstudio.com/bin/macosx/contrib/4.1/bsts_0.9.7.tgz' failed
In addition: Warning messages:
1: In download.file(url, destfile, method, mode = "wb", ...) :
  downloaded length 21297490 != reported length 22736562
2: In download.file(url, destfile, method, mode = "wb", ...) :
  URL 'https://cran.rstudio.com/bin/macosx/contrib/4.1/bsts_0.9.7.tgz': Timeout of 60 seconds was reached
Warning in download.packages(pkgs, destdir = tmpd, available = available,  :
  download of package ‘bsts’ failed
Running `R CMD build`...
* checking for file ‘/private/var/folders/93/2fmdks4d0vx3h40by9f3474w0000gn/T/Rtmp6Tx4YW/remotes36e649c6c0ce/facebookincubator-GeoLift-2415d3a/DESCRIPTION’ ... OK
* preparing ‘GeoLift’:
* checking DESCRIPTION meta-information ... OK
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
Removed empty directory ‘GeoLift/vignettes/GeoLift_Walkthrough_files’
* building ‘GeoLift_2.4.23.tar.gz’
* installing *source* package ‘GeoLift’ ...
** using staged installation
** R
** data
*** moving datasets to lazyload DB
** byte-compile and prepare package for lazy loading
Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) : 
  there is no package called ‘Boom’
Calls: <Anonymous> ... loadNamespace -> withRestarts -> withOneRestart -> doWithOneRestart
Execution halted
ERROR: lazy loading failed for package ‘GeoLift’
* removing ‘/Library/Frameworks/R.framework/Versions/4.1/Resources/library/GeoLift’
Warning message:
In i.p(...) :
  installation of package ‘/var/folders/93/2fmdks4d0vx3h40by9f3474w0000gn/T//Rtmp6Tx4YW/file36e6179c3a39/GeoLift_2.4.23.tar.gz’ had non-zero exit status
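The two download failures are 60-second timeouts on the large Boom and bsts binaries. A common workaround, assuming the mirror itself is fine, is to raise R's download timeout and install the heavy dependencies first:

# Raise the download timeout (default 60 s), pre-install the large
# dependencies, then retry the GitHub install.
options(timeout = 600)
install.packages(c("Boom", "bsts"))
remotes::install_github("facebookincubator/GeoLift")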

Lift Understanding for control group

Bug description

The model has suggested that I run ads in 3 cities and expect an incremental Y of 'x' over 15 days. If the results are similar when the ads are run in those cities, what conclusion should I draw? Should I then run the same advertisement in my control group as well, where the advertisements were not initially executed? Does it mean that if I run the same advertisement in the rest of the cities I would get a similar lift?


Investment in function design

Bug description

How do I decide how much to spend on this experiment?
The statistical significance will be different if I spend 1M or 4M, won't it?

MarketSelection tells me the minimum I should spend for that experiment.
But how will the result change if I spend 5k instead of 10k? Will it be statistically significant or not?

For example, I decide to use a 10k budget for the experiment and MarketSelection suggests I spend 1k in those target markets.

What would happen if I spent 5k on the experiment instead?


Errors in NumberLocations & GeoLiftPower - can't convert from double to integer

I followed the walkthrough and got this error below at the step of Power Analysis and Market Selection. Can someone help?

Attempting to load the environment ‘package:tidyr’
Error in { :
task 1 failed - "Problem while computing `trt_time = replace_na(trt_time, Inf)`.
Caused by error in `stop_vctrs()`:
! Can't convert from `replace` <double> to `data` <integer> due to loss of precision.

Feature Request: Control of regions for test vs. control

If I want to exclude some regions from being considered for test membership, but I'm OK with them being in the synthetic control, can you provide a way to tell the system to exclude them from the test cell power and sizing analyses, but not exclude them from the data altogether?

Multiple channels

What if there is more than one marketing channel? Can we use GeoLift to determine the lift for just one channel? Should we use total revenue in that case? How would the activities of the other channels affect the GeoLift results?

Deprivation Testing with GeoLift

Bug description


My goal is to understand what would happen if I pause my Brand campaign on Google


I tried GeoLiftMarketSelection, but it gives an effect size of 0.50 (shouldn't it be negative?).

I also tried GeoLift itself, but I don't know how to input the data in locs, etc.

Can you help me?


Adv Spent in the past

Bug description

Could I provide the amount spent in the past so that this value is taken into consideration for the investment?
And how can I include seasonality and other variables as covariates?

