sditools / adobeanalyticsr

R Client for Adobe Analytics API v2.0

License: Other

R 100.00%

adobeanalyticsr's People

Contributors

ankitnagarsheth joseluisloren jalvarado charlie-gallagher lega0208 kultgenj ryanatsdi elalbaicin

adobeanalyticsr's Issues

aa_freeform_table() is limited to 6 dimensions

There is nothing inherent in the v2 API preventing aa_freeform_table() from recursing through an arbitrary number of dimensions. It would be really nice to have that ability. Even if there is a practical or time limit, having the function coded so that it is fully dynamic would be preferred.

Support metric level segmentation

When comparing metrics across segments it would be good to be able to segment on the metric itself so you can pull different results for each metric.

Paid License Required?

Can you confirm a paid license is required in order to connect to the Adobe Analytics API when setting up the Adobe Console API Project here https://console.adobe.io/integrations?

If so, how do I purchase this subscription, is the subscription tied to my account or maybe the organization I'm working with?

Search Clause Documentation

I really want to make a Christmas joke right now but I'll refrain. Below is documentation from the 2.0 api docs

Using clause Parameters
As noted above, the search parameter also includes the clause option. The clause parameter provides a powerful tool for filtering data. To use it, follow these rules:

It uses boolean operators AND, OR, and NOT.
It uses operators MATCH, CONTAINS, BEGINS-WITH, and ENDS-WITH.
It uses group conditions with parentheses.
Strings are contained in single quotes.
Searches are case-insensitive.
If no operator is specified, a 'contains' match is performed.
Valid operators are 'match' and 'contains'.
Glob expressions are evaluated. If a literal * is needed, use \*.
Example Clause Statements
Only include results that match the string 'home page': MATCH 'home page'
Include pages that do not contain 'home page': NOT CONTAINS 'home page'
Include pages that do not contain 'home page' or 'about us', but do contain 'contact us': (NOT CONTAINS 'home page' OR NOT CONTAINS 'about us') AND (CONTAINS 'contact us')
Include pages that contain 'home page' or start with 'landing': CONTAINS 'home page' OR BEGINS-WITH 'landing'

https://github.com/AdobeDocs/analytics-2.0-apis/blob/master/reporting-guide.md#using-clause-parameters
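
The clause rules quoted above could be wrapped in a few small helpers so that the quoting and operator names are handled in one place. This is only a sketch: clause(), clause_not(), clause_and() and clause_or() are hypothetical names, not functions from adobeanalyticsr, and the helpers do not try to escape single quotes or glob characters inside the value.

```r
# Hypothetical helpers (not part of adobeanalyticsr) for building clause strings.
clause <- function(operator, value) {
  operator <- toupper(operator)
  allowed <- c("MATCH", "CONTAINS", "BEGINS-WITH", "ENDS-WITH")
  stopifnot(operator %in% allowed)
  # Strings are wrapped in single quotes, as the quoted rules require.
  sprintf("%s '%s'", operator, value)
}

# Boolean operators: NOT prefixes a condition, AND groups each side in parentheses.
clause_not <- function(x) paste("NOT", x)
clause_or  <- function(...) paste(c(...), collapse = " OR ")
clause_and <- function(...) paste(sprintf("(%s)", c(...)), collapse = " AND ")

# Rebuild two of the example clause statements from the documentation.
landing <- clause_or(clause("contains", "home page"),
                     clause("begins-with", "landing"))

contact <- clause_and(
  clause_or(clause_not(clause("contains", "home page")),
            clause_not(clause("contains", "about us"))),
  clause("contains", "contact us")
)
```

Building the string from parts keeps the nested single quotes out of the calling code, which is where most of the typing mistakes tend to happen.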

search and segmentId attributes error handling

We need a better way of handling errors for additional, missing, or incorrect search and segmentId values.

Data mismatch between frontend and API return

Code:

df <- aa_freeform_report(company_id = company_id,
                         rsid = rsid_usp,
                         date_range = Total_period,
                         dimensions = c('prop2', 'daterangeday', 'prop1', 'lasttouchchannel'),
                         metrics = 'visitors',
                         segmentId = 's300008117_5dceed9bc945642acd98d070',
                         top = top_values)

[segment is 'group| --- side of brand' in the front end]

This is a hit-level segment, so it should pull in only that particular side-of-brand data, which is what happens when I create a freeform table in the front end with that segment applied. However, the data pull also includes rows where prop2 is Unspecified.

Sample output:

   Prop2        daterangeday  prop1              lasttouchchannel  visitors
1  Unspecified  Oct 5, 2020   Unspecified        None              13
2  Unspecified  Oct 6, 2020   Unspecified        None              49
3  Unspecified  Oct 7, 2020   Unspecified        None              50
4  Unspecified  Oct 8, 2020   Unspecified        None              32
5  Unspecified  Oct 9, 2020   Unspecified        None              32
7  --- (right)  Oct 5, 2020   not brand aligned  None              11
8  --- (right)  Oct 6, 2020   not brand aligned  None              47

Progress notation during call

Use message() instead of print() for progress output so that reports do not need an edit in the knitr process.
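
A minimal sketch of the difference, assuming a hypothetical report_progress() helper (not a function from the package). message() signals a condition and writes to stderr, so a caller can silence it with suppressMessages(), and a knitr chunk can hide it with the message = FALSE chunk option, without touching the function itself.

```r
# Hypothetical progress helper: uses message() rather than print().
report_progress <- function(done, total) {
  message(done, " of ", total, " data requests complete.")
  invisible(done)
}

# Shown on the console (stderr) during interactive use.
report_progress(1, 398)

# Silenced by the caller; nothing is emitted and the value is still returned.
result <- suppressMessages(report_progress(2, 398))
```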

Change default value for include_unspecified in aa_freeform_table() to 'TRUE'

It seems reasonable that users of the API are more likely to want to include Unspecified values in their results, which they can then choose to filter out after the fact.

aw_get_calculatedmetrics() - `favorite` shows `0` instead of `FALSE` for the default value

favorite is a Boolean. And, in general, 0 = FALSE and 1 = TRUE, but the parameter list shows that this is set to 0.

I actually tried setting it to 1 in a call, and the function returned all calculated metrics. I had to set it to TRUE to get that to work.

I think this is as simple as changing the function definition to favorite = FALSE, but I don't know if there are any downstream ramifications of that.

aw_getmetrics() -- change `expansion` to accept a vector

It seems a lot more natural to pass multiple values as a vector rather than as a comma-delimited string.

Current example:

expansion = "tags, categories"

Proposed update:

expansion = c("tags", "categories")

Overall, this seems like a pretty dumb argument. I don't know what use case Adobe was imagining with this. But, I'm seeing this same thing on some other functions and will log separate issues there.

For now, I'm documenting against the current functionality, but, if we implement these enhancements, we'll need to update the documentation accordingly.

List of users

@benrwoodard Are there any plans for enhancements for providing a list of users when they access Adobe Analytics? Or is it not available via Adobe 2.0 API?

aw_token()

@benrwoodard When I try to get an authorization token using the call below, I get the error shown in the screenshot.

aw_token()

Browser url = https://ims-na1.adobelogin.com/ims//authorize/v2?client_id=&scope=openid%2CAdobeID%2Cread_organizations%2Cadditional_info.projectedProductContext%2Cadditional_info.job_function&redirect_uri=https%3A%2F%2Fadobeanalyticsr.com%2Ftoken_result.html&response_type=code

A result of 1 breaks the function on >=2nd query

add "page" attribute to facilitate start at feature in pulling data

By adding page = 2, for example, you get the start-at ability.

aw_anomaly_report() function 403 error

Error in aw_call_data_debug("reports/ranked", body = req_body, company_id = company_id) : 
  Forbidden (HTTP 403).

The API request body:

 {"rsid":"xxxxxx","globalFilters":[{"type":"dateRange","dateRange":"2020-11-01T00:00:00.000/2020-12-01T23:59:59.999"}],"metricContainer":{"metrics":[{"columnId":"0","id":"metrics/visits","filters":["0"]}],"metricFilters":[{"id":"0","type":"dateRange","dateRange":"2020-11-01T00:00:00.000/2020-12-01T23:59:59.999"}]},"dimension":"variables/daterangeday","settings":{"countRepeatInstances":true,"limit":30,"page":0,"dimensionSort":"asc","nonesBehavior":"return-nones","includeAnomalyDetection":true}}

"Simple" search support in aw_freeform_report

Add support for two "simple" use cases for the search argument in aw_freeform_report.

This is based on the premise that a common use case will be simply filtering for a single value.

Currently, to do the simplest of searches requires nesting a single quote within double quotes:

search = "'tablet'"

Add support for passing a simple string to search and then have that string be searched as a CONTAINS on the first dimension in the results:

So, search = "tablet" would actually execute search = "CONTAINS 'tablet'" (the CONTAINS isn't required in the API call, since that's the default).

Also, possibly (?) extend this support to a vector format if the user really wants a simple CONTAINS on multiple dimensions:

search = c("tablet", "search") would actually execute search = c("CONTAINS 'tablet'", "CONTAINS 'search'")
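
One way the proposed behaviour could be sketched: treat any search value without single quotes as a "simple" string and wrap it in a CONTAINS clause, and pass anything that already looks like a clause through unchanged. as_search_clause() is a hypothetical helper name, and using the absence of single quotes as the test for "simple" is an assumption, not something the package does today.

```r
# Hypothetical helper: wrap plain strings in CONTAINS '...', leave clauses alone.
as_search_clause <- function(search) {
  # A value is treated as "simple" when it contains no single quotes.
  simple <- !grepl("'", search, fixed = TRUE)
  ifelse(simple, sprintf("CONTAINS '%s'", search), search)
}

single <- as_search_clause("tablet")
mixed  <- as_search_clause(c("tablet", "MATCH 'home page'"))
```

Because the function is vectorised, the same call covers both the single-value case and the one-entry-per-dimension case.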

`aw_get_calculatedmetrics()` - return comma-separated set of tags if `tags` is used in `expansion`

The tags value for expansion in aw_get_calculatedmetrics() appears to return a list object in the tags column. Since many calculated metrics will have no tags or only a single tag, returning a comma-separated string of the tags seems like it would be cleaner to use. Presumably this column would be returned primarily so that a subsequent call that uses tagnames could then be executed.
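
A rough sketch of the flattening step, assuming the tags column is a list of character vectors (the real API response may instead hold richer tag records, in which case the tag names would need to be extracted first). collapse_tags() is a hypothetical helper.

```r
# Hypothetical helper: turn a list column of tag names into one string per row.
collapse_tags <- function(tags) {
  vapply(tags, function(x) {
    # Metrics without tags become NA instead of an empty string.
    if (length(x) == 0) return(NA_character_)
    paste(x, collapse = ", ")
  }, character(1))
}

# Made-up data standing in for an aw_get_calculatedmetrics() result.
cm <- data.frame(id = c("cm1", "cm2", "cm3"), stringsAsFactors = FALSE)
cm$tags <- list(character(0), "CXO", c("CXO", "mobile"))
cm$tags <- collapse_tags(cm$tags)
```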

0 results getting queried/returned when a daterange dimension is in a non-1st position

This was identified when I was simply exploring the difference in runtimes from ordering my dimensions in a "smart" way versus in an "obvious" (but slower, based on the API workings) way. Both examples were intended to return "complete" data.

This was for a 30-day period.

df_fast <- aw_freeform_report(company_id = company_id,
                              rsid = rsid,
                              date_range = c(start_date, end_date),
                              dimensions = c("mobiledevicetype", "lasttouchchannel", "daterangeday"),
                              metrics = c("visits", "pageviews", "orders", "revenue"),
                              top = c(10, 30, 0))

df_slow <- aw_freeform_report(company_id = company_id,
                              rsid = rsid,
                              date_range = c(start_date, end_date),
                              dimensions = c("daterangeday", "lasttouchchannel", "mobiledevicetype"),
                              metrics = c("visits", "pageviews", "orders", "revenue"),
                              top = c(0, 30, 10))

As expected, putting daterangeday last was much faster (4X).

But, df_fast also had more rows than df_slow. Further investigation turned up that df_fast had a number of rows with "0" values for all of the metrics.

The totals (sum) for each metric were identical across the two data frames.

Below are some of the "all zeros:"

I think what is going on is that, for daterange values, the API ensures that it returns a value for every date increment in the date range. Theoretically, this could occur for a very low-incidence metric even if a daterange dimension is at the top level. But, if a daterange value is farther down—which makes for more efficient querying—it's more likely to occur.

Is this just a lower-priority something to document? (Or maybe Adobe has already documented it?)
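
Until this is documented or handled in the package, the all-zero rows can be dropped after the fact. The sketch below uses a made-up data frame in place of a real df_fast result, and drop_all_zero_rows() is a hypothetical helper; since the dropped rows are all zeros, the metric totals are unchanged.

```r
# Hypothetical helper: keep rows where at least one metric is non-zero.
drop_all_zero_rows <- function(df, metrics) {
  keep <- rowSums(df[, metrics, drop = FALSE] != 0, na.rm = TRUE) > 0
  df[keep, , drop = FALSE]
}

# Made-up stand-in for a freeform result with one zero-filled date row.
df_fast <- data.frame(
  mobiledevicetype = c("Mobile Phone", "Mobile Phone", "Tablet"),
  daterangeday     = c("Oct 5, 2020", "Oct 6, 2020", "Oct 5, 2020"),
  visits           = c(12, 0, 3),
  orders           = c(1, 0, 0),
  stringsAsFactors = FALSE
)

df_clean <- drop_all_zero_rows(df_fast, c("visits", "orders"))
```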

support segment dimension

example

{ "rsid": "xxxxx", "globalFilters": [ { "type": "segment", "segmentId": "s300006681_5fb4322d9bfec132bd9dbd73" }, { "type": "dateRange", "dateRange": "2020-10-01T00:00:00.000/2020-11-17T00:00:00.000" } ], "metricContainer": { "metrics": [ { "columnId": "metrics/orders:::0", "id": "metrics/orders", "filters": [ "STATIC_ROW_COMPONENT_1" ] }, { "columnId": "metrics/orders:::2", "id": "metrics/orders", "filters": [ "STATIC_ROW_COMPONENT_3" ] }, { "columnId": "metrics/orders:::4", "id": "metrics/orders", "filters": [ "STATIC_ROW_COMPONENT_5" ] }, { "columnId": "metrics/orders:::6", "id": "metrics/orders", "filters": [ "STATIC_ROW_COMPONENT_7" ] } ], "metricFilters": [ { "id": "STATIC_ROW_COMPONENT_1", "type": "segment", "segmentId": "s300006681_5fb436e8c89a963fe7b144fc" }, { "id": "STATIC_ROW_COMPONENT_3", "type": "segment", "segmentId": "s300006681_5fb436e88b03436b0b75d7d1" }, { "id": "STATIC_ROW_COMPONENT_5", "type": "segment", "segmentId": "s300006681_5fb436e87d03f65a10e92c56" }, { "id": "STATIC_ROW_COMPONENT_7", "type": "segment", "segmentId": "s300006681_5fb436e81212e663fcb463b3" } ] }, "settings": { "countRepeatInstances": true, "dimensionSort": "asc" }, "statistics": { "functions": [ "col-max", "col-min" ] } }

Add support for dynamically defining segments

Having the ability to fully define a segment within R, rather than needing to define it in AW and then reference it by ID, would be useful.

The use case would be to be able to have a base segment and then swap out multiple "one change" values for it.

For instance, wanting to look at mobile traffic for each channel (in a way that simply using multiple dimensions wouldn't work) and being able to pull a list of channels and then have a function that combines each channel with "Mobile Phone" in a segment.

This would make for code that could be readily repurposed across different companies.

Convert arguments that accept a "comma-separated string" to accept a vector instead.

Currently, the following argument in aw_get_calculatedmetrics() will only include the tags column:

expansion = "tags, modified"

The following, however, will return both tags and modified:

expansion = "tags,modified"

This is because the space after the comma in the first example means that the second value does not get properly passed to the API.

I would actually prefer the following notation be the actual expected/accepted one:

expansion = c("tags", "modified")

But, at a minimum, using the existing notation but accounting for "comma-space" situations would potentially prevent frustration.
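
Both notations could be accepted by normalising the argument before it is sent. The sketch below is an assumption about how that might look, not the package's current behaviour: normalize_expansion() is a hypothetical helper that accepts either a vector or a comma-delimited string, trims stray spaces, and returns the single comma-separated string described above.

```r
# Hypothetical helper: accept a vector or a comma-delimited string.
normalize_expansion <- function(expansion) {
  if (is.null(expansion) || all(is.na(expansion))) return(NA_character_)
  # Split every element on commas, then strip surrounding whitespace.
  parts <- trimws(unlist(strsplit(expansion, ",", fixed = TRUE)))
  paste(parts[nzchar(parts)], collapse = ",")
}

from_string <- normalize_expansion("tags, modified")
from_vector <- normalize_expansion(c("tags", "modified"))
```

Both calls produce the same value, so existing code that passes a string keeps working while the vector form becomes available.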

aw_get_calculatedmetrics() - change arguments that accept multiple values to take a vector

This is the same suggestion as #42, but for the aw_getcalculatedmetrics() function. The following arguments seem like they would be more intuitive if they accepted a string (one value) or a vector (multiple values):

rsids
filterByIds
expansion

aw_freeform_table() - more detail if a request fails partway through

I got the following messages when running an aw_freeform_table() query:

Estimated runtime: 318.4sec./5.31min.
1 of 398 possible data requests complete. Starting the next 397 requests.
Request failed [429]. Retrying in 7 seconds...
A total of 1985 rows have been pulled.

There is no information about the failed request:

What a "429" failure means
Where in the process the failure happened (see #70 — that enhancement would help with this a bit)
Whether the retry was explicitly successful (vs. that query just got skipped?)

At a minimum, something like the following would be reassuring:

Estimated runtime: 318.4sec./5.31min.
1 of 398 possible data requests complete. Starting the next 397 requests.
Request failed [429]. Retrying in 7 seconds...
Retry successful. Continuing with additional requests...
A total of 1985 rows have been pulled.
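
For context, HTTP status 429 means "Too Many Requests", i.e. the API is rate-limiting the client. The sketch below shows one way the retry outcome could be reported explicitly. It does not call the Adobe API: with_retry() and make_flaky() are hypothetical, and the request is simulated by a function that returns a status code.

```r
# Hypothetical retry wrapper: `request` is any function returning a status code.
with_retry <- function(request, times = 3, pause = 0) {
  for (attempt in seq_len(times)) {
    status <- request()
    if (status != 429) {
      if (attempt > 1) {
        message("Retry successful. Continuing with additional requests...")
      }
      return(status)
    }
    if (attempt < times) {
      message("Request failed [429] (too many requests). Retrying in ",
              pause, " seconds...")
      Sys.sleep(pause)
    }
  }
  # Fail loudly instead of silently skipping the request.
  stop("Request still failing with status 429 after ", times, " attempts.")
}

# Simulated request that is rate-limited `failures` times, then succeeds.
make_flaky <- function(failures) {
  calls <- 0
  function() {
    calls <<- calls + 1
    if (calls <= failures) 429 else 200
  }
}

status <- with_retry(make_flaky(1))
```

Stopping with an error after the last attempt answers the third question in the list: a request is either confirmed as retried or the whole run fails visibly.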

aw_get_metrics(...segmentable = FALSE) returns all metrics (segmentable and not)

I don't know if this is an API issue (in which case we can just document it) or whether it's an issue with the function.

segmentable = TRUE returns just the metrics that can be used in segments.
segmentable = FALSE, though, returns all metrics. In theory, it should just return the handful that are not available in segments (though I don't know why anyone would ever want that): bounces, unique visitors, etc.

I had to go digging and then experimenting to even confirm what "segmentable" was. Including this link as the one reference I found: https://experienceleague.adobe.com/docs/analytics/analyze/analysis-workspace/workspace-faq/aw-limitations.html?lang=en#known-limitations-in-analysis-workspace

Also for reference, through experimentation, these are the metrics that I expected segmentable = FALSE to return:

averagepagedepth
averagetimespentonpage
averagetimespentonsite
averagevisitdepth
bouncerate
bounces
entries
exits
firsttouchchannel.5
firsttouchchannel.6
firsttouchchannel.7
firsttouchchannel.8
itemtimespent
mobileviews
occurrences
orderspervisit
pagesnotfound
pageviewspervisit
singlevaluevisits
timespentvisit
timespentvisitor
visitors
visitorsdaily
visitorshourly
visitorsmonthly
visitorsquarterly
visitorsweekly
visitorsyearly

daterangeweek auto top number

Locate the position of daterangeweek in the list of dimensions, check whether its top value is 0, and if so replace it with the number of weeks, using length(seq(date_range[1], date_range[2], by = 'week')).
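
The idea could be sketched as follows. fill_week_top() is a hypothetical helper, and it assumes date_range is a length-2 vector of dates (or ISO date strings). Note that seq(..., by = "week") counts 7-day steps from the start date, so it can differ by one from the number of calendar weeks the range touches.

```r
# Hypothetical helper: replace a 0 `top` for daterangeweek with the week count.
fill_week_top <- function(dimensions, top, date_range) {
  pos <- which(dimensions == "daterangeweek")
  if (length(pos) == 1 && top[pos] == 0) {
    weeks <- seq(as.Date(date_range[1]), as.Date(date_range[2]), by = "week")
    top[pos] <- length(weeks)
  }
  top
}

new_top <- fill_week_top(
  dimensions = c("daterangeweek", "lasttouchchannel"),
  top        = c(0, 10),
  date_range = c("2020-10-01", "2020-10-31")
)
```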

aa_freeform_table fails if include_unspecified=FALSE and unspecified values are returned

The current default for aa_freeform_table() is include_unspecified=FALSE. However, if a call is made where there are Unspecified values returned, the function fails.

aw_freeform_table() - provide periodic query count updates

A current example of the messages when running aw_freeform_table():

Estimated runtime: 318.4sec./5.31min.
1 of 398 possible data requests complete. Starting the next 397 requests.
A total of 1985 rows have been pulled.

So, there's some information provided up front: an estimated runtime and then the fact that the first data request has run. But...then there's nothing.

It would be nice to get periodic updates on progress—not necessarily every request, but, based on the estimated total requests, every request or every 5 requests or every 10 requests. So, in the example above, logic could be that there are an estimated 397 requests, and that's greater than 200, so an "every 50" increment would be used:

Estimated runtime: 318.4sec./5.31min.
1 of 398 possible data requests complete. Starting the next 397 requests.
50 of 398 possible data requests complete. Starting the next 348 requests.
100 of 398 possible data requests complete. Starting the next 298 requests.
150 of 398 possible data requests complete. Starting the next 248 requests.
200 of 398 possible data requests complete. Starting the next 198 requests.
250 of 398 possible data requests complete. Starting the next 148 requests.
300 of 398 possible data requests complete. Starting the next 98 requests.
350 of 398 possible data requests complete. Starting the next 48 requests.
398 of 398 possible data requests complete.
A total of 1985 rows have been pulled.

This would need some additional thought. It also could be, "provide ~15 updates" so the total estimated would be divided by 15 and then that's the increment that would then be used.

Alternatively, a "messages" argument could have options to include these notifications or not.
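
The "provide roughly N updates" variant could be sketched like this. progress_points() is a hypothetical helper that divides the estimated request count by the desired number of updates and returns the request numbers at which to report. The loop body is a placeholder; no API calls are made.

```r
# Hypothetical helper: request numbers at which a progress message is emitted.
progress_points <- function(total, updates = 15) {
  if (total < 1) return(numeric(0))
  step <- max(1, ceiling(total / updates))
  # Always include the final request so completion is reported.
  unique(c(seq(step, total, by = step), total))
}

total  <- 398
points <- progress_points(total, updates = 8)

for (i in seq_len(total)) {
  # ... one data request would run here ...
  if (i %in% points) {
    message(i, " of ", total, " possible data requests complete.")
  }
}
```

With 398 requests and 8 updates the step works out to 50, which reproduces the "every 50" increments in the example above.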

finalmnames error message without metric filters aw_workspace_report() function

When attempting to use the aw_workspace_report() function on a JSON file without 'metric filters' one receives the following error 'object 'finalmnames' not found'.

From a prioritization perspective this is likely low value as one could use aw_freeform_table() to successfully fetch the 'same' request.

Image of error message below. The text that is within the sample json file is within the text file below (can't upload json file here).

sampleRequest.txt

Include support for custom metrics

'metrics/' is not included when custom metrics are used in query

{"columnId": "5", "id": "metrics/event22" },
{"columnId": "6", "id": "cm300006896_5fac6262d1a4a8555835dc5c"}
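
A small sketch of how the prefix could be applied conditionally. It assumes, based only on the example above, that these metric ids start with "cm" followed by digits and an underscore; as_metric_id() is a hypothetical helper, not a package function.

```r
# Hypothetical helper: add "metrics/" unless the id is already complete.
as_metric_id <- function(metric) {
  # Assumption: ids such as cm300006896_... are used without the prefix.
  keep_as_is <- grepl("^cm[0-9]+_", metric) | startsWith(metric, "metrics/")
  ifelse(keep_as_is, metric, paste0("metrics/", metric))
}

ids <- as_metric_id(c("event22",
                      "cm300006896_5fac6262d1a4a8555835dc5c",
                      "metrics/visits"))
```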

aw_freeform_table() documentation suggestion - include details on where to get ids

In the documentation for aw_freeform_table(), for the arguments dimensions, metrics, segmentID - it may be helpful to specify where they can find the API field names which need to be used in the arguments. This may be especially needed for segmentIDs.

pretty name for dimensions and metrics

Using the id and title/name, create a list of pretty names for each dimension and metric in the call and then return the names with names(dat) <- prettynames.
Theoretically, this would add a few seconds to the function time, but it would be an additional argument in the function so it could be turned on and off.
Call metrics
Call dimensions
Trim to id/name
join with expected final names and produce final pretty name list to use on final output if TRUE
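
The join step could be sketched as below. The lookup table here is made up; in practice it would be trimmed from the metrics and dimensions calls, and the column names id and name are assumptions about that output. pretty_names() is a hypothetical helper that falls back to the original id when no match is found.

```r
# Hypothetical helper: map column ids to display names via a lookup table.
pretty_names <- function(ids, lookup) {
  nm <- lookup$name[match(ids, lookup$id)]
  # Keep the original id when there is no pretty name for it.
  ifelse(is.na(nm), ids, nm)
}

# Made-up lookup standing in for the trimmed metrics/dimensions results.
lookup <- data.frame(
  id   = c("daterangeday", "lasttouchchannel", "visits"),
  name = c("Day", "Last Touch Channel", "Visits"),
  stringsAsFactors = FALSE
)

dat <- data.frame(daterangeday = "Oct 5, 2020", visits = 13, prop1 = "x",
                  stringsAsFactors = FALSE)
names(dat) <- pretty_names(names(dat), lookup)
```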

aw_freeform_table(): sort by date if a daterange dimension is the first dimension

It's, technically, a ranked report if the dimension is "daterangeday" and the metric is, say, "visits," but the expectation (and behavior inside AW) is that the data will be sorted by date.

Currently, the above comes back sorted by the first metric which is confusing.

aw_get_calculatedmetric_byid() - cannot get this to work at all

May not be a bug, but I just can't get this to work.

The code:

aw_get_calculatedmetric_byid(company_id = company_id,
                             id = "cm300008117_5eb9b63009282640d73ca30b")

The id I pulled straight from an aw_get_calculatedmetrics() call. I tried several different values, both calculated metrics I owned and ones that I didn't, and I got a NULL result in all cases.

aw_get_calculatedmetrics() - `ownerId` causing errors

I can't get ownerId to work.

It's not clear from the documentation as to whether this should be using the owner.id or the owner.login value.

As an example, neither of these works:

Using the owner.id value

aw_get_calculatedmetrics(rsid = rsid,
                         company_id = company_id,
                         ownerId = "[added owner.id value from an unfiltered call]")

Using the owner.login value

aw_get_calculatedmetrics(rsid = rsid,
                         company_id = company_id,
                         ownerId = "[added the owner.login value--email address--from a call w/ ownerFullName expansion]")

Both return:

Error in aw_call_api(req_path = urlstructure[1], company_id = company_id) : 
  Forbidden (HTTP 403).

The fact that I was filtering on values that I got from a less filtered version of the same function means I do have access to that data.

get_users question

@benrwoodard Is it possible to get how many times the user logged in for the last 2 months' time frame? Does Adobe Analytics API 2.0 have that available?

aa_freeform_report() Segment global filter option

Add the segmentId filter to the function, in the globalFilters section of the request.

Consider converting "date" to a date data type when returned in a query

daterangeday comes back as a string from aa_freeform_report(). It would be nice if that got converted to be a date when it existed.
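
A possible conversion, assuming the strings look like "Oct 5, 2020" as in the sample output reported in another issue. parse_aa_day() is a hypothetical helper. It matches the month against R's built-in month.abb constant instead of relying on the %b format, which depends on the session locale.

```r
# Hypothetical helper: convert strings such as "Oct 5, 2020" to Date.
parse_aa_day <- function(x) {
  m <- regmatches(x, regexec("^([A-Za-z]{3}) ([0-9]{1,2}), ([0-9]{4})$", x))
  iso <- vapply(m, function(p) {
    # p holds the full match plus month, day and year when the pattern matched.
    if (length(p) != 4) return(NA_character_)
    sprintf("%s-%02d-%02d", p[4], match(p[2], month.abb), as.integer(p[3]))
  }, character(1))
  # An explicit format makes unparseable values come back as NA.
  as.Date(iso, format = "%Y-%m-%d")
}

days <- parse_aa_day(c("Oct 5, 2020", "Oct 6, 2020"))
```

With a real Date column, sorting by date and filtering by date range work without further string handling.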

Should the auth token be called `aw.oauth`?

Since aa_token() was renamed to aw_token(), should the resulting token file be renamed from aa.oauth to aw.oauth?

`expansion` returning odd/incomplete results for `aw_get_calculatedmetrics()`

I made the following call:

aw_get_calculatedmetrics(rsid = rsid,
                         company_id = company_id)

I compared that to the results returned from this:

aw_get_calculatedmetrics(rsid = rsid,
                         company_id = company_id,
                         expansion = "reportSuiteName, ownerFullName, modified, tags, definition, compatability, categories")

There were only two additional columns added:

reportSuiteName -- that seems right/expected
siteTitle -- that...wasn't in the list of values

Any idea what might be going on here?

change function names to aw_verbelement

It's a minor thing, but as you have been adding recommendations I've seen you use a naming convention that makes good sense: use 'aw_' to prevent conflicts with other functions, then concatenate the action/verb and the name of the element. Example:
aw_get_calculatedmetrics to aw_getcalculatedmetrics
aw_get_metrics to aw_getmetrics
aw_get_dimensions to aw_getdimensions
...

Simplify where simplification will not remove recognition:
aw_freeform_report to aw_freeform
aw_anomaly_report to aw_anomaly

*where the expected result is a table of data, report is not needed.

What do you think?

aw_freeform_table() - provide actual runtime when query is complete

Current example of messages when running aw_freeform_table():

Estimated runtime: 318.4sec./5.31min.
1 of 398 possible data requests complete. Starting the next 397 requests.
A total of 1985 rows have been pulled.

Since queries will often get run multiple times with no or minimal adjustments, it would be nice to have one more line that reports how long the actual runtime was:

Estimated runtime: 318.4sec./5.31min.
1 of 398 possible data requests complete. Starting the next 397 requests.
A total of 1985 rows have been pulled.
Actual runtime: 284.3sec./4.74min.
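
A minimal sketch of how the actual runtime could be measured and reported in the same style as the estimate. format_runtime() and timed() are hypothetical helpers, and the timed expression here is a placeholder rather than a real query.

```r
# Hypothetical helper: format seconds as "<sec>sec./<min>min.".
format_runtime <- function(seconds) {
  sprintf("%.1fsec./%.2fmin.", seconds, seconds / 60)
}

# Hypothetical wrapper: evaluate an expression and report how long it took.
timed <- function(expr) {
  start <- Sys.time()
  result <- expr  # lazy evaluation: the expression is evaluated here
  elapsed <- as.numeric(difftime(Sys.time(), start, units = "secs"))
  message("Actual runtime: ", format_runtime(elapsed))
  result
}

# Placeholder workload standing in for a freeform query.
total <- timed(sum(seq_len(1000)))
```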

206 error handling for metric name issues

Authorization Process Issue

When a user attempts to re-authenticate the OAuth token, the user receives a 401 error. The current workaround is to attempt a minor API call such as get_me() and, after getting the 401 error, run the aa_token() function again. We need to shore up the authentication process.

Segment Comparison Venn Diagram QV

This would be fairly straightforward as it would be limited to an expected 2 dimensions and return results in a single line of data that could be turned into a venn diagram of some kind along with the data for review.

Sample request:
{
  "rsid": "ageo1xxpnwsdi2020prod",
  "globalFilters": [
    {
      "type": "dateRange",
      "dateRange": "2020-10-26T00:00:00.000/2020-11-25T00:00:00.000"
    }
  ],
  "metricContainer": {
    "metrics": [
      {
        "columnId": "metrics/visitors:::0",
        "id": "metrics/visitors",
        "filters": ["0"]
      },
      {
        "columnId": "metrics/visitors:::1",
        "id": "metrics/visitors",
        "filters": ["1"]
      },
      {
        "columnId": "metrics/visitors:::2",
        "id": "metrics/visitors",
        "filters": ["2", "3"]
      }
    ],
    "metricFilters": [
      {
        "id": "0",
        "type": "segment",
        "segmentId": "Mobile_Hits"
      },
      {
        "id": "1",
        "type": "segment",
        "segmentId": "First_Time_Visits"
      },
      {
        "id": "2",
        "type": "segment",
        "segmentId": "Mobile_Hits"
      },
      {
        "id": "3",
        "type": "segment",
        "segmentId": "First_Time_Visits"
      }
    ]
  },
  "settings": {
    "countRepeatInstances": true
  },
  "statistics": {
    "functions": ["col-max", "col-min"]
  }
}

`aw_get_calculatedmetrics()` - `tagnames` argument does not appear to be working.

I tried this on two separate companies. One only had one calculated metric with any tags, and it had two tags. The other had multiple metrics with a single tag.

The following is an example from the latter—GSK:

aw_get_calculatedmetrics(rsid = rsid,
                         company_id = company_id,
                         tagnames = "CXO")

In all cases, the tagnames filter had no impact on the output—the results were simply all calculated metrics.

expansion default to NA on all api calls

Change to default to NA to align with Adobe recommendations

Time handling in the freeform_report() function

Scenarios:

A user includes 'daterangeday' in a list of 3 dimensions and then adds 'top = c(2, 5)' for the limits, expecting the data granularity to be calculated.
A user includes a "timegranularity" attribute and expects all dimensions to be broken down by that type of granularity.
A user adds 'daterange...' anywhere in the dimension list and expects the breakdown to return all the corresponding rows.