sditools / adobeanalyticsr
R Client for Adobe Analytics API v2.0
License: Other
There is nothing inherent in the v2 API preventing aa_freeform_table() from recursing through an arbitrary number of dimensions. It would be really nice to have that ability. Even if there is a practical / time limit, simply having the function coded so that it is fully dynamic would be preferred.
Can you confirm whether a paid license is required in order to connect to the Adobe Analytics API when setting up the Adobe Console API Project here: https://console.adobe.io/integrations?
If so, how do I purchase this subscription? Is the subscription tied to my account, or to the organization I'm working with?
I really want to make a Christmas joke right now but I'll refrain. Below is documentation from the 2.0 api docs
Using clause Parameters
As noted above, the search parameter also includes the clause option. The clause parameter provides a powerful tool for filtering data. To use it, follow these rules:
It uses boolean operators AND, OR, and NOT.
It uses operators MATCH, CONTAINS, BEGINS-WITH, and ENDS-WITH.
It uses group conditions with parenthesis.
Strings are contained in single quotes.
Searches are case-insensitive.
If no operator is specified, a 'contains' match is performed.
Valid operators are 'match' and 'contains'.
Glob expressions are evaluated. If a literal * is needed, use \*.
Example Clause Statements
Only include results that match the string 'home page': MATCH 'home page'
Include pages that do not contain 'home page': NOT CONTAINS 'home page'
Include pages that do not contain 'home page' or 'about us', but do contain 'contact us': (NOT CONTAINS 'home page' OR NOT CONTAINS 'about us') AND (CONTAINS 'contact us')
Include pages that contain 'home page' or start with 'landing': CONTAINS 'home page' OR BEGINS-WITH 'landing'
We need a better way of handling errors for additional, missing, or incorrect search and segmentId values.
Code:
df <- aa_freeform_report(company_id = company_id,
                         rsid = rsid_usp,
                         date_range = Total_period,
                         dimensions = c('prop2', 'daterangeday', 'prop1', 'lasttouchchannel'),
                         metrics = 'visitors',
                         segmentId = 's300008117_5dceed9bc945642acd98d070',
                         top = top_values)
[segment is 'group| --- side of brand' in the front end]
This is a hit-level segment, so it should pull in only that particular side-of-brand data, which is what happens when I create a freeform table in the front end with that segment applied. However, the data pull also includes rows where prop2 = Unspecified.
Sample output:
  Prop2        daterangeday  prop1              lasttouchchannel  visitors
1 Unspecified  Oct 5, 2020   Unspecified        None                    13
2 Unspecified  Oct 6, 2020   Unspecified        None                    49
3 Unspecified  Oct 7, 2020   Unspecified        None                    50
4 Unspecified  Oct 8, 2020   Unspecified        None                    32
5 Unspecified  Oct 9, 2020   Unspecified        None                    32
7 --- (right)  Oct 5, 2020   not brand aligned  None                    11
8 --- (right)  Oct 6, 2020   not brand aligned  None                    47
Use a message instead of a print item, so the reports do not need an edit in the knitr process.
It seems reasonable that users of the API are more likely to want to include Unspecified values in their results, which they can then choose to filter out after the fact.
favorite is a Boolean. And, in general, 0 = FALSE and 1 = TRUE, but the parameter list shows that this is set to 0.
I actually tried setting it to 1 in a call, and the function returned all calculated metrics. I had to set it to TRUE to get that to work.
I think this is as simple as changing the function definition to be favorite = FALSE, but I don't know if there are any downstream ramifications of that.
It seems a lot more natural to pass multiple values as a vector rather than as a comma-delimited string.
Current example:
expansion = "tags, categories"
Proposed update:
expansion = c("tags", "categories")
Overall, this seems like a pretty dumb argument. I don't know what use case Adobe was imagining with this. But, I'm seeing this same thing on some other functions and will log separate issues there.
For now, I'm documenting against the current functionality, but, if we implement these enhancements, we'll need to update the documentation accordingly.
@benrwoodard Are there any plans for enhancements for providing a list of users when they access Adobe Analytics? Or is it not available via Adobe 2.0 API?
@benrwoodard When I try to get an authorization token using the call below, I get the error shown in the screenshot:
aw_token()
By adding page 2, you can get the start-at ability.
Error in aw_call_data_debug("reports/ranked", body = req_body, company_id = company_id) :
Forbidden (HTTP 403).
The api url:
{"rsid":"xxxxxx","globalFilters":[{"type":"dateRange","dateRange":"2020-11-01T00:00:00.000/2020-12-01T23:59:59.999"}],"metricContainer":{"metrics":[{"columnId":"0","id":"metrics/visits","filters":["0"]}],"metricFilters":[{"id":"0","type":"dateRange","dateRange":"2020-11-01T00:00:00.000/2020-12-01T23:59:59.999"}]},"dimension":"variables/daterangeday","settings":{"countRepeatInstances":true,"limit":30,"page":0,"dimensionSort":"asc","nonesBehavior":"return-nones","includeAnomalyDetection":true}}
Add support for two "simple" use cases for the search argument in aw_freeform_report.
This is based on the premise that a common use case will be simply filtering for a single value.
Currently, to do the simplest of searches requires nesting a single quote within double quotes:
search = "'tablet'"
Add support for passing a simple string to search and then have that string be searched as a CONTAINS on the first dimension in the results:
So, search = "tablet" would actually execute search = "CONTAINS 'tablet'" (the CONTAINS isn't required in the API call, since that's the default).
Also, possibly (?) extend this support to a vector format if the user really wants a simple CONTAINS on multiple dimensions:
search = c("tablet", "recliner") would actually execute search = c("CONTAINS 'tablet'", "CONTAINS 'recliner'")
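One way the proposed behavior could be sketched in R is below. This is only an illustration: normalize_search() is a hypothetical helper name, not something in the package, and it assumes that any value containing a single quote is already a full clause and should be left alone.

```r
# Hypothetical helper (not part of the package): wrap bare search strings
# in an explicit CONTAINS clause, one element per dimension.
normalize_search <- function(search) {
  # Assumption: clause strings always quote their values with single quotes,
  # so anything containing a single quote is passed through untouched.
  # (A bare value with an apostrophe would therefore not be wrapped.)
  is_clause <- grepl("'", search, fixed = TRUE)
  ifelse(is_clause, search, paste0("CONTAINS '", search, "'"))
}

normalize_search("tablet")
normalize_search(c("tablet", "MATCH 'home page'"))
```

Existing calls such as search = "'tablet'" or search = "MATCH 'home page'" would keep working unchanged under this approach.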
The tags argument for expansion in aw_get_calculatedmetrics() appears to return a list object in the tags column. Since many calculated metrics will have no tags or only a single tag, returning a comma-separated string with the tags seems like it would be cleaner to use, since this would, presumably, be returned primarily so that a subsequent call that uses tagnames would then be executed.
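A rough sketch of that flattening, in base R. The helper name collapse_tags() is made up for illustration, and it assumes each element of the list column is a character vector of tag names (the real column may be structured differently):

```r
# Hypothetical helper: collapse a list column of tag names into one
# comma-separated string per row. Rows with no tags become "".
collapse_tags <- function(tags) {
  vapply(tags, function(x) paste(unlist(x), collapse = ", "), character(1))
}

# Illustrative input: no tags, one tag, two tags
tags <- list(character(0), "CXO", c("CXO", "exec"))
collapse_tags(tags)
```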
This was identified when I was simply exploring the difference in runtimes from ordering my dimensions in a "smart" way versus in an "obvious" (but slower, based on the API workings) way. Both examples were intended to return "complete" data.
This was for a 30-day period.
df_fast <- aw_freeform_report(company_id = company_id,
                              rsid = rsid,
                              date_range = c(start_date, end_date),
                              dimensions = c("mobiledevicetype", "lasttouchchannel", "daterangeday"),
                              metrics = c("visits", "pageviews", "orders", "revenue"),
                              top = c(10, 30, 0))
df_slow <- aw_freeform_report(company_id = company_id,
                              rsid = rsid,
                              date_range = c(start_date, end_date),
                              dimensions = c("daterangeday", "lasttouchchannel", "mobiledevicetype"),
                              metrics = c("visits", "pageviews", "orders", "revenue"),
                              top = c(0, 30, 10))
As expected, putting daterangeday last was much faster (4X).
But, df_fast also had more rows than df_slow. Further investigation turned up that df_fast had a number of rows with "0" values for all of the metrics.
The totals (sum) for each metric were identical across the two data frames.
Below are some of the "all zeros:"
I think what is going on is that, for daterange values, the API ensures that it returns a value for every date increment in the date range. Theoretically, this could occur for a very low-incidence metric even if a daterange dimension is at the top level. But, if a daterange value is farther down (which makes for more efficient querying), it's more likely to occur.
Is this just a lower-priority something to document? (Or maybe Adobe has already documented it?)
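Until this is documented or handled, the all-zero rows could be dropped after the fact. A minimal base R sketch follows; drop_all_zero_rows() is a hypothetical helper and the data frame is made-up illustration data, not real output:

```r
# Hypothetical helper: keep only rows where at least one metric is non-zero.
drop_all_zero_rows <- function(df, metrics) {
  keep <- rowSums(df[, metrics, drop = FALSE] != 0, na.rm = TRUE) > 0
  df[keep, , drop = FALSE]
}

# Made-up data with one "all zeros" row (row 2)
df <- data.frame(
  mobiledevicetype = c("Tablet", "Tablet", "Other"),
  daterangeday     = c("Oct 5, 2020", "Oct 6, 2020", "Oct 5, 2020"),
  visits           = c(12, 0, 0),
  pageviews        = c(30, 0, 4)
)
res <- drop_all_zero_rows(df, c("visits", "pageviews"))
```

Because only rows with zeros in every metric are removed, the metric totals are unchanged, which matches the observation that the sums were identical across the two data frames.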
example
{ "rsid": "xxxxx", "globalFilters": [ { "type": "segment", "segmentId": "s300006681_5fb4322d9bfec132bd9dbd73" }, { "type": "dateRange", "dateRange": "2020-10-01T00:00:00.000/2020-11-17T00:00:00.000" } ], "metricContainer": { "metrics": [ { "columnId": "metrics/orders:::0", "id": "metrics/orders", "filters": [ "STATIC_ROW_COMPONENT_1" ] }, { "columnId": "metrics/orders:::2", "id": "metrics/orders", "filters": [ "STATIC_ROW_COMPONENT_3" ] }, { "columnId": "metrics/orders:::4", "id": "metrics/orders", "filters": [ "STATIC_ROW_COMPONENT_5" ] }, { "columnId": "metrics/orders:::6", "id": "metrics/orders", "filters": [ "STATIC_ROW_COMPONENT_7" ] } ], "metricFilters": [ { "id": "STATIC_ROW_COMPONENT_1", "type": "segment", "segmentId": "s300006681_5fb436e8c89a963fe7b144fc" }, { "id": "STATIC_ROW_COMPONENT_3", "type": "segment", "segmentId": "s300006681_5fb436e88b03436b0b75d7d1" }, { "id": "STATIC_ROW_COMPONENT_5", "type": "segment", "segmentId": "s300006681_5fb436e87d03f65a10e92c56" }, { "id": "STATIC_ROW_COMPONENT_7", "type": "segment", "segmentId": "s300006681_5fb436e81212e663fcb463b3" } ] }, "settings": { "countRepeatInstances": true, "dimensionSort": "asc" }, "statistics": { "functions": [ "col-max", "col-min" ] } }
Having the ability to fully define a segment within R (rather than needing to define it in AW and then reference it by ID) would be useful.
The use case would be to have a base segment and then swap out multiple "one change" values for it.
For instance, wanting to look at mobile traffic for each channel (in a way that simply using multiple dimensions wouldn't work) and being able to pull a list of channels and then have a function that combines each channel with "Mobile Phone" in a segment.
This would make for code that could be readily repurposed across different companies.
Currently, the following argument in aw_get_calculatedmetrics() will only include the tags column:
expansion = "tags, modified"
The following, however, will return both tags and modified:
expansion = "tags,modified"
This is because the space after the comma in the first example means that the second value does not get properly passed to the API.
I would actually prefer the following notation be the actual expected/accepted one:
expansion = c("tags", "modified")
But, at a minimum, using the existing notation but accounting for "comma-space" situations would potentially prevent frustration.
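A small sketch of how both notations could be accepted and the "comma-space" case handled. normalize_expansion() is a hypothetical helper name; the only assumption about the API is the one stated above, that it wants a comma-separated string with no spaces:

```r
# Hypothetical helper: accept either "tags, modified" or c("tags", "modified")
# and return a single comma-separated string with no surrounding whitespace.
normalize_expansion <- function(expansion) {
  parts <- trimws(unlist(strsplit(expansion, ",", fixed = TRUE)))
  paste(parts[nzchar(parts)], collapse = ",")
}

normalize_expansion("tags, modified")
normalize_expansion(c("tags", "modified"))
```

The same helper would presumably work for the other arguments that take multiple values.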
This is the same suggestion as #42, but for the aw_getcalculatedmetrics() function. The following arguments seem like they would be more intuitive if they accepted a string (one value) or a vector (multiple values):
rsids
filterByIds
expansion
I got the following messages when running an aw_freeform_table() query:
Estimated runtime: 318.4sec./5.31min.
1 of 398 possible data requests complete. Starting the next 397 requests.
Request failed [429]. Retrying in 7 seconds...
A total of 1985 rows have been pulled.
There is no information about the failed request:
At a minimum, something like the following would be reassuring:
Estimated runtime: 318.4sec./5.31min.
1 of 398 possible data requests complete. Starting the next 397 requests.
Request failed [429]. Retrying in 7 seconds...
Retry successful. Continuing with additional requests...
A total of 1985 rows have been pulled.
I don't know if this is an API issue (in which case we can just document it) or whether it's an issue with the function.
segmentable = TRUE returns just the metrics that can be used in segments.
segmentable = FALSE, though, returns all metrics. In theory, it should just return the handful that are not available in segments (though I don't know why anyone would ever want that): bounces, unique visitors, etc.
I had to go digging and then experimenting to even confirm what "segmentable" was. Including this link as the one reference I found: https://experienceleague.adobe.com/docs/analytics/analyze/analysis-workspace/workspace-faq/aw-limitations.html?lang=en#known-limitations-in-analysis-workspace
Also for reference, through experimentation, these are the metrics that I expected segmentable = FALSE to return:
Locate the position of daterangeweek in the list of dimensions, check whether its value is 0, and if so replace it with the number of weeks using length(seq(date_range, by = 'week')).
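A sketch of that logic, with some assumptions spelled out: the "0" is taken to be the entry in the top vector at the same position as daterangeweek, date_range is taken to be a length-2 vector of start and end dates, and both function names are hypothetical. Note that seq() needs an explicit start and end, and that this counts 7-day steps from the start date rather than calendar weeks:

```r
# Hypothetical helper: number of 7-day steps from start to end of date_range.
count_weeks <- function(date_range) {
  length(seq(as.Date(date_range[1]), as.Date(date_range[2]), by = "week"))
}

# Hypothetical helper: if daterangeweek is present and its top value is 0,
# replace that 0 with the number of weeks in the date range.
fill_week_top <- function(dimensions, top, date_range) {
  pos <- match("daterangeweek", dimensions)
  if (!is.na(pos) && top[pos] == 0) {
    top[pos] <- count_weeks(date_range)
  }
  top
}

fill_week_top(c("mobiledevicetype", "daterangeweek"),
              top = c(10, 0),
              date_range = c("2020-10-01", "2020-11-10"))
```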
The current default for aa_freeform_table() is include_unspecified=FALSE. However, if a call is made where there are Unspecified values returned, the function fails.
A current example of the messages when running aw_freeform_table():
Estimated runtime: 318.4sec./5.31min.
1 of 398 possible data requests complete. Starting the next 397 requests.
A total of 1985 rows have been pulled.
So, there's some information provided up front: an estimated runtime and then the fact that the first data request has run. But...then there's nothing.
It would be nice to get periodic updates on progress—not necessarily every request, but, based on the estimated total requests, every request or every 5 requests or every 10 requests. So, in the example above, logic could be that there are an estimated 397 requests, and that's greater than 200, so an "every 50" increment would be used:
Estimated runtime: 318.4sec./5.31min.
1 of 398 possible data requests complete. Starting the next 397 requests.
50 of 398 possible data requests complete. Starting the next 348 requests.
100 of 398 possible data requests complete. Starting the next 298 requests.
150 of 398 possible data requests complete. Starting the next 248 requests.
200 of 398 possible data requests complete. Starting the next 198 requests.
250 of 398 possible data requests complete. Starting the next 148 requests.
300 of 398 possible data requests complete. Starting the next 98 requests.
350 of 398 possible data requests complete. Starting the next 48 requests.
398 of 398 possible data requests complete.
A total of 1985 rows have been pulled.
This would need some additional thought. It also could be, "provide ~15 updates" so the total estimated would be divided by 15 and then that's the increment that would then be used.
Alternatively, a "messages" argument could have options to include these notifications or not.
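As a starting point for that additional thought, the "provide ~15 updates" variant might look something like the sketch below. progress_checkpoints() and the n_updates argument are hypothetical names, and the loop only stands in for the real data requests:

```r
# Hypothetical helper: request counts at which a progress message is emitted,
# aiming for roughly n_updates messages and always including the final request.
progress_checkpoints <- function(total_requests, n_updates = 15) {
  increment <- max(1, ceiling(total_requests / n_updates))
  unique(c(seq(increment, total_requests, by = increment), total_requests))
}

total_requests <- 398
checkpoints <- progress_checkpoints(total_requests)

for (i in seq_len(total_requests)) {
  # ... one data request would run here ...
  if (i %in% checkpoints) {
    message(i, " of ", total_requests, " possible data requests complete.")
  }
}
```

Using message() here also means the output can be suppressed with suppressMessages() or in knitr, which would cover the "messages" on/off idea without a new argument.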
When attempting to use the aw_workspace_report() function on a JSON file without 'metric filters', one receives the following error: 'object 'finalmnames' not found'.
From a prioritization perspective this is likely low value, as one could use aw_freeform_table() to successfully fetch the 'same' request.
Image of error message below. The text that is within the sample json file is within the text file below (can't upload json file here).
'metrics/' is not included when custom metrics are used in query
{"columnId": "5", "id": "metrics/event22" },
{"columnId": "6", "id": "cm300006896_5fac6262d1a4a8555835dc5c"}
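If it helps, one way to build the id field so that only standard metrics get the 'metrics/' prefix is sketched below. as_metric_id() is a hypothetical helper, and the pattern for spotting calculated metrics (ids starting with "cm", digits, then an underscore) is an assumption based only on the ids shown in this thread:

```r
# Hypothetical helper: prefix standard metrics with "metrics/" and leave
# calculated metric ids (assumed to look like "cm<digits>_<hash>") as-is.
as_metric_id <- function(metric) {
  is_calculated <- grepl("^cm[0-9]+_", metric)
  ifelse(is_calculated, metric, paste0("metrics/", metric))
}

as_metric_id(c("event22", "cm300006896_5fac6262d1a4a8555835dc5c"))
```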
In the documentation for aw_freeform_table(), for the arguments dimensions, metrics, segmentID - it may be helpful to specify where they can find the API field names which need to be used in the arguments. This may be especially needed for segmentIDs.
Using the id and title/name, create a list of pretty names for each dimension and metric in the call and then return back the names by names(dat) <- prettynames
Theoretically, this would add a few seconds to the function time, but it would be an additional attribute in the function so it could be turned on and off.
Call metrics
Call dimensions
Trim to id/name
join with expected final names and produce final pretty name list to use on final output if TRUE
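The last two steps could be sketched roughly as follows. The lookup table stands in for the trimmed id/name results of the metrics and dimensions calls; its contents, and the helper name apply_pretty_names(), are made up for illustration:

```r
# Hypothetical helper: rename columns using an id -> name lookup,
# leaving any column without a match under its original name.
apply_pretty_names <- function(dat, lookup) {
  idx <- match(names(dat), lookup$id)
  names(dat) <- ifelse(is.na(idx), names(dat), lookup$name[idx])
  dat
}

# Made-up lookup, standing in for the trimmed id/name results
lookup <- data.frame(id   = c("prop1", "visits"),
                     name = c("Site Section", "Visits"),
                     stringsAsFactors = FALSE)
dat <- data.frame(prop1 = "home", visits = 10, daterangeday = "Oct 5, 2020")

names(apply_pretty_names(dat, lookup))
```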
It's, technically, a ranked report if the dimension is "daterangeday" and the metric is, say, "visits," but the expectation (and behavior inside AW) is that the data will be sorted by date.
Currently, the above comes back sorted by the first metric which is confusing.
May not be a bug, but I just can't get this to work.
The code:
aw_get_calculatedmetric_byid(company_id = company_id,
                             id = "cm300008117_5eb9b63009282640d73ca30b")
The id I pulled straight from an aw_get_calculatedmetrics() call. I tried several different values, both calculated metrics I owned and ones that I didn't, and I got a NULL result in all cases.
I can't get ownerId to work.
It's not clear from the documentation as to whether this should be using the owner.id or the owner.login value.
As an example, neither of these works:
Using the owner.id value
aw_get_calculatedmetrics(rsid = rsid,
                         company_id = company_id,
                         ownerId = "[added owner.id value from an unfiltered call]")
Using the owner.login value
aw_get_calculatedmetrics(rsid = rsid,
                         company_id = company_id,
                         ownerId = "[added the owner.login value--email address--from a call w/ ownerFullName expansion]")
Both return:
Error in aw_call_api(req_path = urlstructure[1], company_id = company_id) :
Forbidden (HTTP 403).
The fact that I was trying to filter on values that I was getting from a less filtered version of the same function means I do have access to that data.
@benrwoodard Is it possible to get how many times the user logged in for the last 2 months' time frame? Does Adobe Analytics API 2.0 have that available?
Add the segmentId filter to the function in the Global Function section
daterangeday comes back as a string from aa_freeform_report(). It would be nice if that got converted to be a date when it existed.
Since aa_token() was renamed to aw_token(), should the resulting token file be renamed from aa.oauth to aw.oauth?
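For reference, the conversion itself is short in base R. This sketch assumes the strings look like the "Oct 5, 2020" values in the sample output earlier in this thread, and that the session locale uses English month abbreviations (the %b format is locale-dependent); parse_daterangeday() is a hypothetical helper name:

```r
# Hypothetical helper: convert strings like "Oct 5, 2020" to Date.
# Assumes English month abbreviations in the current locale.
parse_daterangeday <- function(x) {
  as.Date(x, format = "%b %d, %Y")
}

# Made-up data, deliberately out of date order
df <- data.frame(daterangeday = c("Oct 6, 2020", "Oct 5, 2020"),
                 visits = c(49, 13),
                 stringsAsFactors = FALSE)
df$daterangeday <- parse_daterangeday(df$daterangeday)
df <- df[order(df$daterangeday), ]
```

Once the column is a real date, sorting the result by date rather than by the first metric becomes a one-liner, as in the last line.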
I made the following call:
aw_get_calculatedmetrics(rsid = rsid,
                         company_id = company_id)
I compared that to the results returned from this:
aw_get_calculatedmetrics(rsid = rsid,
                         company_id = company_id,
                         expansion = "reportSuiteName, ownerFullName, modified, tags, definition, compatability, categories")
There were only two additional columns added:
reportSuiteName -- that seems right/expected
siteTitle -- that...wasn't in the list of values
Any idea what might be going on here?
It's a minor thing, but as you have been adding recommendations I've seen you use a naming convention that makes good sense: use 'aw_' to prevent conflicts with other functions, but then concatenate the action/verb and the name of the element. Example:
aw_get_calculatedmetrics to aw_getcalculatedmetrics
aw_get_metrics to aw_getmetrics
aw_get_dimensions to aw_getdimensions
...
Simplify where simplification will not remove recognition:
aw_freeform_report to aw_freeform
aw_anomaly_report to aw_anomaly
*where the expected result is a table of data, report is not needed.
What do you think?
Current example of messages when running aw_freeform_table():
Estimated runtime: 318.4sec./5.31min.
1 of 398 possible data requests complete. Starting the next 397 requests.
A total of 1985 rows have been pulled.
Since queries will often get run multiple times with no or minimal adjustments, it would be nice to have one more line that reports how long the actual runtime was:
Estimated runtime: 318.4sec./5.31min.
1 of 398 possible data requests complete. Starting the next 397 requests.
A total of 1985 rows have been pulled.
Actual runtime: 284.3sec./4.74min.
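A minimal sketch of how that extra line could be produced, reusing the same "sec./min." format as the estimate. format_runtime() is a hypothetical helper, and the comment marks where the real requests would run:

```r
# Hypothetical helper: format a duration in the same style as the
# existing "Estimated runtime" message.
format_runtime <- function(seconds, label = "Actual runtime") {
  sprintf("%s: %.1fsec./%.2fmin.", label, seconds, seconds / 60)
}

start_time <- Sys.time()
# ... the data requests would run here ...
elapsed <- as.numeric(difftime(Sys.time(), start_time, units = "secs"))
message(format_runtime(elapsed))
```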
This would be fairly straightforward as it would be limited to an expected 2 dimensions and return results in a single line of data that could be turned into a venn diagram of some kind along with the data for review.
Sample request:
{
"rsid": "ageo1xxpnwsdi2020prod",
"globalFilters": [
{
"type": "dateRange",
"dateRange": "2020-10-26T00:00:00.000/2020-11-25T00:00:00.000"
}
],
"metricContainer": {
"metrics": [
{
"columnId": "metrics/visitors:::0",
"id": "metrics/visitors",
"filters": [
"0"
]
},
{
"columnId": "metrics/visitors:::1",
"id": "metrics/visitors",
"filters": [
"1"
]
},
{
"columnId": "metrics/visitors:::2",
"id": "metrics/visitors",
"filters": [
"2",
"3"
]
}
],
"metricFilters": [
{
"id": "0",
"type": "segment",
"segmentId": "Mobile_Hits"
},
{
"id": "1",
"type": "segment",
"segmentId": "First_Time_Visits"
},
{
"id": "2",
"type": "segment",
"segmentId": "Mobile_Hits"
},
{
"id": "3",
"type": "segment",
"segmentId": "First_Time_Visits"
}
]
},
"settings": {
"countRepeatInstances": true
},
"statistics": {
"functions": [
"col-max",
"col-min"
]
}
}
I tried this on two separate companies. One only had one calculated metric with any tags, and it had two tags. The other had multiple metrics with a single tag.
The following is an example from the latter—GSK:
aw_get_calculatedmetrics(rsid = rsid,
                         company_id = company_id,
                         tagnames = "CXO")
In all cases, the tagnames filter had no impact on the output: the results were simply all calculated metrics.
Change to default to NA to align with Adobe recommendations
I am using R Version 1.3.959. When installing it is showing a namespace issue. Have you seen this error before?
The current function is limited to basic pulls. The next iteration will require the end user to create a json file and pass that path into the arguments but it will accept any copied json request from the debugger.
When using granularity=week along with quickView = T, it shows an error message:
Error in FUN(X[[i]], ...) : object 'day' not found
(works fine when quickView argument removed)
Sample code used:
test <- aw_anomaly_report(date_range = c('2020-10-1', '2020-11-10'),
                          metrics = c('visits','visitors'),
                          granularity = 'week',
                          quickView = T)