gaapi's Introduction

Google Analytics Reporting for Ruby

gaapi provides:

  • A command line executable program to retrieve reporting data from Google Analytics (GA). It takes the user's GA request, specified in JSON format, and sends it to GA. It outputs the result of the request in JSON or comma-separated values (CSV) format.
  • A library of classes that can be used in other programs to retrieve reporting data from GA.

gaapi supports two ways of providing credentials. One way is more useful while testing scripts or doing ad-hoc queries. The other is more appropriate for unattended script usage. See the Authentication section for more details.

Google provides a Ruby client library that builds queries by constructing them from Ruby objects. gaapi allows you to express queries as JSON. If you prefer the JSON format, you may prefer to use gaapi. If you want to deal with Ruby objects (which are likely more verbose than JSON), use the Google gem.

Installation

For stand-alone use:

gem install gaapi --no-doc

In a Gemfile:

gem 'gaapi'

Usage

Command Line

gaapi [options] VIEW_ID

If no query is specified on the command line, gaapi tries to read the query from standard input.

The VIEW_ID is what identifies the GA data (a view of a property). To find the view ID, log in to GA, select the account of interest, select Admin (the gear near the bottom left of the page), and select "View Settings" (on the right of the page).

Options

    -a, --access-token TOKEN         An access token obtained from https://developers.google.com/oauthplayground.
        --csv                        Output result as a csv file.
    -c, --credentials CREDENTIALS    Location of the credentials file. Default: `.gaapi/ga-api-key`.
    -d, --debug                      Print debugging information.
    -e, --end-date END_DATE          Report including END_DATE (yyyy-mm-dd).
    -n, --dry-run                    Don't actually send the query to Google.
    -q, --query-file QUERYFILE       File containing the query. Default STDIN.
    -s, --start-date START_DATE      Report including START_DATE (yyyy-mm-dd).

If you specify both the -a and -c options, gaapi will use the -a option.

Example

Get the number of visitors to a site for January 2018, with credentials previously obtained and stored in ./credentials.json. The query is read from standard input:

gaapi -s "2018-01-01" -e "2018-01-31" -c ./credentials.json 000000
{
  "reportRequests": [{
      "viewId": "VIEW_ID",
      "dimensions": [{"name": "ga:date"}],
      "dateRanges": [{
        "startDate": "START_DATE",
        "endDate": "END_DATE"
      }],
      "metrics": [{
          "expression": "ga:users"
      }],
      "includeEmptyRows": true,
      "hideTotals": false,
      "hideValueRanges": true
    }]
}
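
If the query is produced by a script rather than written by hand, the same JSON can be generated from a Ruby hash with the standard json library. This is a minimal sketch; the build_query helper is hypothetical and not part of gaapi:

```ruby
require "json"

# Hypothetical helper (not part of gaapi): builds the single-metric query
# shown above and returns it as a JSON string.
def build_query(metric)
  query = {
    "reportRequests" => [{
      "viewId" => "VIEW_ID",
      "dimensions" => [{ "name" => "ga:date" }],
      "dateRanges" => [{ "startDate" => "START_DATE", "endDate" => "END_DATE" }],
      "metrics" => [{ "expression" => metric }],
      "includeEmptyRows" => true,
      "hideTotals" => false,
      "hideValueRanges" => true
    }]
  }
  JSON.pretty_generate(query)
end

puts build_query("ga:users")
```

The output can be saved to a file and passed with --query-file, or piped to gaapi on standard input.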

In a Program

Make sure the program can find GAAPI. Without Rails:

require "gaapi"

With Rails, simply include gaapi in the Gemfile:

gem "gaapi"

Next, get an access token. To run the program unattended, the best way is to use the approach described in the Unattended Running section, which translates to the following code:

access_token = GAAPI::AccessToken.new("path/to/credential_file")

Set up the query. This may raise exceptions:

begin
  # 00000000 is a placeholder for your numeric view ID.
  query = GAAPI::Query.new(query_string, 00000000, access_token, "2018-01-01", "2018-06-30")
rescue StandardError => e
  # Handle the error
end

A typical exception would be from a query_string that isn't valid JSON. The query_string has to be a valid GA reporting query. See the Queries section. Because the access token is lazy-evaluated, you may also get an exception here if the credential file doesn't exist or is malformed.

Execute the request:

result = query.execute
if result.success?
  ...
end

If the query was successful, you have access to a few interesting methods:

puts result.body  # raw response body
puts result.pp    # a string formatted into more readable JSON
puts result.csv   # comma-separated values format, ready to be written to a file

There is also some support for more structured access to the query result. If the query was successful (result.success?), you can use the following:

result.reports    # An array of GAAPI::Report objects
report.dimensions # An array of the dimension names
report.headers    # An array of the dimension names and metric names
report.metrics    # An array of the metric names
report.rows       # An array of GAAPI::Row objects

If you have a Row object, you can access the dimensions and metrics using method names. For example, to get the ga:sessionDuration metric for a row:

row.session_duration

The ga: is stripped from the front of the dimension or metric name, and then the rest is converted to snake case.
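
The naming rule can be illustrated with a short sketch. The ga_name_to_method helper below is hypothetical; gaapi's own implementation may differ in details, for example in how it treats digits in names:

```ruby
# Hypothetical helper illustrating the naming rule described above.
def ga_name_to_method(name)
  name
    .sub(/\Aga:/, "")                   # strip the leading "ga:"
    .gsub(/([a-z\d])([A-Z])/, '\1_\2')  # insert "_" before each capital that follows a lowercase letter or digit
    .downcase
end

puts ga_name_to_method("ga:sessionDuration")     # prints "session_duration"
puts ga_name_to_method("ga:avgSessionDuration")  # prints "avg_session_duration"
```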

You can also get all the dimensions or all the metrics for a row:

row.dimensions
row.metrics

These return arrays with the values in the order corresponding to the report.dimensions and report.metrics arrays.
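
Because the two arrays line up by position, names and values can be paired with zip. The sketch below uses stand-in structs in place of GAAPI::Report and GAAPI::Row so that it runs on its own; the names and values are made up for illustration:

```ruby
# Stand-ins for GAAPI::Report and GAAPI::Row. Only the dimensions and
# metrics methods described above are assumed.
FakeReport = Struct.new(:dimensions, :metrics)
FakeRow = Struct.new(:dimensions, :metrics)

report = FakeReport.new(["ga:date"], ["ga:sessions", "ga:users"])
row = FakeRow.new(["20180101"], ["42", "30"])

# Pair each name with the value at the same position.
dimensions_by_name = report.dimensions.zip(row.dimensions).to_h
metrics_by_name = report.metrics.zip(row.metrics).to_h

p dimensions_by_name
p metrics_by_name
```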

Putting it all together, to get the ga:avgSessionDuration values from all the rows in all the reports:

result.reports.flat_map do |report|
  report.rows.map do |row|
    row.avg_session_duration
  end
end

Queries

gaapi uses the Google Analytics Reporting API v4 (https://developers.google.com/analytics/devguides/reporting/core/v4/). An introduction to querying for GA data is here: https://developers.google.com/analytics/devguides/reporting/core/v4/basics. A very useful reference of the dimensions and metrics available is at: https://developers.google.com/analytics/devguides/reporting/core/dimsmets.

A query to find basic visit data for a web site is:

{
  "reportRequests": [{
      "viewId": "VIEW_ID",
      "dimensions": [{"name": "ga:date"}],
      "dateRanges": [{
        "startDate": "2017-10-01",
        "endDate": "2017-10-31"
      }],
      "metrics": [{
          "expression": "ga:avgSessionDuration"
        },
        {
          "expression": "ga:pageviewsPerSession"
        },
        {
          "expression": "ga:sessions"
        },
        {
          "expression": "ga:users"
        }
      ],
      "includeEmptyRows": true,
      "hideTotals": false,
      "hideValueRanges": true
    },
    {
      "viewId": "VIEW_ID",
      "dimensions": [{"name": "ga:date"}],
      "dateRanges": [{
        "startDate": "2017-10-01",
        "endDate": "2017-10-31"
      }],
      "metrics": [{
          "expression": "ga:goal1Completions"
        },
        {
          "expression": "ga:goal2Completions"
        },
        {
          "expression": "ga:goal6Completions"
        },
        {
          "expression": "ga:goal8Completions"
        },
        {
          "expression": "ga:goal9Completions"
        },
        {
          "expression": "ga:goal11Completions"
        },
        {
          "expression": "ga:goal13Completions"
        },
        {
          "expression": "ga:goal14Completions"
        },
        {
          "expression": "ga:goal16Completions"
        },
        {
          "expression": "ga:goalCompletionsAll"
        }
      ],
      "includeEmptyRows": true,
      "hideTotals": false,
      "hideValueRanges": true
    },
    {
      "viewId": "VIEW_ID",
      "dimensions": [{"name": "ga:date"}],
      "dateRanges": [{
        "startDate": "2017-10-01",
        "endDate": "2017-10-31"
      }],
      "metrics": [{
          "expression": "ga:avgSessionDuration"
        },
        {
          "expression": "ga:pageviewsPerSession"
        },
        {
          "expression": "ga:sessions"
        },
        {
          "expression": "ga:users"
        }
      ],
      "includeEmptyRows": true,
      "hideTotals": false,
      "hideValueRanges": true
    }
  ]
}

By default, Google Analytics will return a maximum of 1,000 rows. gaapi automatically adds a pageSize: 10000 to your query, if no pageSize is specified. This causes Google Analytics to return 10,000 rows, the maximum that Google Analytics will return.
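
The effect of that default can be sketched as follows. The helper is hypothetical and is not gaapi's actual code; it assumes pageSize is set on each entry of reportRequests and that an explicit pageSize is left untouched:

```ruby
require "json"

# Hypothetical helper: add a pageSize to every report request that does
# not already specify one, and return the query as a JSON string.
def with_default_page_size(query_json, page_size = 10_000)
  query = JSON.parse(query_json)
  query["reportRequests"].each do |request|
    request["pageSize"] ||= page_size
  end
  JSON.generate(query)
end

query = '{"reportRequests": [{"viewId": "VIEW_ID"}, {"viewId": "VIEW_ID", "pageSize": 500}]}'
puts with_default_page_size(query)
```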

If gaapi returns 10,000 rows, it's your responsibility to use the nextPageToken in the returned result, to query additional rows.
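
One way to build the follow-up query is sketched below. The next_page_query helper is hypothetical and not part of gaapi. It assumes the response contains a reports array in the same order as the requests, that each report with more rows carries a nextPageToken, and that the token is sent back as pageToken on the matching request:

```ruby
require "json"

# Hypothetical helper: given the original query and the raw response body
# (both JSON strings), return the query for the next page, or nil when no
# report has more rows.
def next_page_query(query_json, response_body)
  requests = JSON.parse(query_json)["reportRequests"]
  reports = JSON.parse(response_body).fetch("reports", [])

  # Keep only the requests whose report has more rows, and attach the token.
  remaining = requests.zip(reports).filter_map do |request, report|
    token = report && report["nextPageToken"]
    request.merge("pageToken" => token) if token
  end

  remaining.empty? ? nil : JSON.generate({ "reportRequests" => remaining })
end
```

The returned string can be used in place of the original query for the next request, repeating until the helper returns nil.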

Authentication

The introduction to authentication for Google products is here: https://developers.google.com/analytics/devguides/reporting/core/v4/authorization.

Testing and Ad-Hoc Usage

This method involves cutting and pasting an access token obtained from https://developers.google.com/oauthplayground onto the command line. The access token is simply a long string of characters generated by Google. The access token expires after an hour, so the user has to return to the Google URL to get a new token.

Unattended Running

This method uses a file of secure credentials obtained from Google. It's very important that these credentials be kept secure, because anyone who has a copy of the file has access to the Google Analytics data for the account.

To use this type of credential with gaapi:

  1. Follow the instructions at https://developers.google.com/identity/protocols/OAuth2ServiceAccount, choose a JSON format file, and when you're prompted to save a file, save it.
  2. Immediately change the permissions of the file to make it readable only by you. On Linux, Unix, or OS X, that's chmod 600 filename.
  3. Give the file name in the --credentials option when you run gaapi, or pass it to AccessToken.new.
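
A program that loads the credentials can also enforce step 2 itself. The following sketch uses only the Ruby standard library; a temporary file stands in for the downloaded credential file, and the check is an illustration rather than something gaapi is known to do:

```ruby
require "tempfile"

# A temporary file stands in for the downloaded credential file.
credentials = Tempfile.new("ga-credentials")

# Same effect as: chmod 600 filename
File.chmod(0o600, credentials.path)

# Refuse to continue if group or others have any access to the file.
mode = File.stat(credentials.path).mode & 0o777
raise "credentials file is too permissive" unless (mode & 0o077).zero?

puts format("%o", mode)
```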

gaapi's People

Contributors

lcreid

gaapi's Issues

TIME format metrics implemented incorrectly.

The Google documentation claims the following about TIME metrics:

TIME Time metric in HH:MM:SS format.

It also says it's the time in seconds, and that's what I see in real results.
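
If the value does arrive as a number of seconds, converting it to HH:MM:SS for display is straightforward. The helper below is a hypothetical sketch, not part of gaapi, and it assumes the raw value is a numeric string:

```ruby
# Hypothetical helper: format a TIME metric value, given in seconds,
# as HH:MM:SS. Fractional seconds are rounded to the nearest second.
def seconds_to_hms(raw)
  total = Float(raw).round
  hours, remainder = total.divmod(3600)
  minutes, seconds = remainder.divmod(60)
  format("%02d:%02d:%02d", hours, minutes, seconds)
end

puts seconds_to_hms("3725.0")  # prints "01:02:05"
```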

The query should be a string, JSON, or Ruby hash.

Make sure that the query is sent to GA correctly, no matter which of the three input formats is used. In particular, this means stringifying keys if it's a Ruby hash, and in general just making sure that the other formats work correctly.
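
A minimal sketch of such normalization follows. The normalize_query helper is hypothetical and not gaapi's API; it treats the input as either a JSON string or a Ruby hash and always returns a JSON string:

```ruby
require "json"

# Hypothetical helper: accept a JSON string or a Ruby hash and return a
# JSON string suitable for sending to GA.
def normalize_query(query)
  case query
  when String
    JSON.generate(JSON.parse(query))  # raises JSON::ParserError if invalid
  when Hash
    JSON.generate(query)              # symbol keys are emitted as strings
  else
    raise ArgumentError, "query must be a String or Hash, got #{query.class}"
  end
end

puts normalize_query({ reportRequests: [] })
puts normalize_query('{ "reportRequests": [] }')
```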

Add support for accessing dimensions and metrics in result

It's not clear that dimensions and metrics are guaranteed to appear in the result in the same order they were specified in the query. Even if they are, it would certainly make code easier to understand if it was easy to access the result data by the dimension and metric names, rather than their position in an array (which is what is returned by GA).

This probably ties in to providing support for more structured results -- curious that the library that facilitates using JSON for the query should also provide more structure for the result.

This is for the user of gaapi as a library. It won't affect the users of the command line.

CSV output is broken.

CSV output is broken:

/var/lib/gems/2.3.0/gems/gaapi-0.4.2/lib/gaapi/report.rb:45:in `totals': undefined method `[]' for nil:NilClass (NoMethodError)
	from /var/lib/gems/2.3.0/gems/gaapi-0.4.2/lib/gaapi/response.rb:38:in `block (2 levels) in csv'
	from /var/lib/gems/2.3.0/gems/gaapi-0.4.2/lib/gaapi/response.rb:27:in `each'
	from /var/lib/gems/2.3.0/gems/gaapi-0.4.2/lib/gaapi/response.rb:27:in `block in csv'
	from /usr/lib/ruby/2.3.0/csv.rb:1166:in `generate'
	from /var/lib/gems/2.3.0/gems/gaapi-0.4.2/lib/gaapi/response.rb:26:in `csv'
	from /var/lib/gems/2.3.0/gems/gaapi-0.4.2/lib/gaapi/main.rb:40:in `call'
	from /var/lib/gems/2.3.0/gems/gaapi-0.4.2/bin/gaapi:7:in `<top (required)>'
	from /usr/local/bin/gaapi:23:in `load'
	from /usr/local/bin/gaapi:23:in `<main>'

Get debug credentials from app, and refresh them.

Provide a way to get the debug credentials right in the app. And then keep track of the refresh token, so the app keeps refreshing as long as it's in use.

This is fairly non-trivial, because we'd have to simulate a fair bit of browser behaviour in the app.

GA limits on downloads

GA defaults to limiting queries to 1,000 returned results, and doesn't allow requesting more than 10,000 results in a query. gaapi should make those limitations invisible to the end user.

Values in the `Row#metrics` array.

The values in the Row#metrics array are currently the raw strings returned by GA. This is inconsistent with the value returned by calling the named method for a metric, which converts the value to the appropriate Ruby type (e.g., Integer, Float, String).
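
A consistent conversion could look like the sketch below. The convert_metric helper and its type mapping are assumptions for illustration, not gaapi's code; the type names are the metric types GA reports for each column:

```ruby
# Hypothetical converter: turn a raw metric string into a Ruby value
# based on the metric type. The mapping is an assumption.
def convert_metric(raw, type)
  case type
  when "INTEGER" then Integer(raw)
  when "FLOAT", "CURRENCY", "PERCENT", "TIME" then Float(raw)
  else raw
  end
end

raw_values = ["42", "1.5"]
types = ["INTEGER", "FLOAT"]
p raw_values.zip(types).map { |raw, type| convert_metric(raw, type) }
```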
