StatsBombR

By: StatsBomb

Updated July 16, 2019.

This repository is an R package to easily stream StatsBomb data into R using your login credentials for the API or free data from our GitHub page. API access is for paying customers only.

This package offers a parallel option for most of its computationally expensive functions. However, it is currently designed for Windows only.

Installation Instructions:

  1. If not yet installed into R, run: install.packages("devtools")
  2. Then, install this R package as: devtools::install_github("statsbomb/StatsBombR")
  3. Finally, load the package: library(StatsBombR)

This package depends on several other packages in order for all functions to run. Therefore, if you have problems with any functions or with installing the package, it is likely due to package dependencies.

Free Data

Free Data Instructions:

Welcome to the Free Data Offerings from StatsBomb Services.

This package reads in the open-access data found at https://github.com/statsbomb/open-data. Below you will find a list of the functions used to quickly read in all open data currently available. Check back often, as new data is added regularly.

Free Data Description and Privacy Policy

StatsBomb are committed to sharing new data and research publicly to enhance understanding of the game of Football. We want to actively encourage new research and analysis at all levels. Therefore we have made certain leagues of StatsBomb Data freely available for public use for research projects and genuine interest in football analytics.

StatsBomb are hoping that by making data freely available, we will extend the wider football analytics community and attract new talent to the industry. We would like to collect some basic personal information about users of our data. If you give us your email address, we will let you know when we make more data, tutorials and research available. We will store the information in accordance with our Privacy Policy and the GDPR.

Whilst we are keen to share data and facilitate research, we also urge you to be responsible with the data. Please register your details on https://www.statsbomb.com/resource-centre and read our User Agreement carefully.

Terms & Conditions

By using this repository, you are agreeing to the user agreement.

If you publish, share or distribute any research, analysis or insights based on this data, please state the data source as StatsBomb and use our logo.

To read in all free events available:

StatsBombData <- StatsBombFreeEvents()

To read in all of the free competitions we offer simply run:

FreeCompetitions()

or, for use in other functions, store it as a data frame object:

Comp <- FreeCompetitions()

To read in the free matches available:

Matches <- FreeMatches(Comp)

To read in free events for a certain game:

get.matchFree(Matches[1,])

Note that the argument here is the entire row returned by FreeMatches(), not just a match ID. The get.matchFree function needs several pieces of information from each match observation, so it must receive the whole row.
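The difference between passing a whole row and passing a single ID can be sketched with a small mock data frame standing in for the real FreeMatches() output. The column names below (match_id, competition_id, season_id) and their values are illustrative assumptions, not a statement of the package's actual schema.

```r
# Mock stand-in for the FreeMatches() output; column names and values
# are assumptions made for this illustration.
Matches <- data.frame(
  match_id       = c(101, 102),
  competition_id = c(37, 37),
  season_id      = c(4, 4)
)

# Subsetting with [1, ] keeps every column, so the extra match
# information travels with the row.
first_match <- Matches[1, ]

# Extracting a single column drops everything except the id.
first_id <- Matches$match_id[1]

is.data.frame(first_match)  # still a data frame
names(first_match)          # every original column is present
length(first_id)            # a bare vector with one element
```

A function that needs more than the match ID can only work with the first form, which is why Matches[1,] is used above.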

API Data

API Access Instructions:

API access is for paying customers only.

To read in the competitions available through StatsBomb, run:

  1. competitions <- competitions(username, password)

To read in the matches available in each competition, run:

  1. matches <- get.matches(username, password, season_id, competition_id)

To read in all of the matches for various competitions:

  1. Pull Competitions From the API: comps <- competitions(username, password)
  2. Filter for the competitions you want: EuropeComps <- comps %>% filter(country_name == "Europe")
  3. Create a matrix of the competition and season ids: competitionmatrix <- as.matrix(EuropeComps[,1:2])
  4. Pull all of the matches: Matches <- MultiCompMatches(username, password, competitionmatrix)
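Steps 2 and 3 can be tried without API access on a mock data frame. This is a minimal sketch: the column names, their order and the ID values are assumptions made for illustration, and the filter is written in base R rather than with dplyr so that the snippet has no dependencies.

```r
# Mock stand-in for the competitions() output; column names, column order
# and id values are assumptions made for this illustration.
comps <- data.frame(
  competition_id = c(16, 43, 2),
  season_id      = c(4, 3, 44),
  country_name   = c("Europe", "International", "Europe"),
  stringsAsFactors = FALSE
)

# Base-R equivalent of: comps %>% filter(country_name == "Europe")
EuropeComps <- comps[comps$country_name == "Europe", ]

# Columns 1:2 hold the competition and season ids in this mock;
# as.matrix() yields one row per competition/season pair.
competitionmatrix <- as.matrix(EuropeComps[, 1:2])

dim(competitionmatrix)
colnames(competitionmatrix)
```

Because EuropeComps[,1:2] selects columns by position, it is worth checking colnames() on the real data before relying on it.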

To read in events for one game, simply run:

  1. StatsBombData <- get.events(username, password, match_id)

Note: A previous version of this function was named get.match(). get.match() is now deprecated.

To read in events for multiple games, run:

  1. Create a vector of match IDs: matchids <- matchesvector(username, password, season_id, competition_id)
  2. StatsBombData <- allevents(username, password, matchids)

Note: See the documentation for additional parameters that let you access different API versions, run in parallel or not, and choose a specific number of cores. A previous version of this function was named allmatches(). allmatches() is now deprecated.

To read in all of the events for various competitions:

  1. Pull Competitions From the API: comps <- competitions(username, password)
  2. Filter for the competitions you want: EuropeComps <- comps %>% filter(country_name == "Europe")
  3. Create a matrix of the competition and season ids: competitionmatrix <- as.matrix(EuropeComps[,1:2])
  4. Pull all of the events: Events <- MultiCompEvents(username, password, competitionmatrix)

To read in the lineups for one game, run:

  1. lineups <- get.lineups(username, password, match_id)

To read in multiple lineups, run:

  1. matchids <- matchesvector(username, password, season_id, competition_id)
  2. StatsBombLineups <- alllineups(username, password, matchids, parallel = TRUE)

To unnest all of the lineups:

StatsBombLineups <- cleanlineups(StatsBombLineups)

Data Cleaning Helpers:

Although JSON files can often be a pain to clean, especially due to nested data frames, these helper functions may make your data wrangling much easier.

To clean all of the data at once:

StatsBombData <- allclean(StatsBombData)

This function cleans the data in one line of code by running each of the functions below sequentially.

To clean all of the location variables simply run:

StatsBombData <- cleanlocations(StatsBombData)

Please note all location variables must be present in the data set. This function will not work with a subset of variables (i.e. if any location variables are missing).

To add the goalkeeper information from the freeze frame:

StatsBombData <- goalkeeperinfo(StatsBombData)

Please note that additional information is located under type.name == "Goal Keeper" and within the Freeze Frames.

To add additional shot information:

StatsBombData <- shotinfo(StatsBombData)

To extract some information from the freeze frame:

StatsBombData <- freezeframeinfo(StatsBombData)

Description of these variables:

  • Density is calculated as the aggregated inverse distance for each defender behind the ball.
  • Density in the cone is the same density, restricted to defenders who are inside the cone formed by the shooter and the two goal posts.
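The two definitions above can be illustrated with a small worked example. This is a sketch under stated assumptions and not the package's implementation: it assumes the attack runs toward increasing x, that the goal posts sit at (120, 36) and (120, 44), that "behind the ball" means a larger x than the shooter, and that "aggregated" means summed. The player positions are made up.

```r
# Illustrative sketch only; not the package's implementation.
# Assumptions: attack runs toward increasing x, goal posts at (120, 36)
# and (120, 44), "behind the ball" means x greater than the shooter's x,
# and "aggregated" means summed.
shooter   <- c(x = 108, y = 40)
defenders <- data.frame(x = c(112, 115, 100, 118),
                        y = c(40, 30, 42, 41))
post_a <- c(x = 120, y = 36)
post_b <- c(x = 120, y = 44)

# Euclidean distance from the shooter to each defender
dist_to_shooter <- sqrt((defenders$x - shooter[["x"]])^2 +
                        (defenders$y - shooter[["y"]])^2)

behind_ball <- defenders$x > shooter[["x"]]

# Density: sum of inverse distances over defenders behind the ball
density <- sum(1 / dist_to_shooter[behind_ball])

# Cross-product sign: which side of the line a -> b the point lies on
side <- function(p_x, p_y, a, b) {
  (b[["x"]] - a[["x"]]) * (p_y - a[["y"]]) -
    (b[["y"]] - a[["y"]]) * (p_x - a[["x"]])
}

# A point is inside the shooter/post/post triangle when it lies on the
# same side of all three edges.
s1 <- side(defenders$x, defenders$y, shooter, post_a)
s2 <- side(defenders$x, defenders$y, post_a, post_b)
s3 <- side(defenders$x, defenders$y, post_b, shooter)
in_cone <- (s1 >= 0 & s2 >= 0 & s3 >= 0) | (s1 <= 0 & s2 <= 0 & s3 <= 0)

# Density in the cone: the same sum, restricted to defenders in the cone
density_in_cone <- sum(1 / dist_to_shooter[behind_ball & in_cone])
```

With these made-up positions, the third defender is excluded for standing level with or in front of the ball, and the second is behind the ball but outside the cone, so density_in_cone is smaller than density.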

To format the elapsed time from the start of a match:

StatsBombData <- formatelapsedtime(StatsBombData)

To add in information about the current possession within a match:

StatsBombData <- possessioninfo(StatsBombData)

Final Notes:

  • Some of the cleaning functions above depend on variables created in the functions presented before them. In order to be safe, please clean your data in the order that is presented in this document.
  • Please re-install frequently, as new functions and bug fixes will be added regularly.
  • As always, check the R documentation for each function (e.g. ?StatsBombFreeEvents) for a more specific description.
  • Please contact [email protected] with bugs and suggestions.
