zackthoutt / nfl-player-stats

Recorded statistics for all football players to ever play in the NFL

License: MIT License

Python 100.00%

nfl-player-stats's Issues

Resolve stats across players on a team

What I mean here is that we should resolve certain stats from one player to other players in the same game. For example, if the QB is sacked, we should add a field for the linemen that tracks the number of sacks they allowed.

This will take some brainstorming to think of all the stats we want to resolve to different players after the scrape has finished.
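As a starting point for the brainstorm, here is a minimal sketch of the sack example. The field names (`position`, `sacked`, `team_sacks_allowed`) and the position codes are hypothetical and not taken from the scraper's output. Since a box score does not say which lineman gave up a sack, this sketch simply credits the team total to every offensive lineman who played.

```python
# Hypothetical position codes for offensive linemen.
OFFENSIVE_LINE = {"C", "G", "T", "OL"}

def resolve_sacks_allowed(game_players):
    """Copy each player record for one team in one game, adding the team's
    total sacks taken to every offensive lineman's record."""
    sacks_taken = sum(
        p.get("sacked", 0) for p in game_players if p["position"] == "QB"
    )
    resolved = []
    for p in game_players:
        record = dict(p)  # shallow copy so the scraped input is not mutated
        if p["position"] in OFFENSIVE_LINE:
            record["team_sacks_allowed"] = sacks_taken
        resolved.append(record)
    return resolved

# Made-up records for illustration only.
players = [
    {"name": "QB1", "position": "QB", "sacked": 3},
    {"name": "T1", "position": "T"},
    {"name": "WR1", "position": "WR"},
]
resolved = resolve_sacks_allowed(players)
```

Running this as a post-processing pass after the scrape keeps the raw records unchanged, so other resolved stats can be added the same way later.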

Extremely small typo in kicker stat

One of the kicker stats in the JSON is "point_after_attemps"; I believe it should be "point_after_attempts".

Team IDs

Regarding future improvement 2:

Extract Team IDs to make relating players across teams and games easier

Did you have in mind a Team ID other than their three letter code (e.g. DAL)? Perhaps a unique team ID for each year of a franchise, or an ID for each unique group of players? This could be used to distinguish high-performing vs. low-performing teams that are part of the same franchise but played in different seasons.
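One way to picture the per-season option is a composite key built from the franchise code and the season year. This is only a sketch of that idea; the function name and the `CODE-YEAR` format are assumptions, not something the project defines.

```python
def team_season_id(team_code: str, season: int) -> str:
    """Build a season-scoped team ID such as 'DAL-2016' from a franchise
    code and a season year (hypothetical format)."""
    return f"{team_code.strip().upper()}-{season}"

example_id = team_season_id("dal", 2016)
```

Keeping the franchise code as the prefix means all seasons of one franchise can still be grouped by splitting on the hyphen.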

Migrate saving data to SQLite database

In the long term (when we have daily/weekly scrape jobs) we will want a more robust setup, but for now it would be nice to save things into an SQLite database so that we can easily query the data and start to get an idea for how we want to structure the database records going forward.
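A rough sketch of what this could look like with Python's built-in `sqlite3` module. The table layout and the `player_id` and `name` keys are assumptions for illustration; the full scraped record is stored as a JSON string so the schema can stay loose until we settle on a structure.

```python
import json
import sqlite3

def save_players(conn, players):
    """Upsert player dicts into a 'players' table, keeping the full record
    as JSON alongside a couple of queryable columns."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS players ("
        "player_id INTEGER PRIMARY KEY, "
        "name TEXT NOT NULL, "
        "profile_json TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO players (player_id, name, profile_json) "
        "VALUES (?, ?, ?)",
        [(p["player_id"], p["name"], json.dumps(p)) for p in players],
    )
    conn.commit()

# In-memory database for the example; pass a file path to persist to disk.
conn = sqlite3.connect(":memory:")
save_players(conn, [{"player_id": 1, "name": "Example Player", "position": "K"}])
rows = conn.execute("SELECT player_id, name FROM players").fetchall()
```

`INSERT OR REPLACE` makes repeated scrape runs idempotent for a given `player_id`, which should help once scheduled jobs are added.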

Scrape College Data

Either through the link on pro-football-reference.com or through another site.

Problems for scraping players stats

The script raised this exception when I tried to scrape the player stats. I ran the test in Jupyter:

ConnectionError: HTTPSConnectionPool(host='www.pro-football-reference.comhttps', port=443): Max retries exceeded with url: //stathead.com/football/pgl_finder.cgi?player_id=AaitIs00 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fc45c241520>: Failed to establish a new connection: [Errno -2] Name or service not known'))
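The host in the traceback, `www.pro-football-reference.comhttps`, looks like a base URL glued directly onto a link that was already absolute (`https://stathead.com/...`). That is a guess from the error message, not something confirmed against the scraper's code. If it is right, resolving links with `urllib.parse.urljoin` instead of string concatenation would avoid it. In this sketch `BASE_URL`, `build_url`, and the relative path are hypothetical.

```python
from urllib.parse import urljoin

# Assumed base URL, inferred from the host shown in the error.
BASE_URL = "https://www.pro-football-reference.com"

def build_url(href: str) -> str:
    """Resolve a scraped href against the base URL.

    urljoin returns absolute URLs unchanged and only prefixes relative
    ones, so a link to another site is not appended to the base.
    """
    return urljoin(BASE_URL, href)

# Absolute link copied from the error message: stays as it is.
absolute = build_url(
    "https://stathead.com/football/pgl_finder.cgi?player_id=AaitIs00"
)
# Hypothetical relative link: gets the base URL prepended.
relative = build_url("/players/A/")
```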

Universal Player, Game, and Team IDs

I think we need to come up with a way to resolve IDs across sites for when we expand our scraping. So instead of blindly assigning players integer IDs, we could take the hash of their name, birthday, and hometown or something. It seems like that would be a fairly unique combo. The only catch here is that we only want to use info that is likely to be on every site we scrape. Some sites might not have the player hometown, so we would not be able to resolve the ID.

Let's brainstorm the best ways to resolve IDs for players, games, and teams so that we can keep single database records for each while pulling from many data sources.
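To make the hashing idea concrete, here is a minimal sketch for player IDs. It deliberately uses only name and birth date and leaves out hometown, because of the concern above that not every site lists it. The function names, the ISO date format, and the 16-character truncation are all assumptions.

```python
import hashlib
import unicodedata

def normalize(value: str) -> str:
    """Strip accents, lowercase, and collapse whitespace so the same name
    spelled slightly differently on two sites yields the same key."""
    text = unicodedata.normalize("NFKD", value)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return " ".join(text.lower().split())

def player_uid(name: str, birth_date: str) -> str:
    """Derive a deterministic ID from a player's name and birth date.

    birth_date is assumed to already be an ISO string (YYYY-MM-DD).
    """
    key = "|".join([normalize(name), birth_date])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]

# Made-up player for illustration only.
uid_a = player_uid("  Example  PLAYER ", "1990-01-01")
uid_b = player_uid("example player", "1990-01-01")
```

Two players who share a name and a birth date would still collide, so a scheme like this needs a fallback field or a manual override table.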

Initial Model Ideas

We can discuss these here or on Slack.

More Player Data

We should research where we can find more granular player data, such as fumbles, passes defended, etc. We might have to explore scraping other sites.

