zackthoutt / nfl-player-stats
Recorded statistics for all football players to ever play in the NFL
License: MIT License
What I mean here is that we should attribute certain stats from one player to the other players involved in the same game. For example, if the QB is sacked, we should add a field for the linemen that tracks the number of sacks they allowed.
This will take some brainstorming to think of all the stats we want to resolve to different players after the scrape has finished.
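As a starting point for the brainstorm, here is a minimal sketch of one such post-scrape pass. It assumes each game is a list of player dicts with `team`, `position`, and `sacked` keys; those names and the `OFFENSIVE_LINE` position codes are hypothetical, not the scraper's actual schema. Box scores do not say which lineman gave up a sack, so the sketch simply credits the team total to every offensive lineman who played.

```python
from copy import deepcopy

# Hypothetical position codes for offensive linemen.
OFFENSIVE_LINE = {"C", "G", "T", "OL"}


def resolve_sacks_allowed(game_players):
    """Return a copy of the records with a `sacks_allowed` field on linemen."""
    players = deepcopy(game_players)

    # Total sacks taken by each team's quarterbacks in this game.
    sacks_by_team = {}
    for player in players:
        if player["position"] == "QB":
            team = player["team"]
            sacks_by_team[team] = sacks_by_team.get(team, 0) + player.get("sacked", 0)

    # Credit the team total to every offensive lineman on that team.
    for player in players:
        if player["position"] in OFFENSIVE_LINE:
            player["sacks_allowed"] = sacks_by_team.get(player["team"], 0)

    return players
```

Running this as a separate pass over the finished scrape keeps the scraper itself unchanged, and other derived stats could follow the same pattern.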
One of the kicker stats in the JSON is "point_after_attemps". I believe it should be "point_after_attempts".
Regarding future improvement 2:
Extract Team IDs to make relating players across teams and games easier
Did you have in mind a Team ID other than their three letter code (e.g. DAL)? Perhaps a unique team ID for each year of a franchise, or an ID for each unique group of players? This could be used to distinguish high-performing vs. low-performing teams that are part of the same franchise but played in different seasons.
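To make the per-season option concrete, here is a rough sketch of an ID built from the three-letter code plus the season year. The `team_season_id` helper and the `DAL-2016` format are assumptions for illustration only. It would not, on its own, handle franchises whose code changed over time.

```python
def team_season_id(team_code, season):
    """Build a per-season team ID such as 'DAL-2016' (hypothetical format)."""
    return f"{team_code.strip().upper()}-{int(season)}"


def split_team_season_id(team_id):
    """Recover the team code and season year from a per-season ID."""
    code, season = team_id.rsplit("-", 1)
    return code, int(season)
```

With IDs like these, two seasons of the same franchise stay distinct, while the franchise is still recoverable from the prefix.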
In the long term (when we have daily/weekly scrape jobs) we will want a more robust setup, but for now it would be nice to save things into an SQLite database so that we can easily query the data and start to get an idea for how we want to structure the database records going forward.
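A minimal sketch of what that could look like with the standard-library `sqlite3` module. The table layout and the `player_id` and `name` keys are assumptions. Keeping the full record as a JSON string means nothing is lost while the real schema is still undecided.

```python
import json
import sqlite3


def save_players(conn, players):
    """Store scraped player dicts in a `players` table (hypothetical schema)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS players (
            player_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL,
            raw_json  TEXT NOT NULL
        )
        """
    )
    # INSERT OR REPLACE lets the same scrape be re-run without duplicate rows.
    conn.executemany(
        "INSERT OR REPLACE INTO players (player_id, name, raw_json) VALUES (?, ?, ?)",
        [(p["player_id"], p["name"], json.dumps(p)) for p in players],
    )
    conn.commit()
```

Something like `save_players(sqlite3.connect("nfl.db"), players)` would write to a local file (the filename is only an example), which can then be queried with plain SQL.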
Either through the link on pro-football-reference.com or through another site.
The script raised this exception when I tried to scrape the player stats. I ran the test in Jupyter:
ConnectionError: HTTPSConnectionPool(host='www.pro-football-reference.comhttps', port=443): Max retries exceeded with url: //stathead.com/football/pgl_finder.cgi?player_id=AaitIs00 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fc45c241520>: Failed to establish a new connection: [Errno -2] Name or service not known'))
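The host in the error, `www.pro-football-reference.comhttps`, suggests the script appends the scraped link to the site's base URL with plain string concatenation, and that the game-log link now appears to be an absolute URL on stathead.com. A possible fix, sketched here with hypothetical names (`BASE_URL`, `build_url`), is to use `urllib.parse.urljoin`, which leaves absolute links untouched and only resolves relative ones.

```python
from urllib.parse import urljoin

# Assumed base URL of the site being scraped.
BASE_URL = "https://www.pro-football-reference.com"


def build_url(href):
    """Resolve a scraped href against the base URL.

    Absolute links are returned unchanged; relative links are joined to
    BASE_URL.
    """
    return urljoin(BASE_URL, href)
```

This only repairs the URL construction. Whether the stathead.com page can still be scraped the same way is a separate question.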
I think we need to come up with a way to resolve IDs across sites for when we expand our scraping. So instead of blindly assigning players integer IDs, we could take the hash of their name, birthday, and hometown or something. It seems like that would be a fairly unique combo. The only catch here is that we only want to use info that is likely to be on every site we scrape. Some sites might not have the player hometown, so we would not be able to resolve the ID.
Let's brainstorm the best ways to resolve IDs for players, games, and teams so that we can keep single database records for each while pulling from many data sources.
We can discuss these here or on Slack.
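To seed the discussion, here is a rough sketch of a hash-based player ID. Because hometown may be missing on some sites, this version hashes only the name and birth date. The `player_key` function, the ISO `YYYY-MM-DD` date format, and the 16-character truncation are all assumptions, and two players who share a name and a birthday would collide.

```python
import hashlib
import unicodedata


def _normalize(value):
    """Lowercase, strip accents, and collapse whitespace so sites agree."""
    text = unicodedata.normalize("NFKD", value)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return " ".join(text.lower().split())


def player_key(name, birth_date):
    """Derive a stable ID from a player's name and ISO birth date.

    Hypothetical scheme: SHA-256 of 'name|birth_date', truncated to 16 hex
    characters.
    """
    raw = f"{_normalize(name)}|{birth_date.strip()}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]
```

The normalization step matters as much as the hash, since different sites may format the same name with different casing, spacing, or accents.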
We should research where we can find more granular player data, such as fumbles, passes defended, etc. We might have to explore scraping other sites.