dynastyprocess / data
An open-data fantasy football repository, maintained by DynastyProcess.com
Home Page: https://dynastyprocess.com
License: GNU General Public License v3.0
Describe the data bug
Several players are not joined in the ID mapping file
Which file is having trouble?
https://github.com/dynastyprocess/data/blob/master/files/db_playerids.csv
Expected data
What should this look like?
player | pos | team | age | draft_year |
---|---|---|---|---|
Saquon Barkley | RB | NYG | NA | NA |
Courtland Sutton | WR | DEN | NA | NA |
Dak Prescott | QB | DAL | NA | NA |
Blake Jarwin | TE | DAL | NA | NA |
O.J. Howard | TE | TBB | NA | NA |
Tarik Cohen | RB | CHI | NA | NA |
Tyrell Williams | WR | LVR | NA | NA |
JJ Arcega-Whiteside | WR | PHI | NA | NA |
Marlon Mack | RB | IND | NA | NA |
Adam Trautman | TE | NOS | NA | NA |
Cole Kmet | TE | CHI | NA | NA |
James Proche | WR | BAL | NA | NA |
Jalen Hurts | QB | PHI | NA | NA |
Albert Okwuegbunam | TE | DEN | NA | NA |
Jordan Love | QB | GBP | NA | NA |
Damien Williams | RB | KCC | NA | NA |
Harrison Bryant | TE | CLE | NA | NA |
Lots of missing IDs for 2021 rookies on db_playerids.csv - I'll happily try to gather some together this coming weekend unless you have it in the pipeline?
Describe the data bug
This code chunk:
ffscrapr::dp_playerids()
Returns the following:
Request failed [503]. Retrying in 1.3 seconds...
Request failed [503]. Retrying in 1.7 seconds...
Error: GitHub request failed with error: <503>
while calling <https://github.com/DynastyProcess/data/raw/master/files/db_playerids.csv>
Which file is having trouble?
Which file?
db_playerids.csv
Expected data
What should this look like?
This should return the csv as a dataframe.
Hey there, found my way to your git via https://www.reddit.com/r/DynastyFF/comments/gzzaa1/question_for_all_you_data_driven_folks_do_any_of/
I looked through your database file (https://github.com/DynastyProcess/data/blob/master/files/database.csv) and that's honestly pretty darn close...
how often is this refreshed and can we get the 2020 draft class in this data set?
are you able to get ADP from different sites under different constraints? (aka league size/league scoring filters ... espn, mfl, yahoo, sleeper, etc???) and are they broken down by site?
It looks like you have aggregated consensus rankings from fantasy pros... are you able to extract the individual rankings that create those overall/info from each site?
Here's what I'm trying to do-
I don't need any of the statistics, b/c i'm just building a draft board that has my own rankings but the 40-times and stuff is great .... one of the things i'd like to do is build out an aggregate ADP using info from all of the sites and look at consensus rankings for (where possible) both redraft and dynasty. I have my own weighting system (i weight different sites differently for example) so i have my draft board sheet in the workbook - i then want to build out a data source page with this information - i would either grab the data 1x (a week before my draft) or hopefully be able to update that data over time as it changes.
Let me know if that makes sense and if what i'm looking for is possible from your data set?
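A weighted aggregate ADP like the one described in this request could be sketched roughly as follows. The site names, ADP numbers, and weights are invented for illustration; none of them come from the DynastyProcess files.

```r
# Hypothetical ADP values per site in long format (not real data).
adp <- data.frame(
  player = c("Player A", "Player A", "Player A", "Player B", "Player B"),
  site   = c("espn", "mfl", "sleeper", "espn", "sleeper"),
  adp    = c(10, 14, 12, 30, 26),
  stringsAsFactors = FALSE
)

# Hypothetical per-site weights (a higher weight means the site counts more).
site_weights <- c(espn = 1, mfl = 2, sleeper = 3)

# Attach each row's site weight.
adp$weight <- unname(site_weights[as.character(adp$site)])

# Weighted mean ADP per player, using only the sites that list that player.
by_player <- split(adp, adp$player)
aggregate_adp <- data.frame(
  player       = names(by_player),
  weighted_adp = vapply(by_player,
                        function(d) weighted.mean(d$adp, d$weight),
                        numeric(1)),
  row.names    = NULL,
  stringsAsFactors = FALSE
)

# Lowest ADP first, as on a draft board.
aggregate_adp <- aggregate_adp[order(aggregate_adp$weighted_adp), ]
```

Because the weights are renormalised per player, a player missing from one site is still averaged over the sites that do list him.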
Describe the data bug
There are many gsis_id values missing for new players like Ahmad Gardner in nflreadr::load_ff_playerids(), but these same gsis_id values are available via nflreadr::load_player_stats(). It seems like nflreadr::load_ff_playerids() should load gsis_id automatically from nflreadr::load_player_stats() so as to allow for joins to other ID variables.
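As a stopgap, the missing values could be backfilled by hand. The sketch below uses toy data frames in place of the two nflreadr calls; the player names, the placeholder IDs, and the column names player_display_name and player_id are assumptions, so check them against the real output first.

```r
# Toy stand-in for load_ff_playerids(): one row lacks a gsis_id.
ff_ids <- data.frame(
  name     = c("Rookie A", "Veteran B"),
  position = c("CB", "QB"),
  gsis_id  = c(NA, "00-0000002"),      # placeholder IDs, not real ones
  stringsAsFactors = FALSE
)

# Toy stand-in for load_player_stats(): one row per player-week,
# with an ID on every row (column names are assumed).
stats <- data.frame(
  player_display_name = c("Rookie A", "Rookie A", "Veteran B"),
  player_id           = c("00-0000001", "00-0000001", "00-0000002"),
  stringsAsFactors = FALSE
)

# Collapse the stats to one row per player before looking IDs up.
lookup <- unique(stats[, c("player_display_name", "player_id")])

# Fill only the rows whose gsis_id is missing; existing IDs are untouched.
idx     <- match(ff_ids$name, lookup$player_display_name)
missing <- is.na(ff_ids$gsis_id)
ff_ids$gsis_id[missing] <- lookup$player_id[idx[missing]]
```

Matching on name alone can assign the wrong ID when two players share a name, so adding team or position to the match is probably worthwhile on real data.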
Describe the data bug
Chris Herndon's Sleeper ID should be 5009, not 5755
Which file is having trouble?
db_playerids.csv
There is an old, deprecated Chris Herndon in Sleeper's player database:
"5755": {
"hashtag": "#ChrisHerndon-NFL-FA-0",
"sport": "nfl",
"practice_description": null,
"birth_state": null,
"team": null,
"injury_notes": null,
"last_name": "Herndon",
"injury_start_date": null,
"weight": "",
"fantasy_data_id": 19914,
"search_first_name": "chris",
"college": "Miami (Fla.)",
"age": 22,
"first_name": "Chris",
"full_name": "Chris Herndon",
"search_rank": 9999999,
"yahoo_id": 900358,
"birth_date": "1996-02-23",
"injury_status": null,
"news_updated": null,
"position": null,
"metadata": null,
"birth_city": null,
"rotowire_id": null,
"high_school": null,
"rotoworld_id": 13228,
"player_id": "5755",
"number": 0,
"birth_country": null,
"pandascore_id": null,
"search_last_name": "herndon",
"years_exp": 0,
"height": "",
"active": true,
"stats_id": null,
"search_full_name": "chrisherndon",
"depth_chart_position": null,
"depth_chart_order": null,
"sportradar_id": "780a48de-d092-4e87-9c34-8d1b45a154cc",
"gsis_id": "00-0034766",
"injury_body_part": null,
"fantasy_positions": null,
"status": "Inactive",
"espn_id": null,
"practice_participation": null
}
The Chris Herndon that is actually on rosters is this one:
"5009": {
"hashtag": "#ChrisHerndon-NFL-NYJ-89",
"sport": "nfl",
"practice_description": null,
"birth_state": null,
"team": "NYJ",
"injury_notes": null,
"last_name": "Herndon",
"injury_start_date": null,
"weight": "253",
"fantasy_data_id": 19947,
"search_first_name": "chris",
"college": "Miami (FL)",
"age": 24,
"first_name": "Chris",
"full_name": "Chris Herndon",
"search_rank": 290,
"yahoo_id": 31077,
"birth_date": "1996-02-23",
"injury_status": null,
"news_updated": 1593620408253,
"position": "TE",
"metadata": null,
"birth_city": null,
"rotowire_id": 12899,
"high_school": null,
"rotoworld_id": 13228,
"player_id": "5009",
"number": 89,
"birth_country": null,
"pandascore_id": null,
"search_last_name": "herndon",
"years_exp": 2,
"height": "6'4\"",
"active": true,
"stats_id": 832080,
"search_full_name": "chrisherndon",
"depth_chart_position": "TE",
"depth_chart_order": 1,
"sportradar_id": "780a48de-d092-4e87-9c34-8d1b45a154cc",
"gsis_id": " 00-0034766",
"injury_body_part": null,
"fantasy_positions": [
"TE"
],
"status": "Active",
"espn_id": 3123050,
"practice_participation": null
}
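One way to pick between two Sleeper records like these is to group on sportradar_id and prefer the record marked Active. This is only a sketch on a hand-built data frame that copies a few fields from the two records above; it is not how db_playerids.csv is actually built.

```r
# A few fields copied from the two Sleeper records above.
sleeper <- data.frame(
  player_id     = c("5755", "5009"),
  full_name     = c("Chris Herndon", "Chris Herndon"),
  team          = c(NA, "NYJ"),
  status        = c("Inactive", "Active"),
  sportradar_id = c("780a48de-d092-4e87-9c34-8d1b45a154cc",
                    "780a48de-d092-4e87-9c34-8d1b45a154cc"),
  gsis_id       = c("00-0034766", " 00-0034766"),
  stringsAsFactors = FALSE
)

# The 5009 record has a leading space in gsis_id; trim it so IDs compare equal.
sleeper$gsis_id <- trimws(sleeper$gsis_id)

# Sort "Active" rows first, then keep the first row per sportradar_id.
sleeper <- sleeper[order(sleeper$status != "Active"), ]
deduped <- sleeper[!duplicated(sleeper$sportradar_id), ]

deduped$player_id
```

Note that the gsis_id in the 5009 record carries a leading space, so it is worth trimming whitespace before joining on that field.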
Describe the data bug
Weekly job for updating playerids is failing. Looks to not have successfully completed since last month. The effect is that some players are associated with the wrong team.
Which file is having trouble?
Which file?
db_playerids.csv
Expected data
What should this look like?
db_playerids.csv updates weekly and gets accurate team info linked for all players.
In db_fpecr_latest.csv, what is the id field? I'm trying to build a table to map around this but am struggling to understand where this id is coming from.
I noticed that ras_id and otc_id live as variables in the missing_ids.csv file here on dynastyprocess, but are not currently variables in nflreadr::load_ff_playerids(). This file and the function it feeds should mirror each other, yes? Looks like the csv has useful data for the function.
Thanks for creating this database, it's very helpful. I've only looked into this a little, but it looks like Rotoworld uses different IDs for the player's page URL, the API calls to get news about the player, and the player's profile image. The ID listed in this database appears to be the profile image ID. It would be helpful to have all three, depending on what the user's goal is. If you're going to choose only one, I might recommend the player URL ID, because you can get the other two from there.
For example, for Aaron Rodgers the URL is: https://www.rotoworld.com/football/nfl/player/7815/aaron-rodgers
The ID for his profile image (the one used in DynastyProcess) is 3118
The ID used for searching news articles is 39071
remotes::install_github("dfs-with-r/ffespn")
library(ffscrapr)
library(ffespn)
library(tidyverse)
# Build a df listing all ESPN position designations at the current moment.
# espn_league_id_2022 must already hold your ESPN league ID.
espn_positions <- c("QB", "RB", "WR", "TE", "K", "P", "DT", "DE", "LB", "CB", "S")
espnlist_2022 <- map(espn_positions,
                     ~ ffespn_projections(2022, 0, .x, league_id = espn_league_id_2022) %>%
                       select(id:position)) %>%
  bind_rows() %>%
  rename(espn_id = id,
         espn_pos = position,
         espn_team = team)
#Update ffscrapr's ESPN ID's by building df w/ missing IDs (usually rookies)
espn_id_adds <- dp_playerids() %>%
rename(dp_espn_id = espn_id,
player = name) %>%
full_join(., y = espnlist_2022,
by = c("player")) %>%
select(player,
position,
espn_pos,
team,
espn_team,
dp_espn_id,
espn_id) %>%
filter(!espn_id %in% dp_espn_id)
This yields 228 updates to ESPN IDs ("OAK" Sam Williams needs no updating). Please add!
Recently, the structure of the various value.csv files changed. It used to be the case that they had a "mergename" column that matched the same column in database.csv. However, it is now just "player", and the name used doesn't match the mergename from database.csv or the joined first_name + last_name columns.
I had been using the data in database.csv to link the player values to the mfl_ids. If there is an alternative way to do that, please let me know.
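Until a shared key is available, a normalised name key could stand in for the old mergename column. The sketch below is an assumption throughout: make_merge_key is a hypothetical helper, and the sample rows, values, and mfl_id placeholders are invented rather than taken from the real files.

```r
# Hypothetical helper: lower-case the name, drop a trailing suffix
# such as "Jr." or "III", then drop everything that is not a letter.
make_merge_key <- function(x) {
  x <- tolower(x)
  x <- gsub("\\s+(jr|sr|ii|iii|iv|v)\\.?$", "", x)
  x <- gsub("[^a-z]", "", x)
  x
}

# Toy stand-in for a values file, which now has only a "player" column.
values <- data.frame(
  player = c("O.J. Howard", "Tony Jones Jr."),
  value  = c(100, 50),                 # made-up values
  stringsAsFactors = FALSE
)

# Toy stand-in for database.csv with differently formatted names.
database <- data.frame(
  name   = c("OJ Howard", "Tony Jones"),
  mfl_id = c("0001", "0002"),          # placeholder IDs, not real ones
  stringsAsFactors = FALSE
)

# Build the same key on both sides and join on it.
values$merge_key   <- make_merge_key(values$player)
database$merge_key <- make_merge_key(database$name)
linked <- merge(values, database[, c("merge_key", "mfl_id")],
                by = "merge_key", all.x = TRUE)
```

A key built only from the name will collide for players who share a name, so any rows where the key is duplicated on either side should be checked by hand.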
In the FF player IDs, the gsis_id 00-0029435 seems to be duplicated and misattributed.
load_ff_playerids() %>% filter(gsis_id == "00-0029435") %>% select(name,gsis_id)
name gsis_id
1 Dennis Johnson 00-0029435
2 Dennis Johnson 00-0029435
You can see here how it shows twice. But when you look, for example, at the 2014 week 1 game for Houston, you can see that the play-by-play attributes this ID to D.Johnson.
load_pbp(2014) %>% filter(game_id=="2014_01_WAS_HOU",play_id==83) %>% select(receiver,receiver_id)
receiver receiver_id
1 D.Johnson 00-0029435
Looking at PFR, you can see that all the targets were to Damaris Johnson, not Dennis:
https://www.pro-football-reference.com/boxscores/201409070htx.htm
#24 had some new IDs, make sure those work ok
Hi guys,
I'd like to try a pull request updating ESPN IDs for the 2023 rookies. I'm noticing that the missing IDs file doesn't have 2023 rookies. Should I submit a pull request for db_playerids.csv instead?
I think the Randy Gregory and Nick Williams espn_id variables might be wrong in dp_playerids(). I'm working on updating the espn_id and using the ffespn package, the only IDs that aren't matching are these two players. Not sure which is right/wrong.
remotes::install_github("dfs-with-r/ffespn")
remotes::install_github("ffverse/ffscrapr", ref = "dev")
remotes::install_github("nflverse/nflreadr")
library(tidyverse)
library(ffespn)
library(ffscrapr)
library(nflreadr)
espn_list <- bind_rows(ffespn_projections(2021, 0, "QB") %>% select(-notes),
ffespn_projections(2021, 0, "RB") %>% select(-notes),
ffespn_projections(2021, 0, "WR") %>% select(-notes),
ffespn_projections(2021, 0, "TE") %>% select(-notes),
ffespn_projections(2021, 0, "K") %>% select(-notes),
ffespn_projections(2021, 0, "P") %>% select(-notes),
ffespn_projections(2021, 0, "DT") %>% select(-notes),
ffespn_projections(2021, 0, "DE") %>% select(-notes),
ffespn_projections(2021, 0, "LB") %>% select(-notes),
ffespn_projections(2021, 0, "CB") %>% select(-notes),
ffespn_projections(2021, 0, "S") %>% select(-notes)
) %>%
select(new_espn_id = id,
name = player,
team) %>%
mutate(name = clean_player_names(name),
team = clean_team_abbrs(team))
updated_playerids <- dp_playerids() %>%
mutate(name = clean_player_names(name),
team = clean_team_abbrs(team)) %>%
left_join(.,
espn_list,
by = c("name", "team")
)
espn_match <- updated_playerids %>%
select(name,
team,
position,
espn_id,
new_espn_id)
view(espn_match)
check <- espn_match %>%
filter(!is.na(espn_id) & !is.na(new_espn_id) & espn_id != new_espn_id)
view(check)
Hey Tan, I linked PFF IDs to the GSIS IDs and added DOBs as well. Hope this helps. https://docs.google.com/spreadsheets/d/1H-gXbB5M9KQ3TzIcBMN-m-xCrw11aK6gqYyY2TfKNto/edit?usp=sharing
There is some older fantasypros scrapedata that could/should get manipulated into the current structure
In load_ff_playerids, there are four cases of duplicate gsis_ids. The one I have researched the most is two rows where gsis_id == '00-0016098', both listed as Fred Taylor, with different birthdays and some other differing data.
I can find almost no information on the 2005 Fred Taylor (for example, this table says he was drafted with the 32nd pick of the 7th round, but he was not, according to the 2005 draft I looked at). The one thing I found using the listed mfl_id was this page on MFL, which matches the given information (including the incorrect draft spot): https://www72.myfantasyleague.com/2000/player?L=0&P=8058. I appear to have gotten lucky by randomly throwing in 2000 as the year while checking whether you can look up players by ID this way on MFL, as it is the only year where I could get him to show up, at -4 years of experience.
I don't know how this list is maintained, or whether it is built automatically so that the existence of the page above means this row will exist, but I thought I'd bring attention to it. At the very least, the 00-0016098 gsis_id probably shouldn't belong to this row.
The other duplicates are for 00-0019641, 00-0020270, and 00-0029435
Maybe not super useful, but a reproducible list of the duplicates:
nflreadr::load_ff_playerids() %>%
filter(!is.na(gsis_id)) %>%
group_by(gsis_id) %>%
summarize(n = n()) %>%
filter(n > 1)
00-0019641: I believe the incorrect one is the 1990 draft year row
00-0020270: I believe the incorrect one is the 1985 draft year row
Like the Fred Taylor one, both of the rows above have less info, and the player doesn't show up in the draft for the year the row says he was drafted.
00-0029435 is the only one of these that it seems that both rows are real players. Both have PFR pages, but I think that they were incorrectly given the same gsis_id and pff_id. I believe the 2013 draft_year row should be gsis_id '00-0030236' based on there being rushing play by play data for a RB Dennis Johnson in 2013 with that gsis_id
I found a duplicate gsis_id
Both MFL ids 11724 & 14683 gave a gsis_id of 00-0034641
I think gsis_id of 00-0034641 goes with MFL ID 14683 (Chris Jones)
I know this isn't necessarily a data request but I just wanted to know how I'm supposed to read the chart and understand each player's value for a dynasty league in 2021. Thanks in advance!
Describe the data you'd like to have
I'm looking for historical ECR or ADP values. I know how to find the current year's, but is there a way to get last year? Or the year before? Do you keep that kind of archive and if so how far back does it go?
A snapshot at some point before the start of the season would be fine.
Thanks!
Reported by Nick W / electronicks
I'm loving the map of player IDs in the mfl_players() function. Wondering if a few ID columns could get added/mapped:
nflscrapR / (I think this is old NFL API). Example: Kyle Orton = 00-0023541
nflfastR / (new NFL API). Example: Kirk Cousins = 32013030-2d30-3032-3936-303492e9d55e
You can find their code easily on GitHub
I'm not sure how you collate the player IDs, but I've got some contributions to make if that's possible:
Chris Herndon seems to have duplicate IDs in Sleeper's database: 5009 and 5755 - 5755 is a free agent, 5009 looks to be the real Chris Herndon
Anthony Gordon (MFL ID 14787) missing Sleeper ID: 6898
Marquez Callaway (MFL ID 15034) missing Sleeper ID: 6989
Mike Warren (MFL ID 14816) missing Sleeper ID: 6992
Benny Snell (MFL ID 14072) missing Sleeper ID: 6156
Quartney Davis (MFL ID 14856) missing Sleeper ID: 6879
Salvon Ahmed (MFL ID 14811) missing Sleeper ID: 6918
Jeff Thomas (MFL ID 14866) missing Sleeper ID: 7076
JaMycal Hasty (MFL ID 14821) missing Sleeper ID: 6996
Patrick Taylor (MFL ID 14817) missing Sleeper ID: 6963
Thaddeus Moss (MFL ID 14869) missing Sleeper ID: 6919
I have an app that's using the awesome .csv to help analyse some rosters. Depending on how you create the csv, maybe I could contribute these in a more automated way - or maybe not! Have a chat on Twitter DMs if you want?
Love your trade calculator for use with my Dynasty league. I was looking at a trade involving the new Saints #2 RB, Tony Jones Jr., and couldn't find him in the system. Is that an oversight or am I doing something wrong? Thanks!
Describe the data bug
Weekly FantasyPros (FP) ranks not updated since January
Which file is having trouble?
fp_latest_weekly.csv
fp_latest_weekly.rds
Expected data
Updated with the fantasypros week 1 ranks for 2022
Describe the data bug
The default view of FantasyPros dynasty rankings mixes in some really old rankings (from this summer or farther), "take locking" value
Which file is having trouble?
Player-Values.csv
Expected data
Taylor and CEH were booted from top 15 dynasty assets as an example. Michael Gallup is now worth overall double what he was two days ago etc
Allows for status updates and scheduling to be visible
Describe the data bug
Using load_ff_playerids(), Tanner Conner and Michael Woods don't have gsis-ids, it returns NA for both of them.
Which file is having trouble?
db_playerids.csv
Expected data
Tanner Conner should have a gsis_id of 00-0037348 and Michael Woods should have an id of 00-0037300