Determining Fantasy Football Optimal Rosters in R

Time:01-12

I have downloaded all of the data from my fantasy football league and have been doing some analysis in R. One thing I am struggling with is determining what each team's optimal lineup would have been each week. Each manager in my league chose some of their players to start and some to bench. In hindsight, we know what each player actually scored, so we can determine whether or not the manager chose correctly.

My data is structured in the following way:

  • Team = Fantasy team (2 included in the example, 10 in total)
  • Week = Week in which this occurred (1 included in the example, 16 in total)
  • SLOT = Roster slot the player was played in
  • PLAYER = Player's name
  • Position = Player's NFL position
  • FPTS = Points scored
structure(list(Team = c("Washington Beersnake", "Washington Beersnake", 
"Washington Beersnake", "Washington Beersnake", "Washington Beersnake", 
"Washington Beersnake", "Washington Beersnake", "Washington Beersnake", 
"Washington Beersnake", "Washington Beersnake", "Washington Beersnake", 
"Washington Beersnake", "Washington Beersnake", "Washington Beersnake", 
"Washington Beersnake", "Washington Beersnake", "Dimmadome Doug", 
"Dimmadome Doug", "Dimmadome Doug", "Dimmadome Doug", "Dimmadome Doug", 
"Dimmadome Doug", "Dimmadome Doug", "Dimmadome Doug", "Dimmadome Doug", 
"Dimmadome Doug", "Dimmadome Doug", "Dimmadome Doug", "Dimmadome Doug", 
"Dimmadome Doug", "Dimmadome Doug", "Dimmadome Doug"), Week = c(1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1), SLOT = c("QB", "RB", "RB", "WR/TE", 
"WR/TE", "WR/TE", "FLEX", "D/ST", "K", "Bench", "Bench", "Bench", 
"Bench", "Bench", "Bench", "Bench", "QB", "RB", "RB", "WR/TE", 
"WR/TE", "WR/TE", "FLEX", "D/ST", "K", "Bench", "Bench", "Bench", 
"Bench", "Bench", "Bench", "Bench"), PLAYER = c("Justin Herbert", 
"Alvin Kamara", "Saquon Barkley", "Terry McLaurin", "Keenan Allen", 
"Brandon Aiyuk", "Myles Gaskin", "Buccaneers D/ST", "Matt Gay", 
"William Fuller V", "AJ Dillon", "Marvin Jones Jr.", "Justin Fields", 
"Jalen Reagor", "J.D. McKissic", "Darnell Mooney", "Kyler Murray", 
"David Montgomery", "Miles Sanders", "Tyreek Hill", "DK Metcalf", 
"DeVonta Smith", "Raheem Mostert", "Washington D/ST", "Younghoe Koo", 
"Javonte Williams", "Corey Davis", "Matt Ryan", "Henry Ruggs III", 
"Curtis Samuel", "Nyheim Hines", "Tony Pollard"), Position = c("QB", 
"RB", "RB", "WR", "WR", "WR", "RB", "D/ST", "K", "WR", "RB", 
"WR", "QB", "WR", "RB", "WR", "QB", "RB", "WR", "WR", "WR", "WR", 
"WR", "D/ST", "K", "RB", "WR", "QB", "WR", "WR", "RB", "RB"), 
    FPTS = c(15.38, 15.1, 2.7, 6.2, 10, 0, 7.6, 3, 12, 0, 2.6, 
    13.7, 6.7, 10.9, 0.8, 2.6, 41.56, 17.8, 13.3, 26.1, 12, 13.1, 
    2, 4, 6, 4.1, 21.7, 7.36, 4.6, 0, 8.2, 4.3)), row.names = c(NA, 
-32L), class = c("tbl_df", "tbl", "data.frame"))


I would like to add a new column to the end of the data frame called "optimal slot". The logic for this would be as follows.

  • QB = highest scoring player with QB position on the team that week
  • RB1 = highest scoring player with RB position on the team that week
  • RB2 = second highest scoring player with RB position on the team that week
  • WR/TE1 = highest scoring player with WR or TE position on the team that week
  • WR/TE2 = second highest scoring player with WR or TE position on the team that week
  • WR/TE3 = third highest scoring player with WR or TE position on the team that week
  • RB/WR/TE = highest remaining player with RB/WR/TE position on the team that week
  • D/ST = highest scoring player with D/ST position on the team that week
  • K = highest scoring player with K position on the team that week
  • BENCH = all remaining players on the team that week

This needs to be determined for each team each week (which is why 2 teams are included in the data example).

CodePudding user response:

I found it easier to tackle this with some reference tables. The FLEX slot makes things a little tricky: we have to fill the position-specific slots first and then pick the best remaining eligible player.

Anyway, this is probably overkill but should scale nicely.

library(tidyverse)

#ASSUME ORIGINAL DATA IS STORED AS df

#To start, it might be helpful to have some reference tables.
#First, what positions are eligible to be placed in what slots?
possible_slots <- tibble(Position = c("QB", "RB", "RB", "WR", "WR", "TE", "TE", "D/ST", "K"),
               possible_slot = c("QB", "RB", "FLEX", "WR/TE", "FLEX", "WR/TE", "FLEX", "D/ST", "K"))

#Next, for each slot, how many players can be selected?
topn <- tibble(possible_slot = c("QB", "RB", "WR/TE", "D/ST", "K"),
               n_slots = c(1, 2, 3, 1, 1))

#Ok, now let's create a record for every potential slot that a player could conceivably take
all_possible_slots <- df %>% 
  left_join(possible_slots, by = "Position") %>% 
  left_join(topn, by = "possible_slot") %>% 
  #For each team, week and possible slot, rank the possibility by points (descending)
  group_by(Team, Week, possible_slot) %>% 
  #Note the use of ties.method = first to ensure only one record per rank
  mutate(rank = rank(-FPTS, ties.method = "first")) %>% 
  ungroup()

#Now limit just to those records where the rank is <= the number of available slots
optimal_slots <- all_possible_slots %>% 
  filter(rank <= n_slots)

#The FLEX slot is special case, we need to select the best REMAINING player,
#i.e. one not selected in optimal_slots
flex_slot <- all_possible_slots %>% 
  #limit just to the FLEX slot and remove anyone selected in optimal_slots
  filter(possible_slot == "FLEX") %>% 
  anti_join(optimal_slots %>% select(Team, Week, PLAYER),
            by = c("Team", "Week", "PLAYER")) %>% 
  #Among those remaining, take the highest scoring
  group_by(Team, Week) %>% 
  filter(rank(-FPTS, ties.method = "first") == 1) %>% 
  ungroup() 

#Now let's bring it all together with the original data.
df %>% 
  #append the optimal slot (including flex)
  left_join({
    optimal_slots %>% 
      bind_rows(flex_slot) %>% 
      #concatenate the slot and the rank in that slot
      mutate(optimal_slot = paste0(possible_slot, 
                                   ifelse(possible_slot %in% c("RB", "WR/TE"), rank, ""))) %>% 
      select(Team, Week, PLAYER, optimal_slot)
  }, by = c("Team", "Week", "PLAYER")) %>% 
  #Anyone without an optimal slot should be benched
  mutate(optimal_slot = coalesce(optimal_slot, "BENCH"))


# A tibble: 32 x 7
   Team                  Week SLOT  PLAYER           Position  FPTS optimal_slot
   <chr>                <dbl> <chr> <chr>            <chr>    <dbl> <chr>       
 1 Washington Beersnake     1 QB    Justin Herbert   QB        15.4 QB          
 2 Washington Beersnake     1 RB    Alvin Kamara     RB        15.1 RB1         
 3 Washington Beersnake     1 RB    Saquon Barkley   RB         2.7 BENCH       
 4 Washington Beersnake     1 WR/TE Terry McLaurin   WR         6.2 FLEX        
 5 Washington Beersnake     1 WR/TE Keenan Allen     WR        10   WR/TE3      
 6 Washington Beersnake     1 WR/TE Brandon Aiyuk    WR         0   BENCH       
 7 Washington Beersnake     1 FLEX  Myles Gaskin     RB         7.6 RB2         
 8 Washington Beersnake     1 D/ST  Buccaneers D/ST  D/ST       3   D/ST        
 9 Washington Beersnake     1 K     Matt Gay         K         12   K           
10 Washington Beersnake     1 Bench William Fuller V WR         0   BENCH       
# ... with 22 more rows
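
Once optimal_slot is attached, one possible next step is to compare the lineup each team actually started with the optimal one. The following is only a sketch on a hypothetical four-row data frame (the object name result and the players in it are made up for illustration); it assumes the same SLOT, FPTS and optimal_slot columns as above, with benched players marked "Bench" in SLOT and "BENCH" in optimal_slot.

```r
library(dplyr)

# Hypothetical miniature of the joined result: one team, one week.
result <- data.frame(
  Team = "Team A",
  Week = 1,
  SLOT = c("QB", "RB", "Bench", "Bench"),
  PLAYER = c("QB One", "RB One", "QB Two", "RB Two"),
  FPTS = c(10, 5, 18, 2),
  optimal_slot = c("BENCH", "RB1", "QB", "BENCH"),
  stringsAsFactors = FALSE
)

# Total the points of the lineup that was actually started and of the
# optimal lineup, then take the difference per team and week.
lineup_summary <- result %>%
  group_by(Team, Week) %>%
  summarise(
    actual_pts  = sum(FPTS[SLOT != "Bench"]),
    optimal_pts = sum(FPTS[optimal_slot != "BENCH"]),
    .groups = "drop"
  ) %>%
  mutate(points_left_on_bench = optimal_pts - actual_pts)

print(lineup_summary)
```

A points_left_on_bench of 0 would mean the team started its best possible lineup that week.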

CodePudding user response:

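All of your bullet points rely on the same basic technique, so it is sufficient to work through a single example (the QB slot):

library(dplyr)

# Highest QB score for each team and week. Filter on Position rather than
# SLOT so that benched QBs are considered too.
highest_score = df %>%
  filter(Position == "QB") %>%
  group_by(Team, Week) %>%
  summarise(max_score = max(FPTS), .groups = "drop")

# The QB who reached that score (the first one if there is a tie)
highest_scoring_player_df = df %>%
  filter(Position == "QB") %>%
  inner_join(highest_score, by = c("Team", "Week")) %>%
  filter(FPTS == max_score) %>%
  distinct(Team, Week, .keep_all = TRUE) %>%
  select(Team, Week, PLAYER) %>%
  mutate(new_slot = "QB")

# Create optimal_slot if it does not exist yet, then fill in the QB
if (!"optimal_slot" %in% names(df)) df$optimal_slot = NA_character_

df = df %>%
  left_join(highest_scoring_player_df, by = c("Team", "Week", "PLAYER")) %>%
  mutate(optimal_slot = coalesce(new_slot, optimal_slot)) %>%
  select(-new_slot)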

After repeating this pattern for each position, you can set optimal_slot to BENCH for all players who are yet to be assigned an optimal slot:

df = df %>%
  mutate(optimal_slot = ifelse(is.na(optimal_slot), "BENCH", optimal_slot))
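
Slots such as RB2 or WR/TE3 need the second or third highest scorer, which a plain max() does not give directly. One way to cover them, sketched here on a hypothetical three-player data frame (the name rbs and the players are made up for illustration), is to rank players within each team and week and label the top n:

```r
library(dplyr)

# Hypothetical sample: three RBs on one team in one week.
rbs <- data.frame(
  Team = "Team A",
  Week = 1,
  PLAYER = c("RB One", "RB Two", "RB Three"),
  Position = "RB",
  FPTS = c(7.6, 15.1, 2.7),
  stringsAsFactors = FALSE
)

rb_slots <- rbs %>%
  group_by(Team, Week) %>%
  # row_number(desc(FPTS)) ranks from highest to lowest score and breaks
  # ties by order of appearance, so every rank occurs exactly once.
  mutate(
    pos_rank = row_number(desc(FPTS)),
    optimal_slot = ifelse(pos_rank <= 2, paste0("RB", pos_rank), NA_character_)
  ) %>%
  ungroup()

print(rb_slots)
```

Players left with NA here are still candidates for the flex slot, so the bench should only be assigned after that slot has been filled.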