1st Desired Output:
Final Desired Output:
Thanks to all who take a look.
import pandas as pd
df = pd.read_excel('ActionNetwork.xlsx')
df['Matchups']= df['Scheduled'].apply(lambda x: x.split('PM | AM')[-1])
print(df)
CodePudding user response:
Since you want to put the matchups back into the same df, the teams need to go into separate columns (otherwise it will throw an error about uneven row counts).
Instead of multiple splits, use a single str.extract with the following patterns:
.*[AM|PM]-- match everything up toAMorPM(but don't capture it)([0-9] [a-zA-Z ] )-- capture 1 numbers and 1 letters/spaces (away team)([0-9] [a-zA-Z ] )-- capture 1 numbers and 1 letters/spaces (home team)
pattern = r'.*[AM|PM] ([0-9] [a-zA-Z ] )([0-9] [a-zA-Z ] )'
df[['Away', 'Home']] = df['Scheduled'].str.extract(pattern)
# Scheduled Away Home
# 0 Fri 10/29, 11:30 PM 113NavyNAVY114TulsaTLSA 113NavyNAVY 114TulsaTLSA
# 1 Sat 10/30, 2:00 AM 114UNLVUNLV116NevadaNEV 114UNLVUNLV 116NevadaNEV
# 2 Sat 10/30, 11:00 PM 110Ole MissMISS118AuburnAUB 110Ole MissMISS 118AuburnAUB



