Home > OS >  Filter pandas dataframe based on date format
Filter pandas dataframe based on date format

Time:01-20

I have dataframe which has two date formats:

  Date   Sales  
Mar-81  1020.2     
Jun-81   889.2     
Sep-81   795.0    
Dec-81  1003.9    
Mar-82  1057.7   

this format continues until Dec-00. As the year transition from 2000 to 2001, the format changes as below:

 Date    Sales  
1-Mar   1014.2  
1-Jun    952.6  
1-Sep    792.4  
1-Dec    922.3  
2-Mar   1055.9  

This is supposed to be March-2001, June-2001, etc.

I am trying to filter this data based on this format and convert the corresponding rows to datetime format:

if df[df['Date'].str.endswith('\d ')]:    #endswith digits
    df['Date'] = pd.to_datetime(df['Date'], format='%b-%y')
else:
    df['Date'] = df['Date'].apply(lambda x: x.zfill(6))
    df['Date'] = pd.to_datetime(df['Date'], format='%y-%b')

I get the below error when I run this: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Not sure what I am doing wrong here.

CodePudding user response:

One way using pandas.to_datetime's errors:

s = df["Date"]
s2 = pd.to_datetime(s, format='%b-%y', errors="coerce")
s2.fillna(pd.to_datetime(s.str.zfill(6), format='%y-%b', errors="coerce"))

Output:

0   1981-03-01
1   1981-06-01
2   1981-09-01
3   1981-12-01
4   1982-03-01
5   2001-03-01
6   2001-06-01
7   2001-09-01
8   2001-12-01
9   2002-03-01
Name: Date, dtype: datetime64[ns]

Your method is not working since df[df['Date'].str.endswith('\d ')] (i.e. a subset of dataframe) cannot be directly converted to either True or False.

CodePudding user response:

The if statement is on a data frame which has multiple boolean values. That is what the error means.

Switch the order. Make a function that works on a single value and apply it to the dates.

def to_date(x):
    if re.search('\d $', x) is not None:
        return pd.to_datetime(x, format='%b-%y')
    else:
        return pd.to_datetime(x.zfill(6), format='%y-%b')

df['Date'] = df['Date'].apply(to_date) 
  •  Tags:  
  • Related