Create column with date and float data types

Time:01-13

I am working with a pandas DataFrame named emails_visits (pandas is imported, and NumPy is available as np):

    Rep  Doctor       Date   type
0     1       1 2021-01-25  email
1     1       1 2021-05-29  email
2     1       2 2021-03-15  email
3     1       2 2021-04-02  email
4     1       2 2021-04-29  email
30    1       2 2021-06-01  visit
5     1       3 2021-01-01  email

I want to create a column "date_after" based on the value in column "type": if it equals "visit", the new column should hold the date from column "Date"; otherwise it should be empty.

I use this code:

emails_visits["date_after"]=np.where(emails_visits["type"]=="visit",emails_visits["Date"],np.nan)

However, it raises an error:

emails_visits["date_after"]=np.where(emails_visits["type"]=="visit",emails_visits["Date"],np.nan)
      File "<__array_function__ internals>", line 5, in where
    TypeError: The DType <class 'numpy.dtype[datetime64]'> could not be promoted by <class 'numpy.dtype[float64]'>. This means that no common DType exists for the given inputs. For example they cannot be stored in a single array unless the dtype is `object`. The full list of DTypes is: (<class 'numpy.dtype[datetime64]'>, <class 'numpy.dtype[float64]'>)

How can I fix this?

CodePudding user response:

You can do it like this if you want.

emails_visits['date_after'] = emails_visits.apply(lambda x: x['Date'] if x['type'] == 'visit' else '', axis=1)
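
A variation on the row-wise approach, as a minimal sketch: it assumes you want the new column to keep a datetime dtype, so it returns pd.NaT instead of an empty string and passes the result through pd.to_datetime. The three-row frame is made-up sample data in the shape of the question's table, not the asker's full dataset.

```python
import pandas as pd

# Hypothetical sample in the same shape as the question's DataFrame.
emails_visits = pd.DataFrame(
    {
        "Rep": [1, 1, 1],
        "Doctor": [2, 2, 3],
        "Date": pd.to_datetime(["2021-04-29", "2021-06-01", "2021-01-01"]),
        "type": ["email", "visit", "email"],
    }
)

# Row-wise: keep the date for visits, use NaT (the datetime missing value) otherwise.
date_after = emails_visits.apply(
    lambda row: row["Date"] if row["type"] == "visit" else pd.NaT,
    axis=1,
)

# to_datetime makes sure the column ends up with a datetime dtype.
emails_visits["date_after"] = pd.to_datetime(date_after)
print(emails_visits)
```

Row-wise apply is convenient but calls the lambda once per row, so it tends to be slower than a vectorized expression on large frames.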

CodePudding user response:

The datetime64 dtype of the Date column is incompatible with np.nan, which is a np.float64, so np.where cannot find a common dtype for the two inputs. np.nan means "not a number" and only applies to floating-point values; for datetime columns pandas represents a missing value as pd.NaT. In fact, it is better not to use np.where here but pandas functions, which fill the missing entries for you. Here is a simple solution:

emails_visits["date_after"] = emails_visits["Date"].where(emails_visits["type"]=="visit")
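
To see the Series.where solution end to end, here is a small self-contained sketch. The DataFrame is rebuilt from the rows shown in the question (including the index label 30), which is an assumption about how the original data was constructed.

```python
import pandas as pd

# Rebuild the sample data from the question, keeping its index labels.
emails_visits = pd.DataFrame(
    {
        "Rep": [1, 1, 1, 1, 1, 1, 1],
        "Doctor": [1, 1, 2, 2, 2, 2, 3],
        "Date": pd.to_datetime(
            [
                "2021-01-25",
                "2021-05-29",
                "2021-03-15",
                "2021-04-02",
                "2021-04-29",
                "2021-06-01",
                "2021-01-01",
            ]
        ),
        "type": ["email", "email", "email", "email", "email", "visit", "email"],
    },
    index=[0, 1, 2, 3, 4, 30, 5],
)

# Series.where keeps values where the condition is True and
# fills the rest with a missing value (NaT for a datetime column).
emails_visits["date_after"] = emails_visits["Date"].where(
    emails_visits["type"] == "visit"
)

print(emails_visits)
print(emails_visits["date_after"].dtype)
```

Only the row with type "visit" keeps its date; every other row holds NaT, and the column stays a datetime column rather than falling back to object.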