Python Pandas: How to find object-dtype DataFrame columns that contain numeric data?

Time:01-05

In a DataFrame, I am trying to find the columns that hold numeric data but have dtype "object". I want to do this in an automated way rather than by looking at the actual data in the DataFrame.

I tried this, but it didn't work:

for obj_feature in df.select_dtypes(include="object").columns:
    if df[obj_feature].str.isalpha == False:
        print("Numeric data columns", obj_feature)

Code to generate the DataFrame:

import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3],
                  'A': ['Month', 'Year', 'Quater'],
                  'B' : ['29.85', '85.43', '33.87'],
                  'C' : [45, 22, 33.4]})

Thanks!

CodePudding user response:

You can use pandas.api.types.is_numeric_dtype:

from pandas.api.types import is_numeric_dtype
{c: is_numeric_dtype(df[c]) for c in df}

output:

{'id': True, 'A': False, 'B': False, 'C': True}

Selecting the numeric columns:

Here, use select_dtypes:

df.select_dtypes('number')

output:

   id     C
0   1  45.0
1   2  22.0
2   3  33.4
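
Note that both checks above look only at the column dtype, so column B (numeric strings stored as object) is reported as non-numeric. As a minimal sketch, assuming you already know that B holds numeric strings, you could convert it with pd.to_numeric first so that select_dtypes('number') picks it up:

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3],
                   'A': ['Month', 'Year', 'Quater'],
                   'B': ['29.85', '85.43', '33.87'],
                   'C': [45, 22, 33.4]})

# Assumption: 'B' is known in advance to contain numeric strings.
# assign() returns a new DataFrame and leaves df untouched.
converted = df.assign(B=pd.to_numeric(df['B']))

# After the conversion, B has a float dtype and counts as numeric.
numeric_cols = list(converted.select_dtypes('number').columns)
print(numeric_cols)
```

This does not discover such columns by itself; it only shows that the dtype-based selection works once the strings are converted.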

CodePudding user response:

You might use pandas.api.types.is_numeric_dtype; consider the following example:

import pandas as pd
df = pd.DataFrame({'id': [1, 2, 3],
                  'A': ['Month', 'Year', 'Quater'],
                  'B' : ['29.85', '85.43', '33.87'],
                  'C' : [45, 22, 33.4]})
for colname in df.columns:
    print(colname,pd.api.types.is_numeric_dtype(df[colname]))

output

id True
A False
B False
C True

CodePudding user response:

This is not straightforward, but the following catch-all approach works in general.

First, select the columns with dtype 'object'. Second, attempt to convert them to numeric with errors='coerce': any value that cannot be parsed as a number becomes NaN, so you can use dropna(axis=1) to keep only the object-dtype columns whose values are all numeric.

Code below:

df.select_dtypes('object').apply(lambda x: pd.to_numeric(x, errors='coerce')).dropna(axis=1)

Outcome

       B
0  29.85
1  85.43
2  33.87
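
If you only need the column names, the same idea can be wrapped in a small helper. This is a sketch, not part of the answer above: the function name numeric_like_object_columns is hypothetical, and it assumes that a column counts as "numeric" when every non-missing value parses with pd.to_numeric. Unlike dropna(axis=1), it tolerates values that were already missing in the original column.

```python
import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype


def numeric_like_object_columns(df):
    """Return names of object/string columns whose non-missing values all parse as numbers.

    Hypothetical helper for illustration only.
    """
    found = []
    for name in df.columns:
        original = df[name]
        # Only consider object or string columns; skip real numeric dtypes.
        if not (is_object_dtype(original) or is_string_dtype(original)):
            continue
        # Unparseable values become NaN instead of raising an error.
        converted = pd.to_numeric(original, errors='coerce')
        # The column qualifies if it has data and no present value was lost.
        has_data = original.notna().any()
        nothing_lost = (converted.notna() == original.notna()).all()
        if has_data and nothing_lost:
            found.append(name)
    return found


df = pd.DataFrame({'id': [1, 2, 3],
                   'A': ['Month', 'Year', 'Quater'],
                   'B': ['29.85', '85.43', '33.87'],
                   'C': [45, 22, 33.4]})

result = numeric_like_object_columns(df)
print(result)
```

For the example DataFrame this reports only column B, since A contains words and id and C already have numeric dtypes.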