How to sort rows in pandas data_frame.info()

Time:02-01

noob question
I can't figure out whether, or how, the output of a pandas DataFrame .info() call can be sorted like a regular DataFrame.

example:

import pandas as pd
temp = pd.DataFrame(data={"x":[1, 2, 3, None, 4], "y":[5, 6, 7, None, None]})
# show_counts is the current name of the older null_counts argument
temp.info(show_counts=True).sort_values(by="Non-Null Count")

results in: AttributeError: 'NoneType' object has no attribute 'sort_values'

(Context: I have a lot of columns with varying numbers of missing values, and I want to sort the columns by that count.)

CodePudding user response:

info() only prints its report and returns None, which is why chaining sort_values onto it fails. Internally, pandas has a DataFrameInfo class that you can use to get at the info() data programmatically. You can turn this into a DataFrame, which can then be sorted.

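import pandas as pd
# pandas.io.formats.info is not part of the documented public API,
# so these attribute names may change between pandas versions.
from pandas.io.formats.info import DataFrameInfo

temp = pd.DataFrame(data={"x": [1, 2, 3], "y": [4, 5, 6]})

# Collect the same per-column data that info() prints.
info = DataFrameInfo(data=temp)
infodf = pd.DataFrame(
    {
        "Column": info.ids,
        "Non-Null Count": info.non_null_counts,
        "Dtype": info.dtypes,
    }
)

print(infodf)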

Output:

  Column  Non-Null Count  Dtype
x      x               3  int64
y      y               3  int64
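
Since DataFrameInfo is an internal class, a similar table can probably be built from public calls alone. Below is a minimal sketch, assuming notna().sum() and dtypes give the same counts and types that info() reports; info_table is a hypothetical helper name, not a pandas function. It uses the frame from the question and sorts by the non-null count:

```python
import pandas as pd


def info_table(df):
    """Build an info()-like summary as a regular DataFrame (hypothetical helper)."""
    return pd.DataFrame(
        {
            "Column": list(df.columns),
            "Non-Null Count": df.notna().sum().to_numpy(),
            "Dtype": df.dtypes.to_numpy(),
        }
    )


temp = pd.DataFrame(data={"x": [1, 2, 3, None, 4], "y": [5, 6, 7, None, None]})

# The result is an ordinary DataFrame, so sort_values works on it.
summary = info_table(temp).sort_values(by="Non-Null Count")
print(summary)
```

Here y (3 non-null values) sorts ahead of x (4 non-null values).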

CodePudding user response:

Sort your columns by non-null count before calling info():

df[df.notna().sum().sort_values().index].info()

Demo

import numpy as np
import pandas as pd
# 100 x 26 frame where each cell is NaN with probability 0.3
data = np.random.default_rng(2022).choice([np.nan, 1], (100, 26), p=(.3, .7))
df = pd.DataFrame(data, columns=list('ABCDEFGHIJKLMNOPQRSTUVWXYZ'))
df[df.notna().sum().sort_values().index].info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 26 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   B       61 non-null     float64
 1   U       63 non-null     float64
 2   H       63 non-null     float64
 3   Z       64 non-null     float64
 4   T       64 non-null     float64
 5   O       64 non-null     float64
 6   L       65 non-null     float64
 7   S       66 non-null     float64
 8   N       66 non-null     float64
 9   Y       66 non-null     float64
 10  K       66 non-null     float64
 11  P       67 non-null     float64
 12  A       67 non-null     float64
 13  I       67 non-null     float64
 14  D       67 non-null     float64
 15  W       68 non-null     float64
 16  M       68 non-null     float64
 17  R       69 non-null     float64
 18  J       70 non-null     float64
 19  F       71 non-null     float64
 20  G       72 non-null     float64
 21  Q       73 non-null     float64
 22  V       73 non-null     float64
 23  C       74 non-null     float64
 24  X       74 non-null     float64
 25  E       79 non-null     float64
dtypes: float64(26)
memory usage: 20.4 KB
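
The same idea at small scale, applied to the frame from the question. This is only a sketch: it counts missing values with isna().sum() rather than non-null values, and puts the columns with the most missing values first. The variable names missing and ordered are illustrative.

```python
import pandas as pd

temp = pd.DataFrame(data={"x": [1, 2, 3, None, 4], "y": [5, 6, 7, None, None]})

# Count missing values per column, most missing first.
missing = temp.isna().sum().sort_values(ascending=False)

# Select the columns in that order; info() then lists them in sorted order.
ordered = temp[missing.index]
ordered.info()
```

Column y (two missing values) is listed before x (one missing value).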
