DataFrame filter column

I have the following DataFrame, x_df.

Which city has the 5th-highest total number of Walmart stores (super stores and regular stores combined)?

import pandas as pd

data_url = 'https://raw.githubusercontent.com/plotly/datasets/master/1962_2006_walmart_store_openings.csv'
x_df = pd.read_csv(data_url, header=0)

x_df['STRSTATE'].where(x_df['type_store'] == 7)

CodePudding user response：

You can use DataFrame.max() to get the highest city count, then use it to look up the city name:

# assumes x_df already has 'city_count' and 'city_name' columns
top_df = x_df[x_df['city_count'] == x_df['city_count'].max()]

top_df['city_name']

CodePudding user response：

Edit:

I think something like this is what you want:

import pandas as pd

data_url = 'https://raw.githubusercontent.com/plotly/datasets/master/1962_2006_walmart_store_openings.csv'
x_df = pd.read_csv(data_url, header=0)

# Count rows per city, largest first
city_store_count = x_df.groupby('STRCITY').size().sort_values(ascending=False).to_frame()
city_store_count.columns = ['Stores_in_City']
# Position 4 is the fifth row of the sorted counts
city_store_count.iloc[4]

The fifth-largest city is actually tied for third place with ten stores, so you could, for instance, print the top 10 instead:

city_store_count.head(10)
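
Because of ties like this, a positional lookup such as iloc[4] returns only one of several cities with the same count. If you want ties handled explicitly, a rough sketch along these lines might help. It uses a small made-up frame (toy_df and its city names are hypothetical stand-ins for the real CSV; only the STRCITY column name comes from the question), so treat it as an illustration rather than the definitive answer:

```python
import pandas as pd

# Hypothetical toy data standing in for the Walmart CSV; one row per store.
toy_df = pd.DataFrame({
    "STRCITY": ["A", "A", "A", "B", "B", "B", "C", "C", "D", "D", "E", "F"],
})

# Number of rows (stores) per city, sorted from largest to smallest.
city_counts = toy_df["STRCITY"].value_counts()

# Dense rank: tied cities share a rank and no rank numbers are skipped.
city_rank = city_counts.rank(method="dense", ascending=False).astype(int)

# All cities whose count is the 3rd-highest distinct count.
third_place = city_counts[city_rank == 3]

# Top 3 by count, keeping every city tied with the last one included.
top3_with_ties = city_counts.nlargest(3, keep="all")

print(third_place)
print(top3_with_ties)
```

On the real data you would swap toy_df for x_df and pick the rank you care about. Whether "5th highest" should mean the fifth row or the fifth distinct count is a judgment call that depends on how you want to treat ties.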