Pandas df['col1':'col2'] giving the output I don't understand

Time:01-30

I was doing something very basic like this -

import numpy as np
import pandas as pd

data = np.arange(1, 13).reshape(4, 3)
table = pd.DataFrame(data, index=list('abcd'), columns=['foo', 'bar', 'baz'])
table

  foo bar baz
a  1   2   3
b  4   5   6
c  7   8   9
d 10  11  12

And then I ran this -

table['bar':'foo']
#output

  foo bar baz
c  7   8   9
d 10  11  12

I don't get why I am getting this result. Note that I am not asking for another solution or a workaround. I am just looking for an explanation of the rules behind this behavior.

CodePudding user response:

It's outputting a row slice: a slice inside [] is applied to the row index, not to the columns. pandas compares bar and foo lexicographically with the existing index labels. The output includes rows c and d because they are the only two index labels that fall between bar and foo: a < b < bar < c < d < ... < foo
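To make that concrete, here is a minimal sketch that rebuilds the table from the question and checks the ordering with plain Python string comparison. The names in_range, rows_by_brackets and rows_by_loc are just illustrative, and the comparison in the list comprehension only mimics what the label slice does on a sorted index; it is not how pandas implements it internally.

```python
import numpy as np
import pandas as pd

# Same table as in the question
data = np.arange(1, 13).reshape(4, 3)
table = pd.DataFrame(data, index=list('abcd'), columns=['foo', 'bar', 'baz'])

# Plain string comparison: which row labels sort between 'bar' and 'foo'?
in_range = [label for label in table.index if 'bar' <= label <= 'foo']
print(in_range)

# A slice inside [] selects rows by label, the same rows .loc selects
rows_by_brackets = table['bar':'foo']
rows_by_loc = table.loc['bar':'foo']
print(rows_by_brackets)
```

Neither bar nor foo exists in the row index, but because the index a, b, c, d is sorted, the slice bounds are placed where those labels would be inserted.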

CodePudding user response:

I'm not entirely sure, but it looks like you can't slice by column name with table[...]. The slicing only works on the rows, so only c and d are (lexicographically) between bar and foo.

You can instead use loc:

table.loc[:, 'foo':'bar']

Note that I changed the order of foo and bar. This is because the columns are ordered as you defined them, foo -> bar -> baz, and not lexicographically. table.loc[:, 'bar':'foo'] will return an empty dataframe.
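A small sketch of both column slices, assuming the same table as in the question (the variable names cols_forward and cols_reversed are only for illustration):

```python
import numpy as np
import pandas as pd

# Same table as in the question
data = np.arange(1, 13).reshape(4, 3)
table = pd.DataFrame(data, index=list('abcd'), columns=['foo', 'bar', 'baz'])

# Column slice in definition order: foo comes before bar, both ends included
cols_forward = table.loc[:, 'foo':'bar']
print(cols_forward)

# Reversed bounds: bar comes after foo, so no columns are selected
cols_reversed = table.loc[:, 'bar':'foo']
print(cols_reversed.shape)
```

The reversed slice still keeps the four row labels. It is empty only in the sense that it has no columns.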
