I was doing something very basic like this -
import numpy as np
import pandas as pd

data = np.arange(1, 13).reshape(4, 3)
table = pd.DataFrame(data, index=list('abcd'), columns=['foo', 'bar', 'baz'])
table
   foo  bar  baz
a    1    2    3
b    4    5    6
c    7    8    9
d   10   11   12
And then I ran this -
table['bar':'foo']
#output
   foo  bar  baz
c    7    8    9
d   10   11   12
I don't get why I am getting this result. Note that I am not asking for another solution or a workaround. I am just looking for an explanation of the rules behind this behavior.
CodePudding user response:
It's basically outputting a row slice: with a slice of strings, [] selects rows, so bar and foo are compared lexicographically with the existing index (row) labels, not the column names. The output includes rows c and d as they're the only two labels that fall between bar and foo: a < b < bar < c < d < ... < foo
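To make the comparison concrete, here is a minimal sketch that rebuilds the table from the question and asks the row index where the two bounds would land. Index.slice_locs is a public pandas method, but using it here only illustrates the idea; it is not a claim about the exact code path [] takes internally. Note that this relies on the index a, b, c, d being sorted.

```python
import numpy as np
import pandas as pd

data = np.arange(1, 13).reshape(4, 3)
table = pd.DataFrame(data, index=list("abcd"), columns=["foo", "bar", "baz"])

# A slice of strings inside [] is applied to the row index, not the columns.
rows = table["bar":"foo"]
print(list(rows.index))

# Ask the (sorted) row index which positions the label bounds map to.
start, stop = table.index.slice_locs("bar", "foo")
print(start, stop)

# The same comparisons written out as plain string ordering.
print("b" < "bar" < "c", "d" < "foo")
```

Selecting table.iloc[start:stop] gives the same two rows, which is why neither bar nor foo has to exist as a row label for the slice to work.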
CodePudding user response:
I'm not entirely sure, but it looks like you can't use [] slicing for column names. The slicing only works on the rows, so only c and d are (lexicographically) between bar and foo.
You can instead use loc:
table.loc[:, 'foo':'bar']
Note that I changed the order of foo and bar. This is because the columns are sliced in the order you defined them, foo -> bar -> baz, and not lexicographically. 'bar':'foo' will return an empty dataframe.
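A quick way to check both directions, as a small sketch using the same table as in the question (the variable names forward and backward are just for illustration):

```python
import numpy as np
import pandas as pd

data = np.arange(1, 13).reshape(4, 3)
table = pd.DataFrame(data, index=list("abcd"), columns=["foo", "bar", "baz"])

# Columns are sliced by position in their defined order: foo, bar, baz.
forward = table.loc[:, "foo":"bar"]
print(list(forward.columns))

# bar comes after foo in that order, so this slice selects no columns.
backward = table.loc[:, "bar":"foo"]
print(backward.shape)
```

The second result still has all four rows; it is empty only in the sense that it has zero columns.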
