So I have a composition in which my wrapper class contains a Pandas DataFrame. Because I want to add some behaviors to it, I've set up the following:
    class DataFrameWrapper(dict):
        def __init__(self, dataframe, *args, **kwargs):
            self.df = dataframe
            self.pandas_callables = [method_name for method_name in dir(self.df)
                                     if callable(getattr(self.df, method_name))]
            super(DataFrameWrapper, self).__init__()

        def __getitem__(self, item):
            return self.df[item]

        def __getattr__(self, item):
            if item in self.pandas_callables:
                # this is a dataframe method call - forward to the dataframe & return that
                return object.__getattribute__(self.df, item)
            else:
                try:  # try to return our own attribute, if any
                    return object.__getattribute__(self, item)
                except AttributeError as ex:
                    # likely a pandas attribute. If not, then it's a genuine attribute
                    # error, so we don't catch it and let it raise another exception
                    return object.__getattribute__(self.df, item)
Then I have, say:
    class Foo(DataFrameWrapper):
        def __init__(self, dataframe, *args, **kwargs):
            super().__init__(dataframe, *args, **kwargs)

    class Bar(Foo):
        """ concrete implementation class """

    class Baz(Foo):
        """ concrete implementation class """
So that we can do:
    bar = Bar(df)
    bar.to_json()
    bar.some_custom_method()
    col = bar["column_name"]
Now, if I do, say:
    json = bar.to_json()
This works fine. However, I'd like to add additional processing in Foo, so I want to do:
    class Foo(DataFrameWrapper):
        def to_json(self, *args, **kwargs):
            # do additional stuff
            return super().to_json(*args, **kwargs)
However, in that case, __getattr__() is never called in the wrapper, and I just get:
    AttributeError: 'super' object has no attribute 'to_json'
Why?
EDIT:
If I do something dumb, like this:
    class DataFrameWrapper(dict):
        ### previous code unchanged ###
        def __getattribute__(self, item):
            if item == "to_json":
                return object.__getattr__(self, item)
            return object.__getattribute__(self, item)
Then the call to to_json() works. According to the docs, I would expect the hack above to be what happens anyway.
Answer:
Okay, taking a stab at an answer, based on @martineau's comments above.
So yes, super() does not simply call the parent's methods directly. Because Python supports multiple inheritance, a call to super().foo() must do a bit more than just call the (first) parent class's __getattribute__. It has to follow a mechanism that keeps Python's implementation of multiple inheritance coherent.
That being said, if I call the parent's method directly (the old-school way, according to Python's super() guide), then it works as I initially expected:
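To make the difference concrete, here is a minimal, dependency-free reproduction. The names `Wrapper` and `Child` are hypothetical stand-ins for `DataFrameWrapper` and `Foo`, and a plain string plays the role of the DataFrame; this is only a sketch of the behavior, not the original code.

```python
class Wrapper:
    """Hypothetical stand-in for DataFrameWrapper: forwards unknown attributes."""

    def __init__(self, wrapped):
        self.wrapped = wrapped

    def __getattr__(self, name):
        # Only reached when normal attribute lookup on the instance fails.
        return getattr(self.wrapped, name)


class Child(Wrapper):
    def upper(self):
        # super() searches the classes after Child in the MRO for "upper".
        # Neither Wrapper nor object defines it, and Wrapper.__getattr__
        # is not consulted, so this line raises AttributeError.
        return super().upper()


# Plain instance access: lookup fails, so __getattr__ forwards to str.upper.
forwarded = Wrapper("abc").upper()

# Access through super(): the forwarding hook is skipped.
try:
    Child("abc").upper()
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
```

Instance access goes through the forwarding hook, while the same attribute requested through super() does not, which matches the error in the question.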
    class Foo(DataFrameWrapper):
        def to_json(self, *args, **kwargs):
            # do stuff
            return DataFrameWrapper.__getattr__(self, "to_json")(*args, **kwargs)
That "solves" the issue in the sense that I get the functionality expected.
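Another option, since the wrapper already holds the DataFrame in `self.df`, might be to delegate to it explicitly rather than going through `__getattr__` at all. The sketch below assumes a hypothetical `FakeFrame` class in place of a real pandas DataFrame so that it runs without pandas, and it simplifies the wrapper to the parts that matter here.

```python
class FakeFrame:
    """Hypothetical stand-in for a pandas DataFrame (keeps the sketch dependency-free)."""

    def to_json(self):
        return "{}"


class DataFrameWrapper:
    """Simplified version of the wrapper from the question."""

    def __init__(self, dataframe):
        self.df = dataframe

    def __getattr__(self, item):
        # Called only when normal lookup fails; forward to the wrapped object.
        return getattr(self.df, item)


class Foo(DataFrameWrapper):
    def to_json(self, *args, **kwargs):
        # Additional processing would go here.
        # Delegate explicitly instead of relying on super() plus __getattr__.
        return self.df.to_json(*args, **kwargs)


result = Foo(FakeFrame()).to_json()
```

This keeps the override readable and avoids calling a dunder method by hand, at the cost of naming the wrapped attribute in the subclass.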
Now to understand WHY super() didn't work:
super has a __getattribute__ method, but not a __getattr__ method. Thus I am guessing that what happens might be along the lines of:
- super().__getattribute__("to_json") is called
- super goes through the MRO to try to find the attribute
- According to comments below, it looks into each class's __dict__ for the attribute it's looking for. I'm guessing it doesn't actually call each class's __getattribute__ method, for efficiency reasons.
- See comments below for example details. Maybe eventually I'll structure that into the answer.
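The steps above can be sketched in pure Python. The function `super_like_lookup` is a hypothetical, simplified emulation written for illustration; the real super object is implemented in C and handles more cases (class-level access, for instance), so treat this as an approximation of the lookup order only.

```python
def super_like_lookup(cls, obj, name):
    """Rough emulation of how super(cls, obj) finds `name` (simplified)."""
    mro = type(obj).__mro__
    # Start with the class that follows `cls` in the instance's MRO.
    for klass in mro[mro.index(cls) + 1:]:
        if name in klass.__dict__:
            attr = klass.__dict__[name]
            # Bind functions and other descriptors to the instance.
            if hasattr(attr, "__get__"):
                return attr.__get__(obj, type(obj))
            return attr
    # No class __dict__ has the name; __getattr__ hooks are never consulted.
    raise AttributeError(f"'super' object has no attribute {name!r}")


class Base:
    def __getattr__(self, name):
        return f"dynamic {name}"

    def greet(self):
        return "hello from Base"


class Derived(Base):
    pass


d = Derived()
found = super_like_lookup(Derived, d, "greet")()  # defined in Base.__dict__
dynamic = d.missing  # normal instance access falls back to Base.__getattr__

try:
    super_like_lookup(Derived, d, "missing")
    lookup_failed = False
except AttributeError:
    lookup_failed = True
```

Because only each class's `__dict__` is inspected, an attribute that exists solely through `__getattr__` forwarding is invisible to this kind of lookup.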
