I am trying to capture (S3) logs in a structured way. I am capturing the access-related elements with this type of tuple:
class _Access(NamedTuple):
time: datetime
ip: str
actor: str
request_id: str
action: str
key: str
request_uri: str
status: int
error_code: str
I then have a class that uses this named tuple as follows (edited just down to relevant code):
class Logs:
def __init__(self, log: str):
raw_logs = match(S3_LOG_REGEX, log)
if raw_logs is None:
raise FormatError(log)
logs = raw_logs.groups()
timestamp = datetime.strptime(logs[2], "%d/%b/%Y:%H:%M:%S %z")
http_status = int(logs[9])
access = _Access(
timestamp,
logs[3],
logs[4],
logs[5],
logs[6],
logs[7],
logs[8],
http_status,
logs[10],
)
self.access = access
The problem is that it is too verbose when I now want to use it:
>>> log_struct = Logs(raw_log)
>>> log_struct.access.action # I don't want to have to add `access`
As I mention above, I'd rather be able to do something like this:
>>> log_struct = Logs(raw_log)
>>> log_struct.action
But I still want to have this clean named tuple called _Access. How can I make everything from access available at the top level?
Specifically, I have this line:
self.access = access
which is giving me that extra "layer" that I don't want. I'd like to be able to "unpack" it somehow, similar to how we can unpack arguments by passing the star in *args. But I'm not sure how I can unpack the tuple in this case.
CodePudding user response:
What you really need for your use case is an alternative constructor for your NamedTuple subclass to parse a string of a log entry into respective fields, which can be done by creating a class method that calls the __new__ method with arguments parsed from the input string.
Using just the fields of ip and action as a simplified example:
from typing import NamedTuple
class Logs(NamedTuple):
ip: str
action: str
@classmethod
def parse(cls, log: str) -> 'Logs':
return cls.__new__(cls, *log.split())
log_struct = Logs.parse('192.168.1.1 GET')
print(log_struct)
print(log_struct.ip)
print(log_struct.action)
This outputs:
Logs(ip='192.168.1.1', action='GET')
192.168.1.1
GET
CodePudding user response:
I agree with @blhsing and recommend that solution. This is assuming that there are not extra attributes required to be apply to the named tuple (say storing the raw log value).
If you really need the object to remain composed, another way to support accessing the properties of the _Access class would be to override the __getattr__ method [PEP 562] of Logs
The
__getattr__function at the module level should accept one argument which is the name of an attribute and return the computed value or raise an AttributeError:def __getattr__(name: str) -> Any: ...If an attribute is not found on a module object through the normal lookup (i.e.
object.__getattribute__), then__getattr__is searched in the module__dict__before raising anAttributeError. If found, it is called with the attribute name and the result is returned. Looking up a name as a module global will bypass module__getattr__. This is intentional, otherwise calling__getattr__for builtins will significantly harm performance.
E.g.
from typing import NamedTuple, Any
class _Access(NamedTuple):
foo: str
bar: str
class Logs:
def __init__(self, log: str) -> None:
self.log = log
self.access = _Access(*log.split())
def __getattr__(self, name: str) -> Any:
return getattr(self.access, name)
When you request an attribute of Logs which is not present it will try to access the attribute through the Logs.access attribute. Meaning you can write code like this:
logs = Logs("fizz buzz")
print(f"{logs.log=}, {logs.foo=}, {logs.bar=}")
logs.log='fizz buzz', logs.foo='fizz', logs.bar='buzz'
Note that this would not preserve the typing information through to the Logs object in most static analyzers and autocompletes. That to me would be a compelling enough reason not to do this, and continue to use the more verbose way of accessing values as you describe in your question.
If you still really need this, and want to remain type safe. Then I would add properties to the Logs class which fetch from the _Access object.
class Logs:
def __init__(self, log: str) -> None:
self.log = log
self.access = _Access(*log.split())
@property
def foo(self) -> str:
return self.access.foo
@property
def bar(self) -> str:
return self.access.bar
This avoids the type safty issues, and depending on how much code you write using the Logs instances, still can cut down on other boilerplate dramatically.
