I'm trying to get a list of required names from a list of names using a regex query.
About the csv file: FYI, I converted the countries from capital to small letters.

searchList:
['AU.LS1_james.aus',
'AU.LS1_scott.aus',
'AP.LS1_amanda.usa',
'AP.LS1_john.usa',
'LA.LS1_harsha.ind',
'LA.LS1_vardhan.ind']
I'm trying to get one list per group, like this:
[
['AU.LS1_james.aus', 'AU.LS1_scott.aus'],
['AP.LS1_amanda.usa', 'AP.LS1_john.usa'],
['LA.LS1_harsha.ind', 'LA.LS1_vardhan.ind']
]
Using the following regex query: \<({region}).*\{country}\>
for region, country in regionCountry:
    query = f"\<({region}).*\{country}\>"
    r = re.compile(query)
    group = list(filter(r.match, searchList))
I tried re.search as well, but the result is always None.
FYI: I also tried this query in notepad's find using the regex functionality.
Can anyone tell me where my script is going wrong? Thank you.
CodePudding user response:
Without regex:
Use split and a dictionary to group the entries:
Data
entries = ['AU.LS1_james.aus', 'AU.LS1_scott.aus', 'AP.LS1_amanda.usa', 'AP.LS1_john.usa', 'LA.LS1_harsha.ind', 'LA.LS1_vardhan.ind']
Solution 1: simple dict and setdefault
d = {}
for entry in entries:
    d.setdefault(entry.split('.',1)[0], []).append(entry)
Solution 2: defaultdict
from collections import defaultdict
d = defaultdict(list)
for entry in entries:
    d[entry.split('.',1)[0]].append(entry)
The result is in d.values():
>>> list(d.values())
[['AU.LS1_james.aus', 'AU.LS1_scott.aus'],
['AP.LS1_amanda.usa', 'AP.LS1_john.usa'],
['LA.LS1_harsha.ind', 'LA.LS1_vardhan.ind']]
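If the groups also need to line up with the (region, country) pairs from the question, the same split idea can key on both ends of the name. This is only a sketch: it assumes the region is the text before the first dot and the country is the text after the last dot, which holds for the sample data but is not confirmed for the real csv file.

```python
from collections import defaultdict

entries = ['AU.LS1_james.aus', 'AU.LS1_scott.aus',
           'AP.LS1_amanda.usa', 'AP.LS1_john.usa',
           'LA.LS1_harsha.ind', 'LA.LS1_vardhan.ind']

# Group by (region, country); both parts are assumed from the sample format.
groups = defaultdict(list)
for entry in entries:
    region = entry.split('.', 1)[0]      # text before the first dot
    country = entry.rsplit('.', 1)[-1]   # text after the last dot
    groups[(region, country)].append(entry)
```

groups[('AU', 'aus')] then holds the two AU entries, and list(groups.values()) gives the same nested list as above.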
CodePudding user response:
Thank you all for trying to help with my question. This answer worked out well for my usage. For some reason Python doesn't like \< and \>, so I just removed them and it worked fine. I didn't expect the re library to have limitations like this.
Answer:
({region}).*\{country}
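For anyone landing here later: Python's re module has no \< and \> word-boundary anchors. An escaped punctuation character simply matches itself, so \< looks for a literal '<' and nothing in the list matches. The closest equivalent in Python is \b. Below is a minimal sketch of the loop with \b, assuming regionCountry holds pairs such as ('AU', 'aus') (the question does not show its contents) and that the country is the suffix after the last dot. re.escape is used so that any regex metacharacters in the csv values are matched literally.

```python
import re

search_list = ['AU.LS1_james.aus', 'AU.LS1_scott.aus',
               'AP.LS1_amanda.usa', 'AP.LS1_john.usa',
               'LA.LS1_harsha.ind', 'LA.LS1_vardhan.ind']

# Assumed contents of regionCountry; the original post does not show them.
region_country = [('AU', 'aus'), ('AP', 'usa'), ('LA', 'ind')]

groups = []
for region, country in region_country:
    # \b is Python's word boundary; \. matches the literal dots in the names.
    pattern = re.compile(
        rf"\b{re.escape(region)}\..*\.{re.escape(country)}\b"
    )
    # match() anchors at the start of each name.
    groups.append([name for name in search_list if pattern.match(name)])
```

With the sample data, groups ends up as the three two-element lists from the question. Using a raw f-string (rf"...") keeps the backslashes intact while still filling in region and country.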

