My respects, colleagues. I need to write a function that determines the maximum number of consecutive BA or CA character pairs in a given string. For example:
print(f("BABABA125")) # -> 3
print(f("234CA4BACA")) # -> 2
print(f("BABACABACA56")) # -> 5
print(f("1BABA24CA")) # -> 2
I have already written a function, but to my mind it's not very good:
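def f(s: str) -> int:
    res = 0
    if not s:
        return res
    cur = 0
    i = len(s) - 1
    # Scan from the end of the string towards the start.
    while i >= 0:
        # i > 0 keeps s[i-1] from wrapping around to the last character.
        if i > 0 and s[i] == "A" and (s[i-1] == "B" or s[i-1] == "C"):
            cur += 1
            i -= 2
        else:
            if cur > res:
                res = cur
            cur = 0
            i -= 1
    else:
        if cur > res:
            res = cur
    return res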
In addition, I'm not allowed to use libraries or regular expressions (only string and list methods). Could you please help me, or rate my code in this context? I'll be very grateful.
CodePudding user response:
Here's a function f2 that performs this operation.
1. if not re.search('(BA|CA)', s): return 0
   First check whether the string actually contains any BA or CA (to prevent ValueError: max() arg is an empty sequence in step 3), and return 0 if there are none.
2. matches = re.finditer(r'(?:CA|BA)+', s)
   Find all runs of consecutive CA or BA. The group is non-capturing because only the full match, m.group(0), is needed.
3. res = max(matches, key=lambda m: len(m.group(0)))
   Then, among the matches (re.Match objects), fetch each matched substring using m.group(0) and compare their lengths to find the longest one.
4. return len(res.group(0))//2
   Divide the length of the longest result by 2 to get the number of BAs or CAs in this substring. Here we use floor division (//) so that the output is an int, since true division (/) would return a float.
import re

strings = [
    "BABABA125",  # 3
    "234CA4BACA",  # 2
    "BABACABACA56",  # 5
    "1BABA24CA",  # 2
    "NO_MATCH_TO_BE_FOUND",  # 0
]

def f2(s: str):
    if not re.search('(BA|CA)', s): return 0
    # The + quantifier matches a whole run of consecutive pairs.
    matches = re.finditer(r'(?:CA|BA)+', s)
    res = max(matches, key=lambda m: len(m.group(0)))
    return len(res.group(0))//2

for s in strings:
    print(f2(s))
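The pre-check in step 1 exists only to keep max() from receiving an empty sequence. As a possible alternative, max() accepts a default keyword argument (Python 3.4 and later) that is returned when the iterable is empty. The sketch below assumes that version; the name f2_default is hypothetical and not part of the original answer.

```python
import re

def f2_default(s: str) -> int:
    # Hypothetical variant of f2: default=0 covers the no-match case,
    # so the separate re.search pre-check is not needed.
    longest = max(
        (len(m.group(0)) for m in re.finditer(r"(?:CA|BA)+", s)),
        default=0,
    )
    # Each pair is two characters long.
    return longest // 2

for s in ["BABABA125", "234CA4BACA", "BABACABACA56", "1BABA24CA", "NO_MATCH_TO_BE_FOUND"]:
    print(s, f2_default(s))
```

This should print 3, 2, 5, 2 and 0 for the five sample strings, the same values as the comments in the list above.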
UPDATE: Thanks to @StevenRumbalski for providing a simpler version of the above answer. (I split it into multiple lines for readability)
import re

def f3(s):
    if not re.search('(BA|CA)', s): return 0
    # The + quantifier matches a whole run of consecutive pairs.
    matches = re.findall(r'(?:CA|BA)+', s)
    max_length = max(map(len, matches))
    return max_length // 2
1. if not re.search('(BA|CA)', s): return 0
   Same as above.
2. matches = re.findall(r'(?:CA|BA)+', s)
   Find all runs of consecutive CA or BA. Each value in matches is a str instead of a re.Match, which is easier to handle.
3. max_length = max(map(len, matches))
   Map each matched substring to its length and find the maximum length among them.
4. return max_length // 2
   Floor divide the length of the longest matching substring by the length of BA or CA (that is, 2) to get the number of consecutive occurrences of BA or CA in this string.
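The non-capturing group matters for re.findall in particular: when a pattern contains exactly one capturing group, findall returns the text of that group rather than the whole match. A minimal sketch of the difference, using one of the sample strings (the variable names are only for illustration):

```python
import re

s = "BABACABACA56"

# With a capturing group, findall returns the group's last repetition.
capturing = re.findall(r"(CA|BA)+", s)

# With a non-capturing group, findall returns each full match.
non_capturing = re.findall(r"(?:CA|BA)+", s)

print(capturing)      # ['CA']
print(non_capturing)  # ['BABACABACA']
```

Only the second form gives a string whose length can be divided by 2 to count the pairs.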
CodePudding user response:
Here's an alternative implementation without any imports. Do note however that it's quite slow compared to your C-style implementation.
The idea is simple: Transform the input string into a string consisting of only two types of characters c1 and c2, with c1 representing CA or BA, and c2 representing anything else. Then find the longest substring of consecutive c1s.
The implementation is as follows:
- Pick a char that is guaranteed not to appear in the input string; the code below uses a space (" ") as an example. Then pick a second char different from the first; here we use -.
- Replace each occurrence of CA and BA with a space.
- Replace everything else in the string (everything that is not a space) with a - (this is why a space cannot be present in the original input string). Now we have a string consisting purely of spaces and -s.
- Split the string with - as the delimiter, and map each resulting substring to its length.
- Return the maximum of these substring lengths.
strings = [
    "BABABA125",  # 3
    "234CA4BACA",  # 2
    "BABACABACA56",  # 5
    "1BABA24CA",  # 2
    "NO_MATCH_TO_BE_FOUND",  # 0
]

def f4(string: str):
    string = string.replace("CA", " ")
    string = string.replace("BA", " ")
    string = "".join([(c if c == " " else "-") for c in string])
    str_list = string.split("-")
    str_lengths = map(len, str_list)
    return max(str_lengths)

for s in strings:
    print(f4(s))
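For comparison, a single forward pass is also possible without imports or placeholder characters. The sketch below is one way it might look, using only indexing and slicing; the name longest_pair_run is hypothetical. Scanning forward with the bound len(s) - 1 means no negative index is ever read (in Python, s[-1] is the last character, so a backward scan needs an explicit guard at index 0).

```python
def longest_pair_run(s: str) -> int:
    # Hypothetical helper: longest run of consecutive "BA"/"CA" pairs.
    best = 0
    cur = 0
    i = 0
    # Stop one short of the end so that s[i:i + 2] is always two characters.
    while i < len(s) - 1:
        if s[i:i + 2] in ("BA", "CA"):
            cur += 1
            if cur > best:
                best = cur
            i += 2  # skip past the pair just counted
        else:
            cur = 0  # the run is broken
            i += 1
    return best

for s in ["BABABA125", "234CA4BACA", "BABACABACA56", "1BABA24CA", "NO_MATCH_TO_BE_FOUND"]:
    print(s, longest_pair_run(s))
```

Taking a pair as soon as it is seen is safe here because both pairs end in A and start with B or C, so a pair can never begin on the second character of another pair.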
