hi was solving this question where i had to return how long the longest common subsequence(lcs) is and i was able to form a recursive function that does the same but now i wonder how to make this function return what the lcs is.
like for example
seq1 = 'serendipitous'
seq2 = 'precipitation'
lcs of seq1 and seq2 is - reipito
i want to make a function that return 'reipito'
the function that i wrote to return the length of lcs is
def lcs_recursive(seq1, seq2, idx1 = 0, idx2 = 0):
if idx1 == len(seq1) or idx2 == len(seq2):
return 0
if seq1[idx1] == seq2[idx2]:
return 1 lcs_recursive(seq1, seq2, idx1 1, idx2 1)
else:
option1 = lcs_recursive(seq1, seq2, idx1 1, idx2)
option2 = lcs_recursive(seq1, seq2, idx1, idx2 1)
return max(option1, option2)
this function returns the length how lcs
i tired to do this but failed. please help
CodePudding user response:
To convert posted code to return the LCS rather than the length.
def lcs_recursive(seq1, seq2, idx1 = 0, idx2 = 0):
if idx1 == len(seq1) or idx2 == len(seq2):
return ""
if seq1[idx1] == seq2[idx2]:
return seq1[idx1] lcs_recursive(seq1, seq2, idx1 1, idx2 1)
else:
option1 = lcs_recursive(seq1, seq2, idx1 1, idx2)
option2 = lcs_recursive(seq1, seq2, idx1, idx2 1)
return max(option1, option2, key = len)
Test
seq1 = 'serendipitous'
seq2 = 'precipitation'
print(lcs_recursive(seq1, seq2))
# Output: 'reipito'
CodePudding user response:
With minimal changes to your code:
def lcs_recursive(seq1, seq2, idx1 = 0, idx2 = 0, lcs = ''):
if idx1 == len(seq1) or idx2 == len(seq2):
return lcs
if seq1[idx1] == seq2[idx2]:
return lcs_recursive(seq1, seq2, idx1 1, idx2 1, lcs seq1[idx1])
else:
option1 = lcs_recursive(seq1, seq2, idx1 1, idx2, lcs)
option2 = lcs_recursive(seq1, seq2, idx1, idx2 1, lcs)
return max(option1, option2, key=len)
seq1 = 'serendipitous'
seq2 = 'precipitation'
print(lcs_recursive(seq1, seq2)) # reipito
Since you're likely calling lcs_recursive with the same index pairs multiple times, functools.cache if using Python >= 3.9 or functools.lru_cache if using Python >= 3.2 may improve performance.
For example, if seq1 = 'ab' and seq2 = 'bd', lcs_recursive('ab', 'bd') will eventually call lcs_recursive('ab', 'bd', 2, 1) twice (you can see this by doing print(idx1, idx2) first thing in your function). With the @lru_cache optimization bellow, on the first call to (idx1=2, idx2=1) (i.e. lcs_recursive('ab', 'bd', 2, 1)), the result corresponding to (idx1=2, idx2=1) is computed and stored (in a cache, which you can think of as a dictionary). On subsequent calls to the same index pair ((idx1=2, idx2=1)), the stored result is looked up so that it doesn't have to be computed all over again. In general, lru_cache will memoize results based on the input. That's why in the code below, the function helper only takes idx1 and idx2 - to make it easier for lru_cache to work well.
from functools import lru_cache
def lcs_recursive(seq1, seq2):
@lru_cache
def helper(idx1 = 0, idx2 = 0):
if idx1 == len(seq1) or idx2 == len(seq2):
return ''
if seq1[idx1] == seq2[idx2]:
return seq1[idx1] helper(idx1 1, idx2 1)
option1 = helper(idx1 1, idx2)
option2 = helper(idx1, idx2 1)
return max(option1, option2, key=len)
return helper()
seq1 = 'serendipitous'
seq2 = 'precipitation'
print(lcs_recursive(seq1, seq2)) # reipito
As a comparison, with seq1 = 'serendipitous' and seq2 = 'precipitation' lcs_recursive_uncached(seq1, seq2) makes 1119564 recursive calls, whereas lcs_recursive_cached(seq1, seq2) makes just 184 recursive calls!:
number_of_calls_uncached = 0
def lcs_recursive_uncached(seq1, seq2, idx1 = 0, idx2 = 0, lcs = ''):
global number_of_calls_uncached
number_of_calls_uncached = 1
if idx1 == len(seq1) or idx2 == len(seq2):
return lcs
if seq1[idx1] == seq2[idx2]:
return lcs_recursive_uncached(seq1, seq2, idx1 1, idx2 1, lcs seq1[idx1])
else:
option1 = lcs_recursive_uncached(seq1, seq2, idx1 1, idx2, lcs)
option2 = lcs_recursive_uncached(seq1, seq2, idx1, idx2 1, lcs)
return max(option1, option2, key=len)
seq1 = 'serendipitous'
seq2 = 'precipitation'
print(number_of_calls_uncached) # 0
print(lcs_recursive_uncached(seq1, seq2)) # reipito
print(number_of_calls_uncached) # 1119564
from functools import lru_cache
number_of_calls_cached = 0
def lcs_recursive_cached(seq1, seq2):
@lru_cache
def helper(idx1 = 0, idx2 = 0):
global number_of_calls_cached
number_of_calls_cached = 1
if idx1 == len(seq1) or idx2 == len(seq2):
return ''
if seq1[idx1] == seq2[idx2]:
return seq1[idx1] helper(idx1 1, idx2 1)
option1 = helper(idx1 1, idx2)
option2 = helper(idx1, idx2 1)
return max(option1, option2, key=len)
return helper()
seq1 = 'serendipitous'
seq2 = 'precipitation'
print(number_of_calls_cached) # 0
print(lcs_recursive_cached(seq1, seq2)) # reipito
print(number_of_calls_cached) # 184
