split a string to have chunks containing the maximum number of possible characters

Time:01-21

e.g. string = 'bananaban' => ['ban', 'anab', 'an']

My attempt:

def apart(string):
    letters = []
    for i in string:
        while i not in letters:
            letters.append(i)
    print("The letters are:" + str(letters))
    x = []
    result = []
    return result

string = str(input("Enter string: "))
print(apart(string))

Basically, once I know all the letters that occur in the word/string, I want to add characters to x until x contains all of those letters. Then I want to add x to result and start a new x.

In my example "bananaban" that means [ban] is one x, because "ban" contains the letters "b", "a" and "n". The same goes for [anab]. [an] only contains "a" and "n" because it is the end of the word.

Would be cool if somebody could help me ^^

CodePudding user response:

IIUC, you want to split after all characters are in the current chunk.

You could use a set to keep track of the seen characters:

s = 'bananaban'

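seen = set()         # characters seen in the current chunk
letters = set(s)     # all distinct characters of the string
out = ['']
for c in s:
    if seen != letters:
        # current chunk is still missing a character: extend it
        out[-1] += c
        seen.add(c)
    else:
        # current chunk is complete: start a new one with c
        seen = set(c)
        out.append(c)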

output: ['ban', 'anab', 'an']

CodePudding user response:

The logical way seems to be to first create a set with all the letters in your string, then go over the original string, collecting each character and starting a new collection each time the set of letters in the current collection matches the original set.

def apart(string):
    target = set(string)
    result = []
    component = ""
    for char in string:
        component += char
        if set(component) == target:
            result.append(component)
            component = ""
    if component:
        result.append(component)
    return result
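
The function above rebuilds `set(component)` for every character. As a possible variation, here is a minimal sketch that instead tracks which letters are still missing from the current chunk; the name `apart_incremental` and the slicing by index are assumptions for illustration, not part of the answer above.

```python
def apart_incremental(string):
    """Split `string` into chunks, closing a chunk as soon as it
    contains every distinct character of the whole string."""
    target = set(string)
    result = []
    start = 0                   # index where the current chunk begins
    missing = set(target)       # letters the current chunk still lacks
    for i, char in enumerate(string):
        missing.discard(char)
        if not missing:         # chunk now holds every letter
            result.append(string[start:i + 1])
            start = i + 1
            missing = set(target)
    if start < len(string):     # leftover, incomplete final chunk
        result.append(string[start:])
    return result


print(apart_incremental("bananaban"))
```

For "bananaban" this should print the same list as in the question, ['ban', 'anab', 'an'], and an empty input returns an empty list.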

CodePudding user response:

Using a set of the characters in the string, you can loop through the string and add or extend the last group in your resulting list:

S = "bananaban"
chars  = set(S)                    # distinct characters of string
groups = [""]                      # start with an empty group
for c in S:
    if chars.issubset(groups[-1]): # group contains all characters
        groups.append(c)           # start a new group
    else:
        groups[-1] += c            # append character to last group
        
print(groups)
['ban', 'anab', 'an']
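
If the loop is needed in more than one place, it could be wrapped in a function. The sketch below is one way to do that; the name `split_chunks` is hypothetical, and the guard for an empty string is an added assumption (without it the result would be a list holding one empty string).

```python
def split_chunks(s):
    """Group the characters of `s` so that each group, except possibly
    the last, contains every distinct character of `s`."""
    if not s:                              # assumption: empty input gives no groups
        return []
    chars = set(s)                         # distinct characters of string
    groups = [""]                          # start with an empty group
    for c in s:
        if chars.issubset(groups[-1]):     # group contains all characters
            groups.append(c)               # start a new group
        else:
            groups[-1] += c                # append character to last group
    return groups


print(split_chunks("bananaban"))
```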