I'm wondering how to sort a list as such:
list_1 = ['apple', 'grape', 'orange', 'orange', 'apple', 'grape', 'apple']
to this:
list_1 = [['apple', 'apple', 'apple'], ['grape', 'grape'], ['orange', 'orange']]
Only using a raw algorithm rather than any sort functions.
The reason why is that I'll use this algorithm in conjunction with a function that matches strings based on how similar the text is because the strings I'm sorting aren't the same, and relying on any sorting function wouldn't allow me to use that function
Any guidance and help is greatly appreciated!
CodePudding user response:
You can use this kind of algorithm. First sort the list using any sorting algorithms, let us use bubble sort for now. Then count the frequency of all elements and store it in a dictionary and then and convert it back into a nested list. So you can do something like this:
def bubble_sort(ls):
swapped = True
while swapped:
swapped = False
for i in range(len(ls) - 1):
if ls[i] > ls[i 1]:
# Swap the elements
ls[i], ls[i 1] = ls[i 1], ls[i]
swapped = True
return ls
def nest(ls):
ls2 = {}
res = []
for i in ls:
if i in ls2:
ls2[i] = 1
else:
ls2[i] = 1
i = 0
while i<len(ls):
res.append([ls[i]]*ls2[ls[i]])
i =ls2[ls[i]]
return res
print(nest(bubble_sort(['apple', 'grape', 'orange', 'orange', 'apple', 'grape', 'apple'])))
CodePudding user response:
To sort this list you can run this code:
list_1 = ['apple', 'grape', 'orange', 'orange', 'apple', 'grape', 'apple']
list_1.sort()
dictionary = {}
for item in list_1:
if item not in dictionary:
dictionary[item] = 1
else:
dictionary[item] = 1
list_1 = []
for word in dictionary:
list_2 = [word]*dictionary[word]
list_1.append(list_2)
print(list_1)
This outputs
[['apple', 'apple', 'apple'], ['grape', 'grape'], ['orange', 'orange']]
The second line of code is optional, you can remove it if you don't need the last list to be in alphabetical order.
CodePudding user response:
Assuming that your word similarity function is transitive, you can progressively add words to the group they belong to, adding new groups as you go for words that don't find any similarity:
def groupWords(L,isSimilar=lambda a,b:a==b):
result = []
for word in L: # for each word in list
for group in result: # find a similar word group
if isSimilar(word,group[0]):
group.append(word) # add to the group
break
else:
result.append([word]) # new group if no match
return result
output:
list_1 = ['apple', 'grape', 'orange', 'orange', 'apple', 'grape', 'apple']
list_2 = groupWords(list_1)
print(list_2)
[['apple', 'apple', 'apple'], ['grape', 'grape'], ['orange', 'orange']]
You can supply the function with your own similarity check function or make it part of the loop directly
Note that, if your similarity check is not transitive, words will go into the first matching group and the result will be dependent on the order of the input
