How to iterate over all the rows in a dataframe and return the results for every row?

Time:01-28

I have a dataframe with 1 column and 3 rows, shown below:

    Text
0   Provided by Hindustan Times Wuhan Institute of...
1   Kattappa continues to narrate how he ended up ...
2   National Commercial Bank (NCB), Saudi Arabia’s...

I'm trying to summarize each of the 3 rows and store the results in a new column, like this:

    Text                                               Summarize
0   Provided by Hindustan Times Wuhan Institute of...   It's related to virus
1   Kattappa continues to narrate how he ended up ...   It's a movie story
2   National Commercial Bank (NCB), Saudi Arabia’s...   Article related to finance

I tried the code below:

for index, row in df.iterrows():
    
    chunks = generate_chunks(row['Text'])
    
    res = summarizer(chunks, max_length=1000, min_length=20)

    text = ' '.join([summ['summary_text'] for summ in res])

print(text)

But the output is only the last summary:

Article related to finance

Can anyone help me with this?

CodePudding user response:

You overwrite the value of text on each iteration. It is first set to "It's related to virus", then replaced by "It's a movie story", and finally replaced by "Article related to finance". Because print(text) runs only once, after the loop has finished, only the last summary is left to print.

Instead of using a single string, use a list of strings and append to it at each iteration, like this:

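summaries = []  # one summary per row, in row order
for index, row in df.iterrows():
    chunks = generate_chunks(row['Text'])
    res = summarizer(chunks, max_length=1000, min_length=20)
    text = ' '.join([summ['summary_text'] for summ in res])
    summaries.append(text)  # keep this row's summary instead of overwriting it

# Store the summaries in the new column the question asks for
df['Summarize'] = summaries
print(summaries)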
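
As an alternative to the explicit loop, the per-row work can be wrapped in a function and passed to Series.apply, which returns one result per row and so can be assigned straight to a new column. The following is only a minimal sketch: generate_chunks and summarizer are not shown in the question, so they are replaced here by hypothetical stand-ins that keep the same call shape, and the sample texts are shortened placeholders rather than the real articles.

```python
import pandas as pd

# Hypothetical stand-ins for the asker's helpers, which the question does not show.
def generate_chunks(text, size=50):
    """Split text into pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def summarizer(chunks, max_length=1000, min_length=20):
    """Fake summarizer: keeps the first three words of each chunk and returns
    them in the list-of-dicts shape that the loop above reads from."""
    return [{"summary_text": " ".join(chunk.split()[:3])} for chunk in chunks]

def summarize(text):
    """Summarize one row's text by joining the per-chunk summaries."""
    res = summarizer(generate_chunks(text), max_length=1000, min_length=20)
    return " ".join(summ["summary_text"] for summ in res)

# Shortened placeholder texts, not the full articles from the question.
df = pd.DataFrame({"Text": [
    "Provided by Hindustan Times",
    "Kattappa continues to narrate",
    "National Commercial Bank (NCB)",
]})

# apply() calls summarize once per row and returns one value per row,
# so nothing is overwritten and the new column lines up with the index.
df["Summarize"] = df["Text"].apply(summarize)
print(df)
```

With the real helpers in place of the stand-ins, only the summarize function and the apply line are needed.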