How to add to an empty list each time a condition is met in a for loop in Python?

I have a for loop that iterates over countries, and each country column holds either a yes or a no. I want the corresponding animal to be added to a list each time a yes is found. For example, I have a list like this:

Countries = ['Germany','France'..etc etc]

My DataFrame looks something like this:

animal  Germany  France  
Rabbit    yes       yes
Bear      no        yes
...

I want a list that contains each animal once for every country in the countries list where it has a yes. So for the example above, I would want:

animal_list = [Rabbit, Rabbit, Bear]

My main code looks something like this. My attempt is below, but it doesn't work. Is there a clean way of doing it?

 Countries = ['Germany','France'..etc etc]
 animals_list = []
 for country in Countries:
   animal_list = animal_list.append(df[df[country] == 'yes'],'animal'])

The for loop is required, so I can't do it directly with pandas alone.

CodePudding user response：

Assuming you have a DataFrame like this (values taken from the table in the question):

import pandas as pd

data = {
    'animal': ['Rabbit', 'Bear'],
    'Germany': ['yes', 'no'],
    'France': ['yes', 'yes'],
}
df = pd.DataFrame(data)

If the wanted countries are given in a list:

# In Python, prefer lowercase, underscore-separated names for variables (PEP 8)

countries = ['Germany', 'France']

Then you can select those columns:

# Select the countries that you want
df_subset = df[df.columns.intersection(countries)]

Then count the number of yes values for each animal:

animals_index_to_num_yes = df_subset.eq('yes').sum(axis=1)

In this way the list can be created very easily:

animals_list = []

# Series.items() replaces iteritems(), which was removed in pandas 2.0
for index, animal in df['animal'].items():
    occurrences = animals_index_to_num_yes.get(index)
    animals_list.extend(
        [animal] * occurrences
    )

Notes:

Try to avoid for loops in pandas where possible. In general, built-in methods perform better because they are vectorized. See this excellent answer for more.
In your case, since the order of the animals in the output list matters, I wasn't sure whether the loop could be avoided, so I used a for loop.
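
On whether the loop can be avoided: one possible loop-free sketch uses Series.repeat, which repeats each element by a per-row count while keeping the row order. The sample DataFrame below mirrors the table in the question and is an assumption for illustration.

```python
import pandas as pd

# Sample data matching the table in the question (assumed for this sketch)
df = pd.DataFrame({
    'animal': ['Rabbit', 'Bear'],
    'Germany': ['yes', 'no'],
    'France': ['yes', 'yes'],
})
countries = ['Germany', 'France']

# Number of 'yes' entries per row across the selected country columns
yes_counts = df[countries].eq('yes').sum(axis=1)

# Repeat each animal name by its count; rows keep their original order
animals_list = df['animal'].repeat(yes_counts.to_numpy()).tolist()
print(animals_list)  # ['Rabbit', 'Rabbit', 'Bear']
```

This groups the output by animal, so it gives the same order as the per-animal loop above.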

CodePudding user response：

You could iterate over the animals and, for each one, count how many times the rest of that row contains yes, then add that animal to the list that many times:

animals_list = []

for i, animal in enumerate(df.animal):
    n = sum(df.iloc[i, 1:] == 'yes')
    animals_list.extend([animal] * n)
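
Note that df.iloc[i, 1:] counts every column after the first. If the DataFrame might hold columns outside the selected countries, a label-based variant could look like the sketch below. The Spain column and the sample data are hypothetical, added only to show the difference.

```python
import pandas as pd

# Sample data from the question plus a hypothetical extra column ('Spain')
df = pd.DataFrame({
    'animal': ['Rabbit', 'Bear'],
    'Germany': ['yes', 'no'],
    'France': ['yes', 'yes'],
    'Spain': ['no', 'yes'],  # not in the selected countries
})
countries = ['Germany', 'France']

animals_list = []
for i, animal in enumerate(df['animal']):
    # Select this row's values by column label, so 'Spain' is ignored
    n = int((df.iloc[i][countries] == 'yes').sum())
    animals_list.extend([animal] * n)

print(animals_list)  # ['Rabbit', 'Rabbit', 'Bear']
```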

CodePudding user response：

animals_list = []
# Names must match the DataFrame's column labels exactly (case-sensitive)
country_list = ['Germany', 'France']

for i in range(len(df)):
    for country in country_list:
        if df[country].iloc[i] == 'yes':
            animals_list.append(df.animal.iloc[i])

print(animals_list)

Output: ['Rabbit', 'Rabbit', 'Bear']
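
The loop from the question can also be kept country-first. The original attempt fails partly because list.append returns None, so assigning its result back replaces the list with None. A minimal sketch using extend instead, assuming the sample data from the question's table:

```python
import pandas as pd

# Sample data matching the table in the question (assumed for this sketch)
df = pd.DataFrame({
    'animal': ['Rabbit', 'Bear'],
    'Germany': ['yes', 'no'],
    'France': ['yes', 'yes'],
})
countries = ['Germany', 'France']

animals_list = []
for country in countries:
    # Animals whose entry in this country's column is 'yes'
    matches = df.loc[df[country] == 'yes', 'animal']
    # extend() changes the list in place and returns None, so do not reassign
    animals_list.extend(matches.tolist())

print(animals_list)  # ['Rabbit', 'Rabbit', 'Bear']
```

Because the outer loop runs over countries, the result is grouped by country rather than by animal, so with other data the order can differ from the per-animal loops above.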