I have a dataframe with one column of item names and one column of numbers. I would like to create a list in which each item name is repeated the number of times given in the number column.
| item | number |
|---|---|
| cat | 2 |
| dog | 3 |
| parrot | 4 |
My desired output is:
| item |
|---|
| cat |
| cat |
| dog |
| dog |
| dog |
| parrot |
| parrot |
| parrot |
| parrot |
I feel like I'm quite close with this code:
for index in df.iterrows():
    for x in range(2):
        print(df.item)
However, I can't find a way to replace the 2 in range(2) with the number from the dataframe. df.numbers doesn't seem to work.
CodePudding user response:
Since your desired output is a list, you can follow @Michael's comment and do this:
list(df.item.repeat(df.number))
The output would be:
['cat', 'cat', 'dog', 'dog', 'dog', 'parrot', 'parrot', 'parrot', 'parrot']
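If you need a one-column dataframe like the table in the question rather than a list, one possible sketch is to repeat the index labels and select rows with them. This assumes the column names `item` and `number` from the question; `as_list` and `expanded` are just illustrative names.

```python
import pandas as pd

# Sample data matching the table in the question.
df = pd.DataFrame({"item": ["cat", "dog", "parrot"], "number": [2, 3, 4]})

# Series.repeat accepts one repeat count per element.
as_list = df["item"].repeat(df["number"]).tolist()

# Repeat each index label `number` times, select those rows,
# then renumber the rows from 0.
expanded = df.loc[df.index.repeat(df["number"]), ["item"]].reset_index(drop=True)

print(as_list)
print(expanded)
```

Bracket access such as `df["item"]` is used here because it keeps working even when a column name happens to match a DataFrame attribute or method.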
CodePudding user response:
If you were keen on using .iterrows() then you might do something like:
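import pandas

df = pandas.DataFrame([
    {"item": "cat", "number": 2},
    {"item": "dog", "number": 3},
    {"item": "parrot", "number": 4},
])

new_list = []
# iterrows() yields (index, row) pairs, so unpack both.
for index, row in df.iterrows():
    # Repeat the item name row["number"] times and append the copies.
    new_list.extend([row["item"]] * row["number"])
print(new_list)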
This relies on the fact that, in Python, ["x"] * 3 == ["x", "x", "x"].
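As a small sketch of that list-repetition behaviour, here is the same loop written without .iterrows(), zipping the two columns together instead. The sample data is copied from the question, and `repeated` is only an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({"item": ["cat", "dog", "parrot"], "number": [2, 3, 4]})

# Multiplying a list by an integer repeats its elements.
assert ["x"] * 3 == ["x", "x", "x"]

repeated = []
# Walk the two columns in parallel, one (item, count) pair per row.
for item, count in zip(df["item"], df["number"]):
    repeated.extend([item] * count)

print(repeated)
```

Zipping the columns avoids building a Series for every row, which is what .iterrows() does, so it tends to be lighter on larger dataframes.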
