When working with a dataset in a `pandas.DataFrame`, the label categories are sometimes not equally represented.
Example: bike : car = 7 : 3
| price | label |
|---|---|
| 200 | bike |
| 100 | bike |
| 700 | bike |
| 300 | bike |
| 5500 | car |
| 400 | bike |
| 5200 | car |
| 310 | bike |
| 2000 | car |
| 20 | bike |
In this case, car and bike do not appear in equal numbers, so I want every category to have the same number of rows.
car appears only 3 times, so 4 bike rows are deleted, like this:
| price | label |
|---|---|
| 200 | bike |
| 300 | bike |
| 5500 | car |
| 5200 | car |
| 2000 | car |
| 20 | bike |
The order is not important. I just want every category to have the same number of rows.
I counted the car labels and the bike labels, checked which label has fewer rows (car, in this case), and then read the rows one by one to move them into another DataFrame. This takes a lot of time, so it is inconvenient.
Is there an easier way to make the number of rows per label equal in a pandas DataFrame? Or do I just have to count each label and build another DataFrame?
Thank you.
CodePudding user response:
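One possible approach, as a minimal sketch rather than the only way: find the size of the smallest category with `value_counts()`, then draw that many rows from every category with `groupby(...).sample(...)`. The column names `price` and `label` are taken from the table in the question. The `random_state=0` seed is an arbitrary choice made here so the result is reproducible, and `groupby(...).sample` assumes pandas 1.1 or newer.

```python
import pandas as pd

# Sample data copied from the question: 7 bike rows and 3 car rows.
df = pd.DataFrame({
    "price": [200, 100, 700, 300, 5500, 400, 5200, 310, 2000, 20],
    "label": ["bike", "bike", "bike", "bike", "car",
              "bike", "car", "bike", "car", "bike"],
})

# Number of rows in the least frequent category (3, for "car").
n = int(df["label"].value_counts().min())

# Randomly keep n rows from each category; the seed is arbitrary.
balanced = df.groupby("label").sample(n=n, random_state=0)

print(balanced["label"].value_counts())
```

If random selection is not needed, `df.groupby("label").head(n)` keeps the first `n` rows of each category instead, which avoids looping over rows as well.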

