How to parse data on all rows, and use this row to populate other dataframes with data from multiple rows ?
I am trying to parse a csv file containing several data entry for training purpose as I am quite new to this technology.
My data consist in 10 columns, and hunderds of rows. The first column is filled with a code that is either 10, 50, or 90. Example :
Dataframe 1 :
| Code | Power/Character |
|---|---|
| 10 | Power-220 |
| 90 | End |
| 10 | Power-290 |
| 90 | End |
| 10 | Power-445 |
| 90 | End |
| 10 | Power-390 |
| 50 | Clotho |
| 50 | Kronus |
| 90 | End |
| 10 | Power-550 |
| 50 | Ares |
| 50 | Athena |
| 50 | Artemis |
| 50 | Demeter |
| 90 | End |
And the list goes on..
On one hand I want to be able to read the first cell, and to populate another dataframe directly if this is a code 10.
On the other hand, I'd like to populate another dataframe with all the codes 50s, but I want to be able to get the data from the previous code 10, as it hold the type of Power that is used, and populate a new column on this dataframe.
The new data frames are supposed to look like this:
Dataframe 2 :
| Code | Power |
|---|---|
| 10 | Power-220 |
| 10 | Power-290 |
| 10 | Power-445 |
| 10 | Power-390 |
| 10 | Power-550 |
Dataframe 3 :
| Code | Character | Power |
|---|---|---|
| 50 | Clotho | Power-390 |
| 50 | Kronus | Power-390 |
| 50 | Ares | Power-550 |
| 50 | Athena | Power-550 |
| 50 | Artemis | Power-550 |
| 50 | Demeter | Power-550 |
So far, I was using iterrows, and I've read everywhere that it was a bad idea.. but i'm struggling implementing another method..
In my code I just create two other dataframes, but I don't know yet a way to retrieve data from the previous cell. I would usually use a classic method, but I think it's rather archaic.
for index, row in df.iterrows():
if (df.iat[index,0] == '10'):
df2 = df2.append(df.loc[index], ignore_index = True)
if (df.iat[index,0] == '50'):
df3 = df3.append(df.loc[index], ignore_index = True)
Any ideas ?
CodePudding user response:
For df2, it's pretty simple:
df2 = df.rename(columns={'Power/Character': 'Power'}) \
.loc[df['Code'] == 10, :]
For df3, it's a bit more complex:
# Extract power and fill forward values
power = df.loc[df['Code'] == 10, 'Power/Character'].reindex(df.index).ffill()
df3 = df.rename(columns={'Power/Character': 'Character'}) \
.assign(Power=power).loc[lambda x: x['Code'] == 50]
Output:
>>> df2
Code Power
0 10 Power-220
2 10 Power-290
4 10 Power-445
6 10 Power-390
10 10 Power-550
>>> df3
Code Character Power
7 50 Clotho Power-390
8 50 Kronus Power-390
11 50 Ares Power-550
12 50 Athena Power-550
13 50 Artemis Power-550
14 50 Demeter Power-550
CodePudding user response:
You could simply copy the required rows to another dataframe,
df2 = df[df.col_1 == '10'].copy()
This will make a new dataframe df2 that contains only the rows from column col_1 that fits some criteria. The copy() function guarantees that the two dataframes are not identical, and changes in one do not affect the other.
If df2 already exists, you can concatenate them
df2 = pd.concat([df2, df[df.col_1 == '10'].copy()])
