| ref_area.label | time | cat | obs_value |
|---|---|---|---|
| France | 2018 | old | 26575.796 |
| France | 2018 | worker | 75067.714 |
| France | 2019 | old | 27118.034 |
| France | 2019 | worker | 74956.094 |
| Hong Kong, China | 2018 | old | 2488.000 |
| Hong Kong, China | 2018 | worker | 9912.692 |
| Hong Kong, China | 2019 | old | 2602.402 |
| Hong Kong, China | 2019 | worker | 9882.588 |
I have the following dataframe. How can I find the ratio of workers/old for each country and time?
CodePudding user response:
Simplier is divide pivoted table by DataFrame.pivot:
df1 = df.pivot(['ref_area.label','time'], 'cat','obs_value')
df1['ratio'] = df1['worker'].div(df1['old'])
df1 = df1.reset_index().rename_axis(None, axis=1)
print (df1)
ref_area.label time old worker ratio
0 France 2018 26575.796 75067.714 2.824665
1 France 2019 27118.034 74956.094 2.764068
2 Hong Kong, China 2018 2488.000 9912.692 3.984201
3 Hong Kong, China 2019 2602.402 9882.588 3.797487
If need original ouput add DataFrame.stack:
df1 = df.pivot(['ref_area.label','time'], 'cat','obs_value')
df1['ratio'] = df1['worker'].div(df1['old'])
df1 = df1.stack().reset_index(name='obs_value')
print (df1)
ref_area.label time cat obs_value
0 France 2018 old 26575.796000
1 France 2018 worker 75067.714000
2 France 2018 ratio 2.824665
3 France 2019 old 27118.034000
4 France 2019 worker 74956.094000
5 France 2019 ratio 2.764068
6 Hong Kong, China 2018 old 2488.000000
7 Hong Kong, China 2018 worker 9912.692000
8 Hong Kong, China 2018 ratio 3.984201
9 Hong Kong, China 2019 old 2602.402000
10 Hong Kong, China 2019 worker 9882.588000
11 Hong Kong, China 2019 ratio 3.797487
