I'm a Python newbie and have the following problem. Source table:
Timestamp Value
0 2022-01-31T23:00:37Z 79
1 2022-01-31T23:00:38Z 80
2 2022-01-31T23:00:39Z 79
3 2022-01-31T23:00:46Z 79
4 2022-01-31T23:00:47Z 80
... ... ...
17181 2022-02-01T22:56:54Z 79
17182 2022-02-01T22:59:16Z 79
17183 2022-02-01T22:59:17Z 80
17184 2022-02-01T22:59:18Z 80
17185 2022-02-01T22:59:19Z 79
[17186 rows x 2 columns]
I want to divide the values with the following code:
MAX = 79
SCALLING = 100
try:
    cond = result['Value']>MAX
    result.loc[cond,'Value'] = result['Value'].div(SCALLING).round(2)
except:
    result = result
print(result)
The code won't divide the values. I tried iterating over the DataFrame with this code:
MAX = 79
SCALLING = 100
for i in result.index:
    cond = result['Value']>MAX
    result.loc[cond,'Value'] = result['Value'].div(SCALLING).round(2)
print(result)
But then I got: 'TypeError' : '>' not supported between instances of 'dict' and 'int'. This is probably because result.loc[cond,'Value'] has a dictionary datatype. How can I select the specific value?
CodePudding user response:
Your initial code, though not extremely "pythonic", actually works just fine with the data you've provided. E.g.:
import pandas as pd
data = {'Timestamp': {0: '2022-01-31T23:00:37Z',
1: '2022-01-31T23:00:38Z',
2: '2022-01-31T23:00:39Z',
3: '2022-01-31T23:00:46Z',
4: '2022-01-31T23:00:47Z'},
'Value': {0: 79,
1: 80,
2: 79,
3: 79,
4: 80 }}
result = pd.DataFrame(data)
print(result)
Timestamp Value
0 2022-01-31T23:00:37Z 79
1 2022-01-31T23:00:38Z 80
2 2022-01-31T23:00:39Z 79
3 2022-01-31T23:00:46Z 79
4 2022-01-31T23:00:47Z 80
MAX = 79
SCALLING = 100
try:
    cond = result['Value']>MAX
    result.loc[cond,'Value'] = result['Value'].div(SCALLING).round(2)
except:
    result = result
print(result)
Timestamp Value
0 2022-01-31T23:00:37Z 79.0
1 2022-01-31T23:00:38Z 0.8
2 2022-01-31T23:00:39Z 79.0
3 2022-01-31T23:00:46Z 79.0
4 2022-01-31T23:00:47Z 0.8
However, this error message:
'TypeError' : '>' not supported between instances of 'dict' and 'int'
means that you have one or more dict values somewhere in result.Value.
E.g.:
data = {'Timestamp': {0: '2022-01-31T23:00:37Z',
1: '2022-01-31T23:00:38Z',
2: '2022-01-31T23:00:39Z',
3: '2022-01-31T23:00:46Z',
4: '2022-01-31T23:00:47Z'},
'Value': {0: {'Value': 79},
1: 80,
2: 79,
3: 79,
4: 80 }}
result = pd.DataFrame(data)
print(result)
Timestamp Value
0 2022-01-31T23:00:37Z {'Value': 79}
1 2022-01-31T23:00:38Z 80
2 2022-01-31T23:00:39Z 79
3 2022-01-31T23:00:46Z 79
4 2022-01-31T23:00:47Z 80
With this data, result['Value']>MAX raises the aforementioned TypeError; the try ... except construction in your first snippet merely hides it.
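As a rough illustration of what a bare except hides, the sketch below narrows the handler to TypeError and keeps the message. The shortened frame and the name error_message are made up for this example and are not part of your code:

```python
import pandas as pd

# Shortened, hypothetical version of the frame above: one dict in 'Value'
result = pd.DataFrame({
    'Timestamp': ['2022-01-31T23:00:37Z', '2022-01-31T23:00:38Z'],
    'Value': [{'Value': 79}, 80],
})

MAX = 79
error_message = None  # illustrative name, not from the original code
try:
    cond = result['Value'] > MAX
except TypeError as exc:
    # Catch only the expected error type and keep its message visible
    error_message = str(exc)

print(error_message)
```

Catching a specific exception type and printing it makes the problem show up immediately, instead of leaving the DataFrame silently unchanged.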
So, the remedy is to find the dict values in your column, and deal with them. To locate them, you could use:
dict_values = result[result.Value.map(type)==dict]
print(dict_values)
Timestamp Value
0 2022-01-31T23:00:37Z {'Value': 79}
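How to deal with them depends on what the dicts contain. A minimal sketch, assuming every dict stores its number under the key 'Value' as in the example above (the helper unwrap is hypothetical, and your real data may be shaped differently):

```python
import pandas as pd

result = pd.DataFrame({
    'Timestamp': ['2022-01-31T23:00:37Z',
                  '2022-01-31T23:00:38Z',
                  '2022-01-31T23:00:39Z'],
    'Value': [{'Value': 79}, 80, 79],
})

def unwrap(value, key='Value'):
    # Assumption: each dict holds its number under `key`
    return value[key] if isinstance(value, dict) else value

# Replace each dict by the number it holds, then convert the column to float
# so that the scaled values can be written back without a dtype conflict
result['Value'] = pd.to_numeric(result['Value'].map(unwrap)).astype(float)

MAX = 79
SCALLING = 100
cond = result['Value'] > MAX
result.loc[cond, 'Value'] = result.loc[cond, 'Value'].div(SCALLING).round(2)
print(result)
```

Once the column is purely numeric, the comparison with MAX works and no try ... except is needed.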
CodePudding user response:
Here's an answer if your goal is to divide all values by SCALLING when they are larger than MAX. I don't know how you got a dict error. Maybe you have a dict called MAX somewhere?
import pandas as pd
import io
#Read in your example table
result = pd.read_csv(
io.StringIO("""
Timestamp Value
0 2022-01-31T23:00:37Z 79
1 2022-01-31T23:00:38Z 80
2 2022-01-31T23:00:39Z 79
3 2022-01-31T23:00:46Z 79
4 2022-01-31T23:00:47Z 80
17181 2022-02-01T22:56:54Z 79
17182 2022-02-01T22:59:16Z 79
17183 2022-02-01T22:59:17Z 80
17184 2022-02-01T22:59:18Z 80
17185 2022-02-01T22:59:19Z 79
"""),
sep=r'\s+',  # split columns on runs of whitespace
index_col=0,
parse_dates=['Timestamp'],
)
# Divide all values by SCALLING if they are larger than MAX
MAX = 79
SCALLING = 100
cond = result['Value']>MAX
result.loc[cond,'Value'] = result.loc[cond,'Value'].div(SCALLING).round(2)
print(result)
Output
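If you prefer to avoid the .loc assignment, the same conditional division could also be written with Series.mask, which replaces values where the condition is True and keeps the rest. This is only a sketch on the first five rows of your table, with a hypothetical helper variable scaled:

```python
import pandas as pd

result = pd.DataFrame({
    'Timestamp': ['2022-01-31T23:00:37Z',
                  '2022-01-31T23:00:38Z',
                  '2022-01-31T23:00:39Z',
                  '2022-01-31T23:00:46Z',
                  '2022-01-31T23:00:47Z'],
    'Value': [79, 80, 79, 79, 80],
})

MAX = 79
SCALLING = 100

# Scaled version of every value; only some of these will be used
scaled = result['Value'].div(SCALLING).round(2)

# Take the scaled value where Value > MAX, keep the original elsewhere
result['Value'] = result['Value'].mask(result['Value'] > MAX, scaled)
print(result)
```

Because the whole column is replaced by a new Series, the integer column becomes a float column in a single step.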

