I have a DataFrame like the one below:
Event['EVENT_ID'] = [ 4162, 4161, 4160, 4159,4158, 4157, 4156, 4155, 4154]
I need to convert the value in each row to binary.
Event['b'] = bin(Event['EVENT_ID']) doesn't work:
TypeError: cannot convert the series to <class 'int'>
I expect a new column with the binary representation, with the 0b prefix removed, and then that column split into 16 separate columns. For reference:
bin(4162) = '0b1000001000010'
CodePudding user response:
I don't think using the bin function and working with the str type is particularly efficient. Consider using bitmasks instead: bit i of x is set when x & (1 << i) is nonzero.
for i in range(16):
    # Mask out bit i, then map nonzero -> 1 and zero -> 0
    df[f"bit{i}"] = df["EVENT_ID"].apply(lambda x: x & (1 << i)).astype(bool).astype(int)
Testing with your data, I get the following result (the B column is the binary string of EVENT_ID):
EVENT_ID B bit0 bit1 bit2 bit3 bit4 bit5 bit6 bit7 \
0 4162 1000001000010 0 1 0 0 0 0 1 0
1 4161 1000001000001 1 0 0 0 0 0 1 0
2 4160 1000001000000 0 0 0 0 0 0 1 0
3 4159 1000000111111 1 1 1 1 1 1 0 0
4 4158 1000000111110 0 1 1 1 1 1 0 0
5 4157 1000000111101 1 0 1 1 1 1 0 0
6 4156 1000000111100 0 0 1 1 1 1 0 0
7 4155 1000000111011 1 1 0 1 1 1 0 0
8 4154 1000000111010 0 1 0 1 1 1 0 0
bit8 bit9 bit10 bit11 bit12 bit13 bit14 bit15
0 0 0 0 0 1 0 0 0
1 0 0 0 0 1 0 0 0
2 0 0 0 0 1 0 0 0
3 0 0 0 0 1 0 0 0
4 0 0 0 0 1 0 0 0
5 0 0 0 0 1 0 0 0
6 0 0 0 0 1 0 0 0
7 0 0 0 0 1 0 0 0
8 0 0 0 0 1 0 0 0
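The same bitmask idea can also be written without a per-row apply. The following is only a sketch, assuming NumPy is available and that EVENT_ID holds non-negative integers that fit in 16 bits; the names ids, bits, bit_df and result are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"EVENT_ID": [4162, 4161, 4160, 4159, 4158, 4157, 4156, 4155, 4154]})

# Work on the underlying NumPy array so the shift operator is available.
ids = df["EVENT_ID"].to_numpy()

# Broadcasting: a (n_rows, 1) column shifted by a (16,) row gives (n_rows, 16).
# After shifting right by i, the lowest bit is the original bit i.
bits = (ids[:, None] >> np.arange(16)) & 1

bit_df = pd.DataFrame(bits, columns=[f"bit{i}" for i in range(16)], index=df.index)
result = df.join(bit_df)
print(result)
```

Here bit0 is the least significant bit, matching the column naming in the loop version.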
CodePudding user response:
Try this:
import pandas as pd

Event = pd.DataFrame({'EVENT_ID': [ 4162, 4161, 4160, 4159,4158, 4157, 4156, 4155, 4154]})
# use bin function in a list comprehension and convert each bit to a separate column
binary_values = pd.DataFrame([list(bin(x)[2:]) for x in Event['EVENT_ID']])
# join the new df to Event
df = Event.join(binary_values)
print(df)
EVENT_ID 0 1 2 3 4 5 6 7 8 9 10 11 12
0 4162 1 0 0 0 0 0 1 0 0 0 0 1 0
1 4161 1 0 0 0 0 0 1 0 0 0 0 0 1
2 4160 1 0 0 0 0 0 1 0 0 0 0 0 0
3 4159 1 0 0 0 0 0 0 1 1 1 1 1 1
4 4158 1 0 0 0 0 0 0 1 1 1 1 1 0
5 4157 1 0 0 0 0 0 0 1 1 1 1 0 1
6 4156 1 0 0 0 0 0 0 1 1 1 1 0 0
7 4155 1 0 0 0 0 0 0 1 1 1 0 1 1
8 4154 1 0 0 0 0 0 0 1 1 1 0 1 0
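Note that the output above has 13 columns, because bin only emits as many digits as the largest value needs, and the cells are the characters '0' and '1' rather than integers. If exactly 16 columns are wanted, as in the question, one option is to zero-pad with a format spec. This is a sketch under that assumption; WIDTH and the b15..b0 column names are invented here for illustration.

```python
import pandas as pd

Event = pd.DataFrame({"EVENT_ID": [4162, 4161, 4160, 4159, 4158, 4157, 4156, 4155, 4154]})

WIDTH = 16  # assumed word size, taken from the question

# format(x, "016b") gives a zero-padded binary string without the 0b prefix;
# each character is converted to int so the columns are numeric.
rows = [[int(c) for c in format(x, f"0{WIDTH}b")] for x in Event["EVENT_ID"]]

# Hypothetical column names: b15 is the most significant bit, b0 the least.
columns = [f"b{WIDTH - 1 - i}" for i in range(WIDTH)]
padded = pd.DataFrame(rows, columns=columns, index=Event.index)

out = Event.join(padded)
print(out)
```

Values wider than 16 bits would produce longer strings and break the fixed column count, so this only holds while every EVENT_ID is below 65536.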
CodePudding user response:
You can convert the int Series to binary strings by applying '{0:b}'.format, turn each string into a list of characters, and then expand those lists into separate columns with pd.DataFrame:
df = df.join(pd.DataFrame(df['EVENT_ID'].apply('{0:b}'.format).apply(list).tolist()))
print(df)
EVENT_ID 0 1 2 3 4 5 6 7 8 9 10 11 12
0 4162 1 0 0 0 0 0 1 0 0 0 0 1 0
1 4161 1 0 0 0 0 0 1 0 0 0 0 0 1
2 4160 1 0 0 0 0 0 1 0 0 0 0 0 0
3 4159 1 0 0 0 0 0 0 1 1 1 1 1 1
4 4158 1 0 0 0 0 0 0 1 1 1 1 1 0
5 4157 1 0 0 0 0 0 0 1 1 1 1 0 1
6 4156 1 0 0 0 0 0 0 1 1 1 1 0 0
7 4155 1 0 0 0 0 0 0 1 1 1 0 1 1
8 4154 1 0 0 0 0 0 0 1 1 1 0 1 0
