I would like to extract only the codes that sit between # characters and concatenate them into a new column.
What I have:
| id | code |
|---|---|
| 0 | (#M05Q01900R00100# = 1) AND (#M05Q01950R00200# = 0) |
| 1 | (#M05Q01900R00100# = 1) AND ((#M05Q01950R00100# = 0) OR (#M05Q01950R00200# = 0)) |
| 2 | (#M05Q01600R00100# = 1) |
| 3 | (#M05Q01125R00200# = 1) |
| 4 | (#M05Q01129R00100# = 1) AND (#M05Q01130R00300# = 0) |
| 5 | (#M05Q01130R00400# = 1) |
| 6 | (#M05Q01132R00100# = 1) |
| 7 | (#M05Q01132R00400# = 1) |
| 8 | (#M05Q01132R00400# = 1) |
| 9 | (#M05Q01140R00200# = 1) |
What I would like to get:
| id | code | concat |
|---|---|---|
| 0 | (#M05Q01900R00100# = 1) AND (#M05Q01950R00200# = 0) | M05Q01900R00100, M05Q01950R00200 |
| 1 | (#M05Q01900R00100# = 1) AND ((#M05Q01950R00100# = 0) OR (#M05Q01950R00200# = 0)) | M05Q01900R00100, M05Q01950R00100, M05Q01950R00200 |
| 2 | (#M05Q01600R00100# = 1) | M05Q01600R00100 |
| 3 | (#M05Q01125R00200# = 1) | M05Q01125R00200 |
| 4 | (#M05Q01129R00100# = 1) AND (#M05Q01130R00300# = 0) | M05Q01129R00100, M05Q01130R00300 |
| 5 | (#M05Q01130R00400# = 1) | M05Q01130R00400 |
| 6 | (#M05Q01132R00100# = 1) | M05Q01132R00100 |
| 7 | (#M05Q01132R00400# = 1) | M05Q01132R00400 |
| 8 | (#M05Q01132R00400# = 1) | M05Q01132R00400 |
| 9 | (#M05Q01140R00200# = 1) | M05Q01140R00200 |
CodePudding user response:
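Use Series.str.findall with a non-greedy regex that captures the values between each pair of # characters, then combine each row's list of matches into one string with Series.str.join: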
df['concat'] = df['code'].str.findall(r'#(.*?)#').str.join(', ')
print(df)
id code \
0 0 (#M05Q01900R00100# = 1) AND (#M05Q01950R00200#...
1 1 (#M05Q01900R00100# = 1) AND ((#M05Q01950R00100...
2 2 (#M05Q01600R00100# = 1)
3 3 (#M05Q01125R00200# = 1)
4 4 (#M05Q01129R00100# = 1) AND (#M05Q01130R00300#...
5 5 (#M05Q01130R00400# = 1)
6 6 (#M05Q01132R00100# = 1)
7 7 (#M05Q01132R00400# = 1)
8 8 (#M05Q01132R00400# = 1)
9 9 (#M05Q01140R00200# = 1)
concat
0 M05Q01900R00100, M05Q01950R00200
1 M05Q01900R00100, M05Q01950R00100, M05Q01950R00200
2 M05Q01600R00100
3 M05Q01125R00200
4 M05Q01129R00100, M05Q01130R00300
5 M05Q01130R00400
6 M05Q01132R00100
7 M05Q01132R00400
8 M05Q01132R00400
9 M05Q01140R00200
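
For anyone who wants to try this without the original data, here is a minimal self-contained sketch. It assumes pandas is installed and rebuilds only the first three rows of the question's table; the DataFrame construction is illustrative, not how the original data was loaded.

```python
import pandas as pd

# Rebuild the first three rows of the table from the question.
df = pd.DataFrame({
    "id": [0, 1, 2],
    "code": [
        "(#M05Q01900R00100# = 1) AND (#M05Q01950R00200# = 0)",
        "(#M05Q01900R00100# = 1) AND ((#M05Q01950R00100# = 0) OR (#M05Q01950R00200# = 0))",
        "(#M05Q01600R00100# = 1)",
    ],
})

# findall returns, per row, a list of the captured groups: the text between
# each pair of '#'. The non-greedy '.*?' stops at the nearest closing '#'.
matches = df["code"].str.findall(r"#(.*?)#")

# str.join turns each list into a single comma-separated string.
df["concat"] = matches.str.join(", ")

print(df["concat"].tolist())
```

The non-greedy quantifier matters here: with a greedy .* the pattern would run from the first # in a row to the last one, so a row holding several codes would produce a single long match instead of one match per code.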
