Hi, I need to cut unwanted text from one string in my DataFrame.
The df looks like this:
                                 A    B      C
0                              526  204  40.88
1                              177  173  59.25
2                              196  228  47.24
3  1.0 1393 Name: KP, dtype: int64  155  52.83
In row 3, column A, I need to keep only 1393; everything else should be removed.
This is the code of the function that builds a dictionary. I append this dictionary to df as the last row.
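# NOTE: the def line is missing from the post; this signature is an assumption
def hybrid_dic(list_of_column, df1, df3):
    final_list = {}
    for i in list_of_column:
        temp_list = []
        if i == 'KP':
            # value_counts() returns a Series, so a whole Series is stored here
            temp_series = df3['KP'].where(df3['KP'] == 1).dropna().value_counts()
            temp_list.append(temp_series)
        else:
            if i == "KY1":
                temp_list.append(round(hybryd_mean(i, df1), 0))
            elif i != "Wiek":
                temp_list.append(round(hybryd_mean(i, df1), 2))
            else:
                temp_list.append(hybryd_mean(i, df1))
        final_list[i] = temp_list
    return final_list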
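and here is the hybryd_mean function used in hybrid_dic:
# parameter order inferred from the calls hybryd_mean(i, df1)
def hybryd_mean(name_of_column, df1):
    rslt_df = df1[name_of_column].mean()
    return rslt_df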
CodePudding user response:
Assuming the dtype of this column is str:
# split() with no argument also copes with repeated spaces or newlines
df.A = df.A.apply(lambda x: x if len(x) <= 4 else x.split()[1])
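For anyone who wants to try this on the sample from the question, here is a minimal runnable sketch. It assumes every value in column A is already a string (that is my assumption, the question does not confirm the dtype), and keep_count is just a name I made up for the helper.

```python
import pandas as pd

# Sample data from the question, with column A stored as strings (assumed).
df = pd.DataFrame({
    "A": ["526", "177", "196", "1.0 1393 Name: KP, dtype: int64"],
    "B": [204, 173, 228, 155],
    "C": [40.88, 59.25, 47.24, 52.83],
})

def keep_count(value):
    """Return short values unchanged; for long ones keep the second token."""
    # split() with no argument tolerates runs of spaces or newlines
    return value if len(value) <= 4 else value.split()[1]

df["A"] = df["A"].apply(keep_count)
print(df["A"].tolist())  # ['526', '177', '196', '1393']
```

Note that the values stay strings; if numbers are needed afterwards, pd.to_numeric(df["A"]) would be one way to convert them.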
CodePudding user response:
import re
import pandas as pd

data = [[526, 204, 40.88],
        [177, 173, 59.25],
        [196, 228, 47.24],
        ['1.0 1393 Name: KP, dtype: int64', 155, 52.83]]
df = pd.DataFrame(data, columns=['A', 'B', 'C'])

# one or more digits with whitespace on both sides, e.g. " 1393 "
pattern = re.compile(r'\s\d+\s')

def process(x):
    # only strings are cleaned; numeric cells pass through unchanged
    if isinstance(x, str):
        result = pattern.search(x)
        if result:
            x = result.group().strip()
    return x

df['A'] = df['A'].apply(process)
df
Output:
      A    B      C
0   526  204  40.88
1   177  173  59.25
2   196  228  47.24
3  1393  155  52.83
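As a side note, the stray text in row 3 looks like the printed form of a pandas Series, which is what value_counts() returns in the question's 'KP' branch. It may be simpler to store a plain number there in the first place instead of cleaning the string afterwards. A minimal sketch of that idea follows; the df3 data here is made up for illustration, and kp_count is a hypothetical name.

```python
import pandas as pd

# Hypothetical stand-in for df3; only the KP column matters here.
df3 = pd.DataFrame({"KP": [1, 0, 1, 1, None, 0]})

# What the question's code stores: a whole Series, not a single number.
as_series = df3["KP"].where(df3["KP"] == 1).dropna().value_counts()

# Alternative: count the rows where KP equals 1 and keep a plain int.
kp_count = int((df3["KP"] == 1).sum())

print(type(as_series).__name__)  # Series
print(kp_count)                  # 3
```

Appending kp_count instead of temp_series would put just the number in the cell, assuming the count of rows with KP == 1 is what that row is meant to show.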
