I need to cut unwanted text from a string in my DataFrame

Hi, I need to cut unwanted text from one string in my DataFrame.

df looks like this:

                                A        B      C
0                             526      204  40.88 
1                             177      173  59.25
2                             196      228  47.24
3  1.0 1393 Name: KP, dtype: int64     155  52.83

In row 3 of column A I need to keep only 1393; everything else needs to be cut/deleted.

This is the code of the function that builds the dictionary. I append this dictionary as the last row of df.

    final_list = {}
    
    for i in list_of_column:
        temp_list = []

        if i == 'KP':
            temp_series = df3['KP'].where(df3['KP'] == 1).dropna().value_counts()
            temp_list.append(temp_series)

        else:
            if i == "KY1":
                temp_list.append(round(hybryd_mean(i,df1),0))
            elif i != "Wiek":
                temp_list.append(round(hybryd_mean(i,df1),2))
            else:
                temp_list.append(hybryd_mean(i,df1))
        final_list[i] = temp_list
    return final_list

And here is the body of the hybryd_mean function used in hybrid_dic:

    rslt_df = df1[name_of_column].mean()
    
    return rslt_df

CodePudding user response：

Assuming the dtype of this column is str, keep short values as they are and take the second space-separated token of longer ones:

    df.A = df.A.apply(lambda x: x if len(x) <= 4 else x.split(' ')[1])
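If the column instead holds a mix of integers and one string (which seems likely when a dictionary row is appended to numeric data), `len(x)` would fail on the integer cells. A minimal sketch of an alternative, assuming the wanted value is either the whole cell or the integer that sits right before `Name:`; the sample frame simply mirrors the one in the question:

```python
import pandas as pd

# Sample data mirroring the question; column A mixes ints and one string.
df = pd.DataFrame({
    "A": [526, 177, 196, "1.0 1393 Name: KP, dtype: int64"],
    "B": [204, 173, 228, 155],
    "C": [40.88, 59.25, 47.24, 52.83],
})

# Capture a run of digits that starts the cell or follows whitespace,
# and is followed either by " Name:" or by the end of the cell.
pattern = r"(?:^|\s)(\d+)(?=\s+Name:|$)"

# Cast every cell to str first so the .str accessor works on all rows,
# then convert the extracted digits back to integers.
df["A"] = df["A"].astype(str).str.extract(pattern, expand=False).astype(int)

print(df)
```

This relies on every cell matching the pattern; a cell that does not match would become NaN and the final `astype(int)` would raise.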

CodePudding user response：

    import re

    import pandas as pd

    data = [[526, 204, 40.88],
            [177, 173, 59.25],
            [196, 228, 47.24],
            ['1.0 1393 Name: KP, dtype: int64', 155, 52.83]]
    df = pd.DataFrame(data, columns=['A', 'B', 'C'])

    # A run of digits surrounded by spaces, e.g. " 1393 "
    pattern = re.compile(r' \d+ ')

    def process(x):
        # Only strings need cleaning; numeric cells pass through unchanged
        if isinstance(x, str):
            result = pattern.search(x)
            if result:
                x = int(result.group().strip())
        return x


    df['A'] = df['A'].apply(process)
    df

Output:
    A       B       C
0   526     204     40.88
1   177     173     59.25
2   196     228     47.24
3   1393    155     52.83
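
The stray text looks like the printed form of a pandas Series: `value_counts()` returns a Series, and the question's code appends that whole Series to `temp_list`, so `1.0` would be the index label and `1393` the count. A minimal sketch of avoiding the problem at the source, assuming only the count of rows where `KP == 1` is needed; the small `df3` below is a hypothetical stand-in for the real data:

```python
import pandas as pd

# Hypothetical stand-in for df3 from the question.
df3 = pd.DataFrame({"KP": [1, 0, 1, 1, 0]})

# The question's expression: this is a Series (index 1.0, one count),
# not a number, so its printed form carries "Name: ..., dtype: int64".
as_series = df3["KP"].where(df3["KP"] == 1).dropna().value_counts()

# Counting the matching rows directly gives a plain Python integer.
kp_count = int((df3["KP"] == 1).sum())

temp_list = []
temp_list.append(kp_count)
print(temp_list)
```

With a plain integer in the dictionary, the appended row should need no string cleanup afterwards.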