Home > Blockchain >  Can I name 2 subsequent columns identical like the previous one? (python)
Can I name 2 subsequent columns identical like the previous one? (python)

Time:01-29

I have a dataset in Excel and some columns are unnamed there: A_ColumnWithName, Unnamed1, Unnamed2, B_ColumnWithName, Unnamed3, Unnamed4

I need to set the names of the currently unnamed columns to the same as the 1st column name from the left, so my columns should look like this:

A_ColumnWithName, A_ColumnWithName, A_ColumnWithName, B_ColumnWithName, B_ColumnWithName, B_ColumnWithName

Any hints how can I do it using Python? An important thing is that there is tons of such columns that's why it's required to do in the most automatic way possible.

CodePudding user response:

You can convert the index to Series and use it to mask the names that contain "Unnamed" and ffill the previous valid name:

cols = df.columns.to_series()
df.columns = cols.mask(cols.str.contains('Unnamed')).ffill()

Note however that having duplicated columns names is not encouraged

example input:

   A Unnamed1 Unnamed2  B Unnamed3  C
0  x        x        x  x        x  x

output:

   A  A  A  B  B  C
0  x  x  x  x  x  x

CodePudding user response:

The following code would work.

import pandas as pd
df = pd.DataFrame( columns= ["A_ColumnWithName", "unnamed","unnamed", "B_ColumnWithName", "unnamed", "unnamed"])

replaceWith = df.columns.values[0]
for i in range(1, len(df.columns)):
    if df.columns[i] == 'unnamed':
        df.columns.values[i] = replaceWith
    else:
        replaceWith = df.columns.values[i]
  •  Tags:  
  • Related