Merge three columns into one taking into account priority preference-CodePudding

I have the following DataFrame in pandas:

code	town	village	city
01	Brunete	NaN	NaN
02	NaN	Cabrera d'Anoia	NaN
03	NaN	NaN	Barcelona
04	Zarzalejo	NaN	Madrid
07	Melilla	NaN	City of Melilla
08	Cartagena	Galifa	Region of Murcia

I want to merge the columns town, city and village into one. In case more than 1 column contains a value other than NaN, the resulting column will be assigned the value in this order of preference: village > town > city. Example:

code	merged_column
01	Brunete
02	Cabrera d'Anoia
03	Barcelona
04	Zarzalejo
07	Melilla
08	Galifa

CodePudding user response：

Let us define the order of columns then select those columns and use backfill along columns axis

order = ['village', 'town', 'city']
df['merged'] = df[order].bfill(axis=1).iloc[:, 0]

   code       town          village              city           merged
0     1    Brunete              NaN               NaN          Brunete
1     2        NaN  Cabrera d'Anoia               NaN  Cabrera d'Anoia
2     3        NaN              NaN         Barcelona        Barcelona
3     4  Zarzalejo              NaN            Madrid        Zarzalejo
4     7    Melilla              NaN           Melilla          Melilla
5     8  Cartagena           Galifa  Region of Murcia           Galifa

CodePudding user response：

What you want is a coalesce pandas version. Here you can find a related question to this.

An alternative method is the combine_first method, which can be translated to the sql-coalesce function:

df = pd.DataFrame(
    {"code": ["01", "02", "03", "04", "07", "08"],
    "town": ["Brunete", pd.NA, pd.NA, "Zarzalejo", "Melilla", "Cartagena"],
    "village": [pd.NA, "Cabrera d'Anoia", pd.NA, pd.NA, pd.NA, "Galifa"],
    "city": [pd.NA, pd.NA, "Barcelona", "Madrid", "Melilla", "Region of Murcia"]}
)

df["village"].combine_first(df["town"]).combine_first(df["city"])

----------------------------
0            Brunete
1    Cabrera d'Anoia
2          Barcelona
3          Zarzalejo
4            Melilla
5             Galifa
Name: village, dtype: object
-----------------------------