I have 3 excels A,B and C with some data as below.
Characteristics of data in Excel A,B
Each title is unique
For each title, there is a corresponding ID. Please find sample data below.
Excel A Excel B > > **New_Title** **New_ID** **Existing_Title** **Existing_ID** > Title1 W-1 Title3 W-3 > Title2 W-2 Title4 W-4Excel C
> > **IncidentTitle** > Title1 Title1 Title1 Title2 Title2 Title3 Title4
Excel C can have multiple rows with same title.
I am trying to compare data in A, B to C and write ID column to excel C. So that for each row that has same Title, it gets written with respective ID. For eg: Title1 is appeared 3 times in Excel C, so it's ID: W-1 has to be written thrice in the Excel C and so on. I have tried with some approaches but they were unsuccessful. Any help in identifying where I am going wrong is appreciated.
for i in range(length):
\\Omitted some lines of code here.
for j in range(c_len): //length of excel C.
Title = id_map.at[j, 'IncidentTitle'] //id_map is the dataframe containing the data of Excel C
if Title == New_Title:
Jira_ID.insert(j, New_ID)
elif Title == Existing_Title:
Jira_ID.insert(j, Existing_ID)
else:
continue // Can be any placeholder.
CodePudding user response:
Defining the dataframes:
dfA = pd.DataFrame({
'New_Title': ['Title1', 'Title2'],
'New_ID': ['W-1', 'W-2'],
})
dfB = pd.DataFrame({
'Existing_Title': ['Title3', 'Title4'],
'Existing_ID': ['W-3', 'W-4'],
})
dfC = pd.DataFrame({
'IncidentTitle': ['Title1', 'Title1', 'Title1', 'Title2', 'Title2', 'Title3', 'Title4']
})
First we can union the dfA and dfB by pd.concat,
dfAB = pd.concat([
dfA.rename(columns={'New_Title': 'Title', 'New_ID': 'ID'}),
dfB.rename(columns={'Existing_Title': 'Title', 'Existing_ID': 'ID'}),
])
print(dfAB)
Title ID
0 Title1 W-1
1 Title2 W-2
0 Title3 W-3
1 Title4 W-4
Then one of the pandas ways to do your job is by merge, it finds matching keys among the two dataframes dfC and dfAB, where 'IncidentTitle' is used for dfC and 'Title' is used for dfAB, and in case there is a match it merges data from dfAB to dfC (this merging direction is by using how='left')
dfC.merge(dfAB, left_on='IncidentTitle', right_on='Title', how='left').drop(columns='Title')
Final outcome:
IncidentTitle ID
0 Title1 W-1
1 Title1 W-1
2 Title1 W-1
3 Title2 W-2
4 Title2 W-2
5 Title3 W-3
6 Title4 W-4
Note that if you have duplicated Titles in dfAB, the outcome may not be as expected.
