Assigning a value in a column based on unique values in another

I have two columns. Column A holds string values, some unique and some repeated several times. Column B holds either 1 or 0 (in the code below it is named is_1). Some values in A are paired only with 0 in B, some only with 1, and for some the value in B differs between rows. I'd like to 'override' the zeros: if a value in column A has a 1 in column B on any row, every row where that same value has a 0 should be set to 1. I already have a variable holding all the rows where B equals 1. If possible, I'd like to avoid a for loop with iterrows, which is the obvious first approach:

is_1=data.query('is_1==1')
A_unique=is_1['A'].unique()
for index, row in data.iterrows():
    if row['is_1']==0:
        if row['A'] in A_unique:
            data.loc[data.A==row['A'],'is_1']=1

CodePudding user response:

One approach is to replace the zeros with NaN, sort, and forward fill them:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': list('ABBABCB'), 'B': list('0100011')})
#   A  B
#0  A  0
#1  B  1
#2  B  0
#3  A  0
#4  B  0
#5  C  1
#6  B  1

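# First we replace all '0' strings with NaN
df.loc[df['B'] == '0', 'B'] = np.nan
# Then we sort so that each group's 1s come before its NaNs
df = df.sort_values(['A', 'B'])
# Forward fill within each group of A (a plain ffill would carry a 1
# from one group into the next), then turn the remaining NaNs back into '0'
df['B'] = df.groupby('A')['B'].ffill().fillna('0')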
#   A  B
#0  A  0
#3  A  0
#1  B  1
#6  B  1
#2  B  1
#4  B  1
#5  C  1
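
The sort-and-fill idea above can also be expressed without sorting. This is only a sketch, and it swaps in a different technique: it assumes column B holds integers rather than the strings used in the example above. Taking the per-group maximum of B with groupby and transform sets every row to 1 whenever its value of A has a 1 anywhere, and it keeps the original row order.

```python
import pandas as pd

# Same data as above, but with B stored as integers (an assumption for this sketch)
df = pd.DataFrame({'A': list('ABBABCB'), 'B': [0, 1, 0, 0, 0, 1, 1]})

# For each value of A, broadcast the group maximum of B back onto every row
df['B'] = df.groupby('A')['B'].transform('max')
print(df)
```

Because B only contains 0 and 1, the group maximum is 1 exactly when at least one row in the group is 1.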

CodePudding user response:

Another option is a list comprehension:

import pandas as pd

df = pd.DataFrame({
    'a': ['str1', 'str2', 'str3', 'str1', 'str1', 'str1', 'str4', 'str4'],
    'b': [0, 1, 0, 1, 0, 1, 0, 1]})

 #   a    b 
 #0  str1 0
 #1  str2 1
 #2  str3 0
 #3  str1 1
 #4  str1 0
 #5  str1 1 
 #6  str4 0
 #7  str4 1

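# Collect the values of 'a' that have a 1 in 'b' once, instead of rebuilding the zip for every row
ones = set(df.loc[df['b'] == 1, 'a'])
# Keep each (a, b) pair unless 'a' has a 1 somewhere, in which case force b to 1
tup_list = [(a, 1) if a in ones else (a, b) for a, b in zip(df['a'], df['b'])]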

df = pd.DataFrame(tup_list, columns=['a', 'b'])
 
 #  a    b
 #0 str1 1
 #1 str2 1
 #2 str3 0
 #3 str1 1
 #4 str1 1
 #5 str1 1
 #6 str4 1
 #7 str4 1
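
The membership test in the list comprehension above can also be done column-wise, which avoids building a new DataFrame. This is a rough sketch, assuming the same column names 'a' and 'b'; the name has_one is made up for illustration. Series.isin builds a boolean mask of the rows whose 'a' value has a 1 somewhere, and .loc assigns 1 to 'b' on those rows in place.

```python
import pandas as pd

df = pd.DataFrame({
    'a': ['str1', 'str2', 'str3', 'str1', 'str1', 'str1', 'str4', 'str4'],
    'b': [0, 1, 0, 1, 0, 1, 0, 1]})

# Values of 'a' that have a 1 in 'b' on at least one row
# (has_one is a hypothetical name used only in this sketch)
has_one = df.loc[df['b'] == 1, 'a'].unique()

# Set 'b' to 1 on every row whose 'a' is among those values
df.loc[df['a'].isin(has_one), 'b'] = 1
print(df)
```

This mirrors the loop in the question: the unique values with a 1 are computed once, and the per-row check is replaced by a single mask.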