Appending first element of row in iterrows: convert alpha to numeric equivalent?

Time:01-20

Please see my code below. I'm iterating through strings like '1A', '4D', etc., and I want the output to be 1.1, 4.4, and so on.

Instead of 1A I want 1.1, 1B = 1.2, 4A = 4.1, 5D = 5.4, etc.

Convert alphabet letters to number in Python

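import pandas as pd

data = ['1A','1B','4A', '5D','']
df = pd.DataFrame(data, columns = ['Score'])

newcol = []

# Series.items() replaces iteritems(), which was removed in pandas 2.0
for col, row in df['Score'].items():
    if pd.isnull(row):
        newcol.append(row)
    elif pd.notnull(row):
        # Wanted here: FIRST ELEMENT OF ROW, 1-5, then '.', then the
        # NUMERIC EQUIVALENT OF ALPHA, IE, A=1, B=2, C=3, D=4, etc
        newcol.append(...)  # placeholder for the value described above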

CodePudding user response:

You can use str.replace:

df['Score'] = df['Score'].str.replace(r'\D',
              lambda x: f'.{ord(x.group(0).upper())-64}', regex=True)

output:

  Score
0   1.1
1   1.2
2   4.1
3   5.4
4      
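To see what the callback does, here is a minimal sketch of the same substitution on plain strings with re.sub, without pandas. The function name letter_to_suffix and the list scores are made up for illustration; they are not part of the answer.

```python
import re

def letter_to_suffix(match):
    # ord('A') is 65, so subtracting 64 maps A -> 1, B -> 2, and so on.
    return f".{ord(match.group(0).upper()) - 64}"

scores = ["1A", "1B", "4A", "5D", ""]

# r"\D" matches every non-digit character, one at a time.
converted = [re.sub(r"\D", letter_to_suffix, s) for s in scores]
print(converted)  # ['1.1', '1.2', '4.1', '5.4', '']
```

Note that \D matches any non-digit, including punctuation and spaces. If the data may contain such characters, a stricter pattern such as [A-Za-z] is probably safer.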

CodePudding user response:

Use (with @Ch3steR's comment)-

from string import ascii_uppercase
dic = {j:str(i) for i,j in enumerate(ascii_uppercase, 1)}
df['Score'].str[0] + '.' + df['Score'].str[1].map(dic)

Output

0    1.1
1    1.2
2    4.1
3    5.4
4    NaN
Name: Score, dtype: object
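The positional lookup can be sketched in plain Python to show where the missing value comes from. This is only an illustration: the names letter_to_number and convert are hypothetical, and None stands in for the NaN that pandas produces.

```python
from string import ascii_uppercase

# Same idea as dic above: 'A' -> '1', 'B' -> '2', ..., 'Z' -> '26'.
letter_to_number = {letter: str(i) for i, letter in enumerate(ascii_uppercase, 1)}

def convert(score):
    # Mirrors .str[0] and .str[1]: only the first two characters are used.
    if len(score) < 2 or score[1] not in letter_to_number:
        return None  # pandas would yield NaN here
    return score[0] + "." + letter_to_number[score[1]]

results = [convert(s) for s in ["1A", "5D", "", "10B"]]
print(results)  # ['1.1', '5.4', None, None]
```

Because the second character of '10B' is a digit rather than a letter, this approach only fits scores made of exactly one digit followed by one letter.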

CodePudding user response:

You could build a mapping using str.maketrans and str.translate, a common recipe for mapping each character to its output.

  • str.maketrans

    This static method returns a translation table usable for str.translate().

  • str.translate

    Return a copy of the string where all characters have been mapped through the map, which must be a dictionary of Unicode ordinals (integers) to Unicode ordinals, strings or None. Unmapped characters are left untouched.
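As a quick illustration of the two methods on plain strings (a sketch only; the sample strings are made up):

```python
from string import ascii_uppercase

# Passing a single dict to str.maketrans converts one-character keys to ordinals.
table = str.maketrans({c: f".{i}" for i, c in enumerate(ascii_uppercase, 1)})

print(table[ord("A")])         # .1
print("1A".translate(table))   # 1.1
print("12Z".translate(table))  # 12.26
print("1a".translate(table))   # 1a (lowercase is not in the table, so it is left untouched)
```

Digits and the empty string pass through unchanged because they have no entry in the table.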

Use pd.Series.apply and pass str.translate to it.

from string import ascii_uppercase

table = str.maketrans({c: f'.{i}' for i, c in enumerate(ascii_uppercase, 1)})
df['Score'].apply(str.translate, args=(table, ))

# 0    1.1
# 1    1.2
# 2    4.1
# 3    5.4
# 4       
# Name: Score, dtype: object

Timeit results:

  • benchmarking setup
    # Million rows
    chars = np.arange(1_000_000).astype(str) + pd.Series([random.choice(ascii_uppercase) for _ in range(1_000_000)])
    df = pd.DataFrame({"Score": chars})  
    
  • Results
    @Ch3ster
    582 ms ± 4.42 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    
    @Mozway
    1.03 s ± 46.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    
    @Vivek
    Different output (as of this writing the posted answer
                      only works with a string of size two)
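For a rough comparison without pandas, the two substitution strategies can be timed on a plain list with timeit. This is a sketch under assumptions, not the benchmark above: it uses 10,000 strings instead of a million, works on a list rather than a Series, and the function names are hypothetical. Absolute timings will differ by machine.

```python
import random
import re
import timeit
from string import ascii_uppercase

random.seed(0)
n = 10_000  # far smaller than the million-row setup above
scores = [f"{i}{random.choice(ascii_uppercase)}" for i in range(n)]

table = str.maketrans({c: f".{i}" for i, c in enumerate(ascii_uppercase, 1)})
pattern = re.compile(r"\D")

def with_translate():
    return [s.translate(table) for s in scores]

def with_regex():
    return [pattern.sub(lambda m: f".{ord(m.group(0)) - 64}", s) for s in scores]

# Both strategies should agree before their timings are compared.
assert with_translate() == with_regex()

t_translate = timeit.timeit(with_translate, number=5)
t_regex = timeit.timeit(with_regex, number=5)
print(f"translate: {t_translate:.4f} s, regex: {t_regex:.4f} s")
```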
    

When df is large:

  • If execution time matters, you could use the maketrans + translate solution.

When df is small (size less than 50K):

  • mozway's solution and the maketrans solution take almost the same time, with maketrans being slightly faster.