How to append the last 4 letters of every item in a dataframe column to a list

I have an inventory system that stores two separate SKUs in a single string, for example FT3M-FL3M. I want to make a list called "first_item" and append the first 4 characters (the first SKU) of every item in the column to it. Then I want to do the same with the last 4 characters of each string (the second SKU).

This is the example dataframe

| Index    | Skus           |
| -------- | -------------- |
| First    | FT3M-FL3M      |
| Second   | FT2M-FL2M      |
| Second   | FT1M-FB1M      |
| Second   | FT4S-FL4S      |
| Second   | FT4S-FB2M      |

This is the code I have:

first_list = []
for i in sku:
    first_list.append(i[0:4])
print(first_list)

It clearly doesn't work and I get a massive list.

CodePudding user response：

If you need a list of the first or last n characters of each string in a pandas column, you can try:

# first
my_first = [x[:4] for x in df.wanted_column]

# last
my_last = [x[-4:] for x in df.wanted_column]
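
As a minimal sketch, the two comprehensions above can be run against the sample data from the question. The DataFrame construction is an assumption (the question does not show how df is built), and the column name 'Skus' is taken from the example table. The last line uses the pandas .str accessor, which should give the same result as the comprehension.

```python
import pandas as pd

# Sample data copied from the question's table (how df is built is an assumption).
df = pd.DataFrame(
    {"Skus": ["FT3M-FL3M", "FT2M-FL2M", "FT1M-FB1M", "FT4S-FL4S", "FT4S-FB2M"]}
)

# First four characters of each string (the first SKU).
first_list = [s[:4] for s in df["Skus"]]

# Last four characters of each string (the second SKU).
second_list = [s[-4:] for s in df["Skus"]]

# Equivalent slicing through the pandas string accessor.
first_vectorized = df["Skus"].str[:4].tolist()

print(first_list)
print(second_list)
```

Slicing by position only works while every SKU is exactly four characters long; if the lengths vary, splitting on the '-' separator is the safer choice.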

A list comprehension also lets you filter the list on the fly if needed, for example:

my_filtered = [x[:4] for x in df.wanted_column if 'some_sku' not in x]
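
To make the filtering concrete, here is a small sketch using the question's sample data. The substring 'FB' is a hypothetical filter value chosen only for illustration, and the DataFrame construction is again an assumption.

```python
import pandas as pd

# Sample data copied from the question's table (how df is built is an assumption).
df = pd.DataFrame(
    {"Skus": ["FT3M-FL3M", "FT2M-FL2M", "FT1M-FB1M", "FT4S-FL4S", "FT4S-FB2M"]}
)

# Hypothetical filter: skip every row whose string contains 'FB'.
excluded = "FB"

# Keep the first SKU only for rows that pass the filter.
my_filtered = [s[:4] for s in df["Skus"] if excluded not in s]

print(my_filtered)
```

Note that the condition is tested against the whole string, not just the four-character slice, so a match anywhere in either SKU drops the row.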

CodePudding user response：

Assuming you have a pandas DataFrame, you can use pandas methods to get your desired outcome. Use the str.split method on the 'Skus' column, passing True to the expand parameter so that the strings are split into separate columns.

skus_split = df['Skus'].str.split('-', expand=True)

Then skus_split is a DataFrame that looks like:

      0     1
1  FT3M  FL3M
2  FT2M  FL2M
3  FT1M  FB1M
4  FT4S  FL4S
5  FT4S  FB2M

Note that column 0 is the part before '-' for each string and column 1 is the part after '-' for each string.

Then to get your desired lists, select each column and use tolist to convert to lists:

first_list = skus_split[0].tolist()
second_list = skus_split[1].tolist()

Then first_list looks like:

['FT3M', 'FT2M', 'FT1M', 'FT4S', 'FT4S']

and second_list:

['FL3M', 'FL2M', 'FB1M', 'FL4S', 'FB2M']
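
Put together, the split approach can be sketched as a self-contained script. The DataFrame construction is an assumption based on the question's table, and n=1 is an optional addition (not in the answer above) that limits the split to the first '-' so the result has at most two columns.

```python
import pandas as pd

# Sample data copied from the question's table (how df is built is an assumption).
df = pd.DataFrame(
    {"Skus": ["FT3M-FL3M", "FT2M-FL2M", "FT1M-FB1M", "FT4S-FL4S", "FT4S-FB2M"]}
)

# Split on the first '-' only; expand=True returns a DataFrame with
# one column per piece (column 0 and column 1).
skus_split = df["Skus"].str.split("-", n=1, expand=True)

# Convert each column to a plain Python list.
first_list = skus_split[0].tolist()
second_list = skus_split[1].tolist()

print(first_list)
print(second_list)
```

Unlike fixed-width slicing, this does not depend on each SKU being four characters long, only on the two SKUs being separated by a hyphen.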