I'm trying to sort the rows of an example table into 3 categories using apply and lambda, but the result is always 'mature'. I don't understand why. Here is the code:
name=["rian","nancy","intan","rubim"]
age=["30","20","20","12"]
gender=["male","female","female","male"]
df = pd.DataFrame({
"name":name,
"age":age,
"gender":gender
})
def category (age):
    if age<20:
        return 'kids'
    elif umur==20:
        return 'youth'
    else:
        return 'mature'
df['category']=df['age'].apply(age)
CodePudding user response:
There are several issues with your code:
- You have to call apply with the function name, category.
- The category function references an undefined variable, umur. Should this be age?
- Your age data is of type string, but you want to compare it numerically.
Fixing these problems gives:
import pandas as pd

name = ["rian", "nancy", "intan", "rubim"]
age = [30, 20, 20, 12]
gender = ["male", "female", "female", "male"]
df = pd.DataFrame({
    "name": name,
    "age": age,
    "gender": gender
})

def category(age):
    if age < 20:
        return 'kids'
    elif age == 20:
        return 'youth'
    else:
        return 'mature'

df['category'] = df['age'].apply(category)
with the resulting DataFrame:
    name  age  gender category
0   rian   30    male   mature
1  nancy   20  female    youth
2  intan   20  female    youth
3  rubim   12    male     kids
I also suggest following Python formatting conventions (PEP 8) to make your code more readable.
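If the ages really do arrive as strings, as in the question, one option is to convert the column before applying the function instead of retyping the list. This is a minimal sketch, assuming pd.to_numeric suits your data (it raises an error on values that cannot be parsed). The category function is repeated so the snippet runs on its own:

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["rian", "nancy", "intan", "rubim"],
    "age": ["30", "20", "20", "12"],  # strings, as in the question
})

def category(age):
    if age < 20:
        return 'kids'
    elif age == 20:
        return 'youth'
    else:
        return 'mature'

# Convert the string column to numbers so the comparisons are numeric
df["age"] = pd.to_numeric(df["age"])

# Pass the function itself to apply, not the list of ages
df["category"] = df["age"].apply(category)
```

Without the conversion, a comparison such as "30" < 20 raises a TypeError in Python 3, because a str and an int cannot be ordered against each other.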
CodePudding user response:
import pandas as pd
name = ["rian", "nancy", "intan", "rubim"]
age = ["30", "20", "20", "12"]
gender = ["male", "female", "female", "male"]
df = pd.DataFrame({
"name": name,
"age": age,
"gender": gender
})
df.age = df.age.astype(int)
df['category'] = df['age'].apply(lambda x: 'mature' if x > 20 else 'kids' if x < 20 else 'youth')
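The lambda above chains two conditional expressions: x > 20 gives 'mature', otherwise x < 20 gives 'kids', and anything left (exactly 20) gives 'youth'. If the nested form is hard to read, a possible alternative is numpy.select, which is not what the answer above uses. A rough sketch, assuming the same thresholds and only the age column:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"age": ["30", "20", "20", "12"]})
df["age"] = df["age"].astype(int)

# Each condition is paired with the label at the same position in choices
conditions = [df["age"] < 20, df["age"] == 20]
choices = ["kids", "youth"]

# Rows matching no condition (age > 20) fall back to the default label
df["category"] = np.select(conditions, choices, default="mature")
```

This evaluates the conditions on the whole column at once rather than calling a Python function for each row.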
