I have just started programming, so pardon any misuse of terms. Thanks for the help in advance. English is not my first language, thank you for understanding.
I am using Pandas in Python.
I have created a DataFrame using df = pd.read_csv from a CSV file. This is what the CSV file contains:
Year,A,B,C,D,E,F
2007,7632014,4643033,206207,626668,89715,18654926
2008,6718487,4220161,379049,735494,58535,29677697
2009,1226858,5682198,482776,1015181,138083,22712088
2010,978925,2229315,565625,1260765,146791,15219378
2011,1500621,2452712,675770,1325025,244073,19697549
2012,308064,2346778,591180,1483543,378998,33030888
2013,275019,4274425,707344,1664747,296136,17503798
2014,226634,3124281,891466,1807172,443671,16023363
2015,2171559,3474825,1144862,1858838,585733,16778858
2016,767713,4646350,2616322,1942102,458543,13970498
2017,759016,4918320,1659303,2001220,796343,9730659
2018,687308,6057191,1524474,2127583,1224471,19570540
I know how to select a specific row/column in the dataframe using:
data_2012 = (df.loc[0:12, 1:7].values.tolist()[6])
data_A = (df.loc[5:12, 0:10][1].values.tolist())
I need to find the max and min values in the column list data_A, so I have created:
maximum_A = max(data_A)
minimum_A = min(data_A)
I have also created a list for each of the needed rows:
data_2011 = (df.loc[0:12, 1:7].values.tolist()[5])
data_2012 = (df.loc[0:12, 1:7].values.tolist()[6])
data_2013 = (df.loc[0:12, 1:7].values.tolist()[7])
data_2014 = (df.loc[0:12, 1:7].values.tolist()[8])
data_2015 = (df.loc[0:12, 1:7].values.tolist()[9])
data_2016 = (df.loc[0:12, 1:7].values.tolist()[10])
data_2017 = (df.loc[0:12, 1:7].values.tolist()[11])
data_2018 = (df.loc[0:12, 1:7].values.tolist()[12])
I tried to combine them into a single collection as shown:
data_allyears = (data_2011, data_2012, data_2013, data_2014, data_2015, data_2016, data_2017, data_2018)
The issue is: how do I find the row that contains the min or max value? Let's say the max value is in the year 2012: how do I automatically print the year itself where the max value is? I have tried this, but nothing happened:
for a, b, c in zip(maximum_A, minimum_A, data_allyears):
    if a == c:
        print(f"${a} in year {c}")
CodePudding user response:
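Try this, calling your dataframe df. To get the row where column A is largest, compare the column against its own maximum (df.Year == df.Year.max() would instead select the row for the latest year):
df.loc[df.A == df.A.max()]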
CodePudding user response:
how do I automatically print the year itself where the max value is
# get year of maximum of A
column = 'A'
year = df.loc[df[column] == max(df[column]), ['Year']].values[0][0]
print(year)
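As an alternative to the boolean mask above, pandas can return the index label of the extreme value directly via idxmax and idxmin. The following is a minimal sketch, not the only way to do it. It assumes the CSV is read with its header row (so the columns are named Year, A, ...), inlines a cut-down copy of the question's data so it runs on its own, and restricts the rows to 2011-2018 because that is the range the question slices out. The names csv_text, recent, year_of_max and year_of_min are illustrative.

```python
import io
import pandas as pd

# Cut-down copy of the question's CSV (Year and column A only),
# inlined so the sketch runs without an external file.
csv_text = """Year,A
2007,7632014
2008,6718487
2009,1226858
2010,978925
2011,1500621
2012,308064
2013,275019
2014,226634
2015,2171559
2016,767713
2017,759016
2018,687308
"""

df = pd.read_csv(io.StringIO(csv_text))

# Keep only 2011-2018 (between is inclusive) and use Year as the index.
recent = df[df["Year"].between(2011, 2018)].set_index("Year")

# idxmax/idxmin return the index label (here, the year) of the extreme value.
year_of_max = recent["A"].idxmax()
year_of_min = recent["A"].idxmin()

print(f"max A = {recent.loc[year_of_max, 'A']} in year {year_of_max}")
print(f"min A = {recent.loc[year_of_min, 'A']} in year {year_of_min}")
```

If the same extreme value appears in several years, idxmax and idxmin return only the first one, whereas the boolean mask keeps every matching row.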
CodePudding user response:
Set Year as the index and compute the maximum and minimum value for each year (row):
df1 = df.set_index('Year').assign(Max=lambda x: x.max(axis=1),
                                  Min=lambda x: x.min(axis=1))
print(df1)
# Output
            A        B        C        D        E         F       Max     Min
Year
2007  7632014  4643033   206207   626668    89715  18654926  18654926   89715
2008  6718487  4220161   379049   735494    58535  29677697  29677697   58535
2009  1226858  5682198   482776  1015181   138083  22712088  22712088  138083
2010   978925  2229315   565625  1260765   146791  15219378  15219378  146791
2011  1500621  2452712   675770  1325025   244073  19697549  19697549  244073
2012   308064  2346778   591180  1483543   378998  33030888  33030888  308064
2013   275019  4274425   707344  1664747   296136  17503798  17503798  275019
2014   226634  3124281   891466  1807172   443671  16023363  16023363  226634
2015  2171559  3474825  1144862  1858838   585733  16778858  16778858  585733
2016   767713  4646350  2616322  1942102   458543  13970498  13970498  458543
2017   759016  4918320  1659303  2001220   796343   9730659   9730659  759016
2018   687308  6057191  1524474  2127583  1224471  19570540  19570540  687308
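The table above gives the extreme values within each year. For the original question (which year holds a column's largest or smallest value), calling idxmax and idxmin on the Year-indexed frame may be a more direct route. This is a hedged sketch under the assumption that the CSV is read with its header row; the data is inlined so the block is self-contained, and the names csv_text, extreme_years and top_column are made up for illustration.

```python
import io
import pandas as pd

# The question's CSV, inlined so the sketch runs without an external file.
csv_text = """Year,A,B,C,D,E,F
2007,7632014,4643033,206207,626668,89715,18654926
2008,6718487,4220161,379049,735494,58535,29677697
2009,1226858,5682198,482776,1015181,138083,22712088
2010,978925,2229315,565625,1260765,146791,15219378
2011,1500621,2452712,675770,1325025,244073,19697549
2012,308064,2346778,591180,1483543,378998,33030888
2013,275019,4274425,707344,1664747,296136,17503798
2014,226634,3124281,891466,1807172,443671,16023363
2015,2171559,3474825,1144862,1858838,585733,16778858
2016,767713,4646350,2616322,1942102,458543,13970498
2017,759016,4918320,1659303,2001220,796343,9730659
2018,687308,6057191,1524474,2127583,1224471,19570540
"""

df1 = pd.read_csv(io.StringIO(csv_text)).set_index("Year")

# For each column: the year (index label) of its largest and smallest value.
extreme_years = pd.DataFrame({
    "max_year": df1.idxmax(),
    "min_year": df1.idxmin(),
})
print(extreme_years)

# For each year: the name of the column holding that year's largest value.
top_column = df1.idxmax(axis=1)
print(top_column)
```

Because Year is the index, the labels returned by idxmax and idxmin are the years themselves, so no separate lookup is needed.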
