I have a list of ids and a list of dates. Each is a single entry from a separate pandas DataFrame column. Each date corresponds to an id. Something like:
[852634, 727417, 881231] [2018-05-29, 2015-11-23, 2019-06-26]
How can I sort the dates (ascending or descending, it does not matter) and apply the same ordering to the ids?
The wanted result is:
[727417, 852634, 881231] [2015-11-23, 2018-05-29, 2019-06-26]
Thank you in advance for all the suggestions, Alessandro
CodePudding user response:
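Use numpy: argsort the dates, then index both arrays with the resulting order.
import numpy as np
l1 = [852634, 727417, 881231]  # ids
l2 = ["2018-05-29", "2015-11-23", "2019-06-26"]  # dates; ISO strings sort chronologically
l1_key = np.argsort(l2)  # indices that put the dates in ascending order
l1_sorted = np.array(l1)[l1_key]
l2_sorted = np.array(l2)[l1_key]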
Output
print(l1_sorted)
print(l2_sorted)
[727417 852634 881231]
['2015-11-23' '2018-05-29' '2019-06-26']
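For readers without numpy, the same index-based idea can be sketched in plain Python. This is only an illustration under the question's sample data; the names ids, dates, and order are placeholders chosen here, not part of the answer above.

```python
ids = [852634, 727417, 881231]
dates = ["2018-05-29", "2015-11-23", "2019-06-26"]

# Indices that would sort the dates (a plain-Python stand-in for np.argsort).
order = sorted(range(len(dates)), key=dates.__getitem__)

# Apply that one ordering to both lists so each id stays with its date.
ids_sorted = [ids[i] for i in order]
dates_sorted = [dates[i] for i in order]

print(ids_sorted)
print(dates_sorted)
```

Because the ordering is computed once from the dates and then reused, any number of parallel lists can be reordered the same way.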
CodePudding user response:
Zip...
>>> x = [852634, 727417, 881231]
>>> y = ["2018-05-29", "2015-11-23", "2019-06-26"]
>>> list(zip(y, x))
[('2018-05-29', 852634), ('2015-11-23', 727417), ('2019-06-26', 881231)]
sort...
>>> sorted(zip(y,x))
[('2015-11-23', 727417), ('2018-05-29', 852634), ('2019-06-26', 881231)]
and unzip.
>>> [x for _, x in sorted(zip(y,x))]
[727417, 852634, 881231]
This is an example of a general technique called a Schwartzian transform (also known as decorate-sort-undecorate). You decorate the list of IDs you want to sort with the corresponding dates, sort the decorated list, then extract (undecorate) the original values from the result.
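If both sorted lists are needed rather than just the ids, the decorate, sort, and undecorate steps might look like the sketch below. It assumes the lists are non-empty and of equal length; the names pairs, dates_sorted, and ids_sorted are illustrative.

```python
ids = [852634, 727417, 881231]
dates = ["2018-05-29", "2015-11-23", "2019-06-26"]

# Decorate: pair each date with its id.
# Sort: compare on the date only, so entries with equal dates keep
# their original relative order (Python's sort is stable).
pairs = sorted(zip(dates, ids), key=lambda pair: pair[0])

# Undecorate: split the sorted pairs back into two lists.
# Note: this unpacking raises ValueError if the input lists are empty.
dates_sorted, ids_sorted = (list(column) for column in zip(*pairs))

print(ids_sorted)
print(dates_sorted)
```

Passing key=lambda pair: pair[0] is a design choice: without it, tuples with equal dates would be ordered by id, which may or may not be what you want.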
CodePudding user response:
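You can use list.sort() for the ints. It sorts in place and returns None, so do not assign its result:
ids.sort()
To compare the dates, you could parse the strings into datetime objects with strptime:
from datetime import datetime
dates = [datetime.strptime(date, "%Y-%m-%d") for date in dates]
Keep in mind that sorting each list on its own does not keep every id paired with its date.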
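Parsing matters most when the date strings are not in ISO order. The sketch below assumes a hypothetical day-first format ("%d/%m/%Y"), which is not the format in the question, to show parsed dates driving the order of the ids.

```python
from datetime import datetime

ids = [852634, 727417, 881231]
# Hypothetical day-first strings; sorting these as text would be wrong.
dates = ["29/05/2018", "23/11/2015", "26/06/2019"]

# Parse each string so comparisons are chronological, not alphabetical.
parsed = [datetime.strptime(d, "%d/%m/%Y") for d in dates]

# Sort the (date, id) pairs together, comparing on the parsed date only.
pairs = sorted(zip(parsed, ids), key=lambda pair: pair[0])

ids_sorted = [i for _, i in pairs]
dates_sorted = [d.strftime("%d/%m/%Y") for d, _ in pairs]

print(ids_sorted)
print(dates_sorted)
```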
CodePudding user response:
If you already have a DataFrame, it's probably much easier to .explode() the list columns and .sort_values() there before exporting. (Passing a list of columns to .explode() requires pandas 1.3 or later.)
>>> import pandas as pd
>>> df = pd.DataFrame({"ids": [[852634, 727417, 881231], [90,100,110,115]], "dates": [["2018-05-29", "2015-11-23", "2019-06-26"], ["2015-01-01", "2021-01-01", "2020-01-01", "2021-01-01"]]})
>>> df
                        ids                                             dates
0  [852634, 727417, 881231]              [2018-05-29, 2015-11-23, 2019-06-26]
1       [90, 100, 110, 115]  [2015-01-01, 2021-01-01, 2020-01-01, 2021-01-01]
>>> df.explode(["ids", "dates"]).sort_values("dates")
      ids       dates
1      90  2015-01-01
0  727417  2015-11-23
0  852634  2018-05-29
0  881231  2019-06-26
1     110  2020-01-01
1     100  2021-01-01
1     115  2021-01-01
>>> df.explode(["ids", "dates"]).sort_values("dates")["ids"].to_numpy()
array([90, 727417, 852634, 881231, 110, 100, 115], dtype=object)
