O365
I'm using PQ to ETL a dataset into Excel.
| ID | Status | Date | CurrentStatus |
|---|---|---|---|
| 1 | Active | 1/1/2022 | Terminated |
| 1 | Terminated | 1/10/2022 | Terminated |
Under CurrentStatus, lookup the row's ID and find the record with the latest Date for that ID then return the Status for that record.
Current setup:
Formula is attached to the output table.
Current formula =XLOOKUP([@ID]&MAXIFS([Date],[ID],[@ID]),[ID]&[Date],[Status],"")
Problem:
Calculation takes forever
Goal:
Build the solution into the PQ M code instead of the formula.
Any guidance is appreciated, thanks.
CodePudding user response:
In powerquery, you can group on ID, and within that, sort on date then take the most recent result. Then expand the status
Load the data into powerquery with data from table/range, right click the ID column and Group by
take the default code
= Table.Group(#"Changed Type", {"ID"}, {{"Count", each Table.RowCount(_), Int64.Type}})
and replace the end to resemble this:
= Table.Group(#"Changed Type", {"ID"}, {{"data", each _, type table }, {"MaxStatus", each Table.FirstN(Table.Sort(_,{{"Date", Order.Descending}}),1)[Status]{0}}})
then use the arrow atop the new column to [x] expand the Status, Date and CurrentStatus columns
Full code:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Status", type text}, {"Date", type date}, {"CurrentStatus", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"ID"}, {{"data", each _, type table }, {"MaxStatus", each Table.FirstN(Table.Sort(_,{{"Date", Order.Descending}}),1)[Status]{0}}}),
#"Expanded data" = Table.ExpandTableColumn(#"Grouped Rows", "data", {"Status", "Date", "CurrentStatus"}, {"Status", "Date", "CurrentStatus"})
in #"Expanded data"

