Home > Blockchain >  Power Query M - return most recent column value
Power Query M - return most recent column value

Time:01-15

O365

I'm using PQ to ETL a dataset into Excel.

ID Status Date CurrentStatus
1 Active 1/1/2022 Terminated
1 Terminated 1/10/2022 Terminated

Under CurrentStatus, lookup the row's ID and find the record with the latest Date for that ID then return the Status for that record.

Current setup:

Formula is attached to the output table.

Current formula =XLOOKUP([@ID]&MAXIFS([Date],[ID],[@ID]),[ID]&[Date],[Status],"")

Problem:

Calculation takes forever

Goal:

Build the solution into the PQ M code instead of the formula.

Any guidance is appreciated, thanks.

CodePudding user response:

In powerquery, you can group on ID, and within that, sort on date then take the most recent result. Then expand the status

Load the data into powerquery with data from table/range, right click the ID column and Group by

take the default code

= Table.Group(#"Changed Type", {"ID"}, {{"Count", each Table.RowCount(_), Int64.Type}})

and replace the end to resemble this:

= Table.Group(#"Changed Type", {"ID"}, {{"data", each _, type table },     {"MaxStatus", each Table.FirstN(Table.Sort(_,{{"Date", Order.Descending}}),1)[Status]{0}}})

then use the arrow atop the new column to [x] expand the Status, Date and CurrentStatus columns

Full code:

let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Status", type text}, {"Date", type date}, {"CurrentStatus", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"ID"}, {{"data", each _, type table },     {"MaxStatus", each Table.FirstN(Table.Sort(_,{{"Date", Order.Descending}}),1)[Status]{0}}}),
#"Expanded data" = Table.ExpandTableColumn(#"Grouped Rows", "data", {"Status", "Date", "CurrentStatus"}, {"Status", "Date", "CurrentStatus"})
in #"Expanded data"

enter image description here

  •  Tags:  
  • Related