I have the following data in Excel, which I'm importing into PowerBI.
In the short description, there is a code (immediately after IDN) in each row - I need to extract just the number. THe number is not always the same length and it may be followed by a space, or another character (a - in the screenshot).
In excel I can use: =SEARCH("IDN",A2) to find the start of the IDN text - FirstDetectIDN
I can then find the next space (NextSpace) using find again: =FIND(" ",A2,B2)
I use the same to find the NextSpace2 - so I now have the starting and end position of the spaces surrounding the number I want to extract.
But that gives me the extra characters on the end of the number ("-EOL" above in the screenshot) that I don't want.
Is there any way in PowerBI that I can replicate all of that in one new calculated column AND also only extract the number part (so for the second line, I would only want 784729 in the new calculated field).
Thanks for any suggestions,
Mark
CodePudding user response:
This type of data cleaning should be done in Power Query.
Add a new column and type in the following code:
let
a = Text.AfterDelimiter([Column1],"IDN"),
b = List.Transform({a}, each Text.Select(_, {"0".."9"}))
in b{0}
Full code:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCk4tUzBV8HTxU7A0MbcwVtBVcCxNyfTITM9QitWJVvJNLUksSszNL80rASsytzAxN7LUdfX3AaoMT03JTU1Rio0FAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each let
a = Text.AfterDelimiter([Column1],"IDN"),
b = List.Transform({a}, each Text.Select(_, {"0".."9"}))
in b{0})
in
#"Added Custom"
CodePudding user response:
If you have digits AFTER the set of digits comprising the IDN number, then try this slightly more complicated version:
- First split on
IDN - Then split on the transition from digit to non-digit
- Note that by specifying the Added Column as
type text, we will retain any leading zeros in theIDN. If you prefer, you can specify astype numberorInt64.Typewhich will drop any leading zeros.
let
//Change next line to reflect actual data source
Source = Excel.CurrentWorkbook(){[Name="Table7"]}[Content],
//Set data type
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Short description", type text}}),
//Extract the first set of digits after "IDN"
#"Added Custom" = Table.AddColumn(#"Changed Type", "IDN", each
Text.Trim(
Splitter.SplitTextByCharacterTransition({"0".."9"}, (c) => not List.Contains({"0".."9"}, c))
(Text.AfterDelimiter([Short description],"IDN")){0}), type text)
in
#"Added Custom"



