Please see the SQLFiddle example:
http://sqlfiddle.com/#!4/abd6d/1
here are a few example address:
MINNEAPOLIS MN 55450
MINNAPOLIS MN 55439-8136
BETHANY OK 73008
Hillsboro Oregon 97124
Not all of them are separated by spaces, but enough that I I think that is the method I want to approach.
running Oracle 11g
CodePudding user response:
I do not think this is easily feasible by sql. You will need to rearrange the raw data and the table schema by adding more columns. The appraoch I recommand is string manipulation by using other programming language, for example, C#.
CodePudding user response:
See if such an approach helps.
TEMPCTE finds position of the first digit incitystatezipcolumnzip: that's the substring that starts from thezip_positionstate: nested functionssubstrselects everything up to thezip_position(e.g. "SNOHOMISH WA ")trimremoves trailing spacesregexp_substrextracts the last word from that substring (e.g. "WA")
city: substring from the 1st character, up to position of the second space character starting from the back of the string (seeinstr's parameters)
For sample data you posted (LA added), that would look as follows:
SQL> with temp as
2 (select p.*,
3 regexp_instr(citystatezip, '\d') zip_position
4 from po_header p
5 )
6 select t.po_number, t.customer, t.citystatezip,
7 substr(t.citystatezip, t.zip_position) zip,
8 regexp_substr(trim(substr(t.citystatezip, 1, t.zip_position - 1)), '\w $') state,
9 trim(substr(t.citystatezip, 1, instr(t.citystatezip, ' ', -1, 2))) city
10 from temp t;
PO_NUMBER CUSTOME CITYSTATEZIP ZIP STATE CITY
---------- ------- ------------------------------ ---------- ---------- -----------
1 John SNOHOMISH WA 98290 98290 WA SNOHOMISH
2 Jen MINNAPOLIS MN 55439-8136 55439-8136 MN MINNAPOLIS
3 Jillian BETHANY OK 73008 73008 OK BETHANY
4 Jordan Hillsboro Oregon 97124 97124 Oregon Hillsboro
5 Scott Los Angeles CA 12345 12345 CA Los Angeles
SQL>
Is it perfect? Certainly not, but the final solution depends on much more sample data. Generally speaking, data model is just wrong - you should have split those information into separate columns in the first place.
