Home > OS >  Snowflake SQL: regex_replace for CamelCase strings
Snowflake SQL: regex_replace for CamelCase strings

Time:01-28

Hi I need to transform the following string as follows:

Before: 6621 Lake Valley DrMemphis, TN 38141

After: 6621 Lake Valley Dr

Additional ways the data comes in: 143 Evergreen Forest Court (this one is fine as is)

Need to extract address only for these as well:

  1. 326 Hambrick Park Fayetteville, GA 30215
  2. RE: Owner's Policy - 112 Shagbark Ln Mooresville, NC 28115
  3. RE: Owner's Policy - 540 Clearbrook Dr Covington, GA 30016
  4. Closed 9/1/21 4421 Home Stakes Dr Parkton, NC 28371
  5. RP 9/16- 352 Hampton St Elloree, SC 29047
  6. RP: 9/15- 124 Lake Grove Rd Simpsonville, SC 29681
  7. FHA 3/2/22- 6083 Holiday Blvd Forest Park, GA 30297
  8. RD 10/1/21 Roxanne Sellers- 311 Woodbrook Ln Marietta, GA 30068
  9. 4104 Flat Trl- Ricardo Reeder

How can I accomplish this using Snowflake SQL? I'm assuming regex_replace is in order? Can anybody help me out?

CodePudding user response:

If you only give us one example, we can only solve for that example:

select regexp_replace(x, '(.*[a-z])([A-Z].*)', '\\1')
from (select '6621 Lake Valley DrMemphis, TN 38141' x)

If you need help with a more general case, please start a new question with more sample inputs and outputs.

CodePudding user response:

We can try using REGEXP_REPLACE with the help of a capture group:

WITH yourTable AS (
    SELECT '6621 Lake Valley DrMemphis, TN 38141' AS address
)

SELECT REGEXP_REPLACE(address, '(\d ( [A-Z][a-z] )*).*', '\1') AS address_out
FROM yourTable;

Demo

The above working demo is for Postgres, not Snowflake. You might have to use $1 instead of \1 for the first capture group, but the rest should run unaltered.

  •  Tags:  
  • Related