I am trying to get a name from a list of websites and need only the first string or word to be more precise. I have found a way in Python but I am interested in doing it with PostgreSQL
For example I would need happy from https://www.happy.com.
I used following code but I think I am going in a wrong direction:
WITH test AS (SELECT regexp_replace('https://www.happy.com','(https://www.|https://|http://)',''))
SELECT SPLIT_PART(test,'.',0)
CodePudding user response:
You can use the following REGEXP_REPLACE:
SELECT REGEXP_REPLACE('https://www.happy.com',
'^(?:(?:ht|f)tps?://(?:www\.)?|www\.)(.*)\.[^.] $',
'\1') AS Result;
See the regex demo.
Details:
^- start of string(?:(?:ht|f)tps?://(?:www\.)?|www\.)-http://www.,https://www.,ftp://www.,ftps://www.,https://,http://,ftps://,ftp://or justwww.(.*)- Group 1 (\1): any zero or more chars as many as possible\.- a.char[^.]- one or more chars other than a.char$- end of string.
CodePudding user response:
Thank you very much Wiktor. I really appreciate it not just because you gave me the syntax but because I am fairly new to REGEX and your detailed explanation helped understand better filtering.
