I need a URL validator regex that meets these criteria:
- protocol (HTTP, HTTPS) is optional. But if any protocol is given, it must be in the correct format, i.e. protocol:domain, or protocol://domain.
- www is optional
- a direct IP address may be used in place of a domain name.
So based on the criteria, these should pass:
- http://www.google.com
- google.com
- abc.def.ghi/hij
- https:216.239.38.120
- 216.239.38.120
These should not pass:
- hello
- hello/world
- abc://def.ghi
- ftp:google.com
The closest regex I've found is from here:
^((?:.|\n)*?)((http:\/\/www\.|https:\/\/www\.|http:\/\/|https:\/\/)?[a-z0-9]+([\-\.]{1}[a-z0-9]+)([-A-Z0-9.]+)(/[-A-Z0-9+&@#/%=~_|!:,.;]*)?(\?[A-Z0-9+&@#/%=~_|!:,.;]*)?)
But unfortunately, google.com doesn't pass. It needs to have www. as a prefix. Can you improve this regex so www. becomes optional?
CodePudding user response:
It looks like the following pattern matches your criteria:
^(?:https?:\/\/(?:www\.)?|https:(?:\/\/)?)?\w+(?:[-.]\w+)+(?:\/[^\/\s]+)*$
See the regex demo. Details:
^ - start of the string
(?:https?:\/\/(?:www\.)?|https:(?:\/\/)?)? - an optional sequence of:
  https?:\/\/(?:www\.)? - http or https, :// and then an optional www. substring
  | - or
  https:(?:\/\/)? - https: and then an optional // string
\w+ - one or more word chars
(?:[-.]\w+)+ - one or more sequences of a . or - followed by one or more word chars
(?:\/[^\/\s]+)* - zero or more sequences of a / and then one or more chars other than / and whitespace
$ - end of string.
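To check the pattern against the examples from the question, here is a minimal sketch in Python. The language choice and the names URL_RE and is_valid are assumptions made for illustration; the question does not say which regex engine is in use, and details such as what \w matches can differ between engines.

```python
import re

# The pattern from the answer, copied as-is. Python's re module accepts
# the escaped slashes (\/) even though it does not require them.
URL_RE = re.compile(
    r"^(?:https?:\/\/(?:www\.)?|https:(?:\/\/)?)?\w+(?:[-.]\w+)+(?:\/[^\/\s]+)*$"
)


def is_valid(url: str) -> bool:
    """Return True if the whole string matches the pattern (hypothetical helper)."""
    # fullmatch avoids accepting a trailing newline, which "$" alone
    # would tolerate in Python.
    return URL_RE.fullmatch(url) is not None


should_pass = [
    "http://www.google.com",
    "google.com",
    "abc.def.ghi/hij",
    "https:216.239.38.120",
    "216.239.38.120",
]
should_fail = ["hello", "hello/world", "abc://def.ghi", "ftp:google.com"]

for url in should_pass + should_fail:
    print(url, is_valid(url))
```

Note that the pattern only checks the overall shape of the string: it does not verify that an IP address has exactly four octets or that each octet is in the range 0-255.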
