I am VERY new to regex in JS and am having a very hard time getting it to do what I am looking for.
I have a series of strings that I am trying to strip of unusual characters, spaces, newlines, etc. and put them into arrays where each entry is a word consisting of only alphanumeric characters.
For example:
let testString = "this*is " + "a\n" + " test string "
let test = testString.split(/\W/)
console.log(test)
Yields [ 'this', 'is', 'a', '', 'test', 'string', '' ]
But ideally I would like it to yield [ 'this', 'is', 'a', 'test', 'string']
I can achieve the desired result by adding .filter(word => word !== '') to the end, but I am wondering if there is a way to do this using only regular expressions? Also would it be necessary to add a global flag to the regex?
Thanks in advance for any input!
CodePudding user response:
Just a simple one-liner:
const getWords = s => ( s ?? "" ).match( /\w+/g );
Then...
const words = getWords( 'this, that & the other;etc.' );
yields
[ 'this', 'that', 'the', 'other', 'etc' ]
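On the global-flag part of the question, here is a minimal sketch of how the flag affects match and split. The name getWordsSafe is a hypothetical helper (not part of the answer above); it adds a fallback to an empty array, because match returns null when nothing matches.

```javascript
// Hypothetical helper: same idea as getWords, but returns [] instead of null
// when the input contains no word characters at all.
const getWordsSafe = s => (s ?? "").match(/\w+/g) ?? [];

const sample = "this*is " + "a\n" + " test string ";

// With the g flag, match returns every run of word characters.
const allWords = getWordsSafe(sample);
console.log(allWords); // [ 'this', 'is', 'a', 'test', 'string' ]

// Without the g flag, match stops at the first match.
const firstOnly = sample.match(/\w+/);
console.log(firstOnly[0]); // 'this'

// split does not need the g flag: it always splits at every match.
// The trailing space still leaves one empty string at the end.
const viaSplit = sample.split(/\W+/);
console.log(viaSplit); // [ 'this', 'is', 'a', 'test', 'string', '' ]
```

So the g flag matters for match, while split behaves the same with or without it.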
CodePudding user response:
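Use the trim function to remove the whitespace around the sentence, so that split does not produce an empty string at either end. Note that trim only strips leading and trailing whitespace; to avoid empty entries from consecutive separators inside the string, split on /\W+/ (one or more non-word characters) rather than /\W/.
More Information: https://www.w3schools.com/jsref/jsref_trim_string.asp
test = testString.trim().split(/\W+/);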
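As a rough illustration of where the trim approach works and where it falls short, the sketch below combines trim with a split on /\W+/. The name splitWords is a hypothetical helper introduced only for this example.

```javascript
// Hypothetical helper: trim whitespace first, then split on runs of
// non-word characters so consecutive separators do not create empty entries.
const splitWords = s => s.trim().split(/\W+/);

const fromQuestion = splitWords("this*is " + "a\n" + " test string ");
console.log(fromQuestion); // [ 'this', 'is', 'a', 'test', 'string' ]

// trim() only removes whitespace, so trailing punctuation such as "."
// still leaves an empty string at the end of the result.
const withPunctuation = splitWords("this, that & the other;etc.");
console.log(withPunctuation); // [ 'this', 'that', 'the', 'other', 'etc', '' ]
```

If the input can start or end with punctuation, matching the words with /\w+/g or filtering out empty strings is likely the safer choice.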
