'STOPWORDS' is not defined after importing stopwords

Time:01-15

Here is my code. I imported stopwords, but Python reports that STOPWORDS is not defined.

import nltk
from nltk.corpus import stopwords
#Create stopword list:
stopwords = set(STOPWORDS)

This gives:

NameError: name 'STOPWORDS' is not defined

CodePudding user response:

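The import only binds the lowercase name stopwords (a corpus reader object). Python names are case-sensitive, so STOPWORDS was never defined, which is what the NameError says. You also need to download the stop word corpus before you can use it. For example, to print the English stop words: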

import nltk
from nltk.corpus import stopwords
nltk.download('stopwords')
print(stopwords.words('english'))

This should print the list of English stop words, which includes entries such as 'i', 'me', 'my', 'myself', 'we', 'our', 'ours', 'ourselves', and so on.
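
Applied to the code in the question, the fix is to build the set from stopwords.words('english') rather than from the undefined name STOPWORDS. The following is a minimal sketch, not a definitive implementation. The helper name english_stop_words and the small FALLBACK_STOP_WORDS set are made up for illustration and are not part of NLTK; the fallback exists only so the sketch still runs where NLTK or its corpus is missing.

```python
# Tiny hand-written fallback (an assumption for illustration, not NLTK's list).
FALLBACK_STOP_WORDS = {"i", "me", "my", "we", "our", "the", "a", "an", "and", "is"}


def english_stop_words():
    """Return English stop words as a set, preferring NLTK's corpus."""
    try:
        from nltk.corpus import stopwords  # binds the lowercase name only
        # LookupError is raised here if the corpus has not been downloaded yet.
        return set(stopwords.words("english"))
    except (ImportError, LookupError):
        return set(FALLBACK_STOP_WORDS)


# Replacement for the failing line: stopwords = set(STOPWORDS)
stop_words = english_stop_words()
print("the" in stop_words)
```

Storing the result under a different name (stop_words) also avoids overwriting the imported stopwords object, which the original assignment would have done.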

CodePudding user response:

As pointed out earlier, the first time you run your code you need to include the following line, which downloads the list to your computer:

nltk.download('stopwords')

Then, you can load, for example, the English stop words list as follows:

stop_words = list(stopwords.words('english'))

and even extend it, if you need to:

stop_words.extend(["best", "item", "fast"])

Use it to remove stop words from text:

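from nltk.tokenize import word_tokenize
# word_tokenize needs the Punkt tokenizer data: nltk.download('punkt')
# (recent NLTK releases may ask for 'punkt_tab' instead)
text = "This is the best item and it arrived fast"  # example input
# tokenise the text and remove stop words
word_tokens = word_tokenize(text)
clean_word_data = [w for w in word_tokens if w.lower() not in stop_words]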
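
The filtering step can also be tried without any NLTK downloads. This is a sketch under stated assumptions: the stop word set is a small hand-written sample, and a simple regular expression stands in for word_tokenize, so the tokens will differ from NLTK's on punctuation and contractions. The function name remove_stop_words is hypothetical.

```python
import re

# Small sample set for illustration; NLTK's English list is much longer.
STOP_WORDS = {"this", "is", "the", "a", "and", "of"}
# Extra domain-specific words, as in the extend() example above.
STOP_WORDS |= {"best", "item", "fast"}


def remove_stop_words(text, stop_words):
    """Split text into word tokens and drop those found in stop_words."""
    tokens = re.findall(r"[A-Za-z']+", text)  # crude stand-in for word_tokenize
    return [t for t in tokens if t.lower() not in stop_words]


cleaned = remove_stop_words("This is the best item and a fast delivery", STOP_WORDS)
print(cleaned)  # ['delivery']
```

A set is used here because membership tests on a set take constant time on average, while a list is scanned element by element for every token.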