A stop list, or a stopword list, is a pre-defined list of words filtered out from text during information retrieval and indexing processes.
Usage
The purpose of a stop list is to exclude words that occur so frequently in a language that they offer little value in distinguishing one document from another.
Operators
Stop lists usually include words like “a,” “an,” “the,” “in,” “on,” “of,” and other prepositions, conjunctions, and articles. These words convey little meaning on their own, and including them in a search query or index would produce many irrelevant search results.
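
As a minimal sketch of how such filtering might look, the Python snippet below removes stop words from a piece of text before it is indexed or searched. The stop list shown is a deliberately tiny, hypothetical one, and the names `STOP_WORDS`, `tokenize`, and `remove_stop_words` are illustrative rather than taken from any particular library.

```python
import re

# Hypothetical, deliberately tiny stop list; real lists are much longer.
STOP_WORDS = frozenset({"a", "an", "the", "in", "on", "of", "and", "or", "to"})

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def remove_stop_words(text, stop_words=STOP_WORDS):
    """Return the tokens of `text` that are not in the stop list."""
    return [tok for tok in tokenize(text) if tok not in stop_words]

print(remove_stop_words("The history of the printing press in Europe"))
# prints ['history', 'printing', 'press', 'europe']
```

Storing the list as a set keeps each membership check constant-time on average, which matters when every token of a large collection has to be tested.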
Language processing
In natural language processing (NLP), stop lists are often used to preprocess textual data before tasks such as document classification, sentiment analysis, or topic modeling.
Removing stop words from a document or a corpus leaves a higher proportion of content-bearing words, which can make the underlying topics and themes easier to identify.
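
The effect on term statistics can be illustrated with a small sketch that counts the most frequent terms in a toy corpus with and without a stop list. The corpus, the stop list, and the helper `top_terms` are all assumptions made for illustration; real preprocessing pipelines typically use more careful tokenization.

```python
from collections import Counter

# Hypothetical stop list for this example only.
STOP_WORDS = frozenset({"a", "an", "the", "in", "on", "of", "and", "is", "to"})

def top_terms(documents, n=2, stop_words=frozenset()):
    """Return the n most frequent terms, skipping any in stop_words."""
    counts = Counter()
    for doc in documents:
        for tok in doc.lower().split():
            tok = tok.strip(".,;:!?\"'()")  # drop surrounding punctuation
            if tok and tok not in stop_words:
                counts[tok] += 1
    return counts.most_common(n)

corpus = [
    "The cat sat on the mat.",
    "The dog chased the cat in the garden.",
    "A garden is the home of the cat.",
]

print(top_terms(corpus))                         # [('the', 7), ('cat', 3)]
print(top_terms(corpus, stop_words=STOP_WORDS))  # [('cat', 3), ('garden', 2)]
```

Without the stop list, the most frequent term is "the", which says nothing about the content; with it, the top terms are the ones that describe what the documents are about.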