Wondering what NLTK is? The Natural Language Toolkit, more commonly known as NLTK, is a suite of libraries and programs for symbolic and statistical natural language processing (NLP) of English, written in the Python programming language. It was developed by Steven Bird and Edward Loper in the Department of Computer and Information Science at the University of Pennsylvania.
1. Convert text to lower case
2. Word tokenization
3. Sentence tokenization
4. Stop word removal
5. Lemmatization
6. Stemming
7. Word frequency
8. POS tagging
9. Named entity recognition (NER)
Install Python
Import nltk in order to use its functions:
import nltk
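Converting the text to lower case is a common first step because string comparison in Python is case sensitive, so "Demo" and "demo" would otherwise be treated as two different words.
text = "This is a Demo Text for NLP using NLTK. Full form of NLTK is Natural Language Toolkit"
lower_text = text.lower()
print(lower_text)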
[OUTPUT]: this is a demo text for nlp using nltk. full form of nltk is natural language toolkit
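To see why the case matters, here is a minimal sketch that counts words before and after lowercasing. The sample sentence and the `count_words` helper are made up for illustration, and the whitespace split is a stand-in for real tokenization, not how NLTK does it.

```python
from collections import Counter

# Hypothetical sample text with the same words in different cases.
sample = "The toolkit is free. the Toolkit is open source."

def count_words(s):
    # Naive split on whitespace, stripping trailing periods.
    # This is only an assumption to keep the sketch dependency-free.
    return Counter(word.strip(".") for word in s.split())

raw_counts = count_words(sample)
lower_counts = count_words(sample.lower())

# Without lowercasing, "The" and "the" are counted separately.
print(raw_counts["The"], raw_counts["the"])
# After lowercasing, both occurrences land on the same key.
print(lower_counts["the"])
```

Without lowercasing, later steps such as word frequency would split one word across several keys.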
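Word tokenization breaks the text into its individual tokens, i.e. words and punctuation marks. Note that word_tokenize relies on the Punkt tokenizer data, which may need a one-time download with nltk.download('punkt') (newer NLTK releases may ask for 'punkt_tab' instead).
text = "This is a Demo Text for NLP using NLTK. Full form of NLTK is Natural Language Toolkit"
word_tokens = nltk.word_tokenize(text)
print(word_tokens)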
[OUTPUT]: ['This', 'is', 'a', 'Demo', 'Text', 'for', 'NLP', 'using', 'NLTK', '.', 'Full', 'form', 'of', 'NLTK', 'is', 'Natural', 'Language', 'Toolkit']
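The first two steps are often chained together. The sketch below is one possible way to do that, not the only one: it swaps in `wordpunct_tokenize`, a regex-based tokenizer from `nltk.tokenize` that needs no downloaded data, then lowercases the tokens and drops the punctuation. The variable names are just for illustration.

```python
from nltk.tokenize import wordpunct_tokenize

text = "This is a Demo Text for NLP using NLTK. Full form of NLTK is Natural Language Toolkit"

# Regex-based tokenizer: splits into runs of word characters and runs of punctuation.
tokens = wordpunct_tokenize(text)

# Lowercase every token, then keep only the purely alphabetic ones.
lower_tokens = [t.lower() for t in tokens]
words = [t for t in lower_tokens if t.isalpha()]

print(tokens)
print(words)
```

For this particular sentence the tokens should match the word_tokenize output shown above, but the two tokenizers can disagree on other inputs (contractions, for example), so treat this as an assumption-laden shortcut rather than a drop-in replacement.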