This article will guide you through the creation of a simple auto-correct program in Python. The project builds two different spelling recommenders, each of which takes a user’s input and recommends a correctly spelled word. Very cool!

Note: the test inputs are ['cormulent', 'incendenece', 'validrate'].

Packages

This effort is largely centered on the _nltk_ library. nltk stands for Natural Language Toolkit, and more information about what it can do is available in its official documentation.

Specifically, we’ll be using the words corpus and the edit_distance, jaccard_distance and ngrams functions.

**edit_distance** and **jaccard_distance** are metrics that will be used to determine which word is most similar to the user’s input.
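As a rough illustration of the two metrics, the sketch below compares one of the test inputs, 'validrate', with the candidate 'validate'. Picking 'validate' as the candidate and comparing sets of single characters are assumptions made for this example, not necessarily how the recommenders themselves are built.

```python
from nltk.metrics.distance import edit_distance, jaccard_distance

misspelled = "validrate"   # one of the test inputs
candidate = "validate"     # assumed candidate, chosen for illustration

# Edit (Levenshtein) distance: the minimum number of single-character
# insertions, deletions and substitutions that turn one string into the other.
d_edit = edit_distance(misspelled, candidate)
print(d_edit)  # 1 (delete the extra 'r')

# Jaccard distance compares two sets: 1 - |A ∩ B| / |A ∪ B|.
# Here the sets hold the distinct characters of each word.
d_jaccard = jaccard_distance(set(misspelled), set(candidate))
print(d_jaccard)  # 0.125 (7 shared characters out of 8 distinct ones)
```

For both metrics a smaller value means the two words are more alike, so a recommender can pick the dictionary word with the lowest distance.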

An n-gram is a contiguous sequence of n items from a given sample of text or speech. The items can be words or individual characters. For example, “White House” is a bigram of two words, and it carries a different meaning from “white house”.
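To make the definition concrete, here is a minimal sketch that uses nltk's ngrams function to split a word into character bigrams. Working at the character level and using the word 'validate' are assumptions for this example.

```python
from nltk.util import ngrams

word = "validate"  # example word, chosen for illustration

# ngrams() yields tuples of n consecutive items; iterating over a string
# gives characters, so n=2 produces character bigrams.
bigrams = list(ngrams(word, 2))
print(bigrams[:3])   # [('v', 'a'), ('a', 'l'), ('l', 'i')]
print(len(bigrams))  # 7 (a word of length 8 has 8 - 1 bigrams)

# Turning the bigrams into a set gives something a set-based metric
# such as Jaccard distance can compare.
bigram_set = set(bigrams)
```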

We’ll also use pandas to create an indexed series of the list of correct words.

words.words() returns a list of correctly spelled words, which ships with the nltk library as the words corpus. spellings_series is an indexed pandas series of these words.
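One way this setup could look is sketched below. The helper names load_correct_spellings and make_spellings_series are hypothetical and exist only for this example; the nltk.download call is there because the words corpus has to be fetched once before words.words() can be used.

```python
import nltk
import pandas as pd
from nltk.corpus import words


def load_correct_spellings():
    """Return nltk's list of correctly spelled words (hypothetical helper)."""
    # Fetch the corpus if it is not installed yet; this needs network access.
    nltk.download("words", quiet=True)
    return words.words()


def make_spellings_series(word_list):
    """Wrap a list of words in a pandas Series with a default integer index."""
    return pd.Series(word_list)
```

With these helpers, `spellings_series = make_spellings_series(load_correct_spellings())` would produce the indexed series described above.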
