Note 1: _A text, which could be a word or a sentence, is known as a document in NLP._
Note 2: _A collection of such documents is known as a document corpus._
Let there be two documents in the document corpus as given below:
We create a dictionary (or an array) of all the unique words in the document corpus:
[This, car, drives, good, and, is, expensive, not]
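The dictionary-building step can be sketched in a few lines of Python. This is only an illustration: the two sentences in `documents` are assumed reconstructions chosen to be consistent with the word list above, and `build_vocabulary` is a hypothetical helper name, not part of any library.

```python
# Assumed documents: reconstructions that are consistent with the
# dictionary shown above, not necessarily the original sentences.
documents = [
    "This car drives good and is expensive",
    "This car drives good and is not expensive",
]


def build_vocabulary(docs):
    """Collect the unique words of a corpus in order of first appearance."""
    vocabulary = []
    seen = set()
    for doc in docs:
        # Naive whitespace tokenization; no lowercasing or punctuation handling.
        for word in doc.split():
            if word not in seen:
                seen.add(word)
                vocabulary.append(word)
    return vocabulary


vocabulary = build_vocabulary(documents)
print(vocabulary)
```

Keeping the words in order of first appearance gives every word a fixed position, which is what later lets each document be written as a vector of the same length.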
Note 3: _BoW generally produces sparse vectors. In a sparse vector, most of the dimensions have the value 0, because a single document usually contains only a small fraction of the words in the dictionary._
Let vectors v1 and v2 correspond to document 1 and document 2, respectively. Each entry marks whether the dictionary word at that position occurs in the document (1) or not (0), so the vectors are:
v1 = [1 1 1 1 1 1 1 0]
v2 = [1 1 1 1 1 1 1 1]
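As a minimal sketch, the two vectors can be reproduced in Python. The sentences in `documents` are assumptions chosen to match the dictionary and vectors above, and `to_binary_vector` is a hypothetical helper. It records only the presence of a word; a count-based variant would give the same result here, because no word repeats within a document.

```python
# Dictionary as listed in the text; its order fixes the vector positions.
vocabulary = ["This", "car", "drives", "good", "and", "is", "expensive", "not"]

# Assumed documents, consistent with the vectors shown above.
documents = [
    "This car drives good and is expensive",
    "This car drives good and is not expensive",
]


def to_binary_vector(doc, vocab):
    """Return 1 for each dictionary word present in the document, else 0."""
    words = set(doc.split())
    return [1 if term in words else 0 for term in vocab]


v1 = to_binary_vector(documents[0], vocabulary)
v2 = to_binary_vector(documents[1], vocabulary)

print(v1)
print(v2)
```

The two vectors differ only in the last position, the one for "not", which is the only word that separates the two assumed sentences.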