In this digital world, text data is everywhere, right from tweets in twitter to parsing some text in documents, everything is associated with the text. Many machine learning based products make effective use of this text data to make amazing technologies based on topics like sentiment analysis, topic modeling, relation extraction, etc.

As text data is everywhere, therefore, it is important for us to focus and create algorithms that can help us retrieve data in minimal time with optimal relevancy. For e.g, suppose if we type in google browser “Machine Learning algorithms pdf” we would get some relevant documents as follows

Upon querying we will get the relevant documents in optimal time. So what we can infer from this process of the query is, we want the system to give us results of a query in minimal time, the relevancy of documents also becomes an important factor here, you will not get psychology documents when typing machine learning algorithms because the subject is not that much relevant to the given query. So building one such system is difficult because of several factors and tradeoff which we encounter when we build such systems for a large corpus of documents or textual data retrieval.

Now knowing the complexity of these systems let us discuss what are the differences between text retrieval and database retrieval. But before that let us go through some formal definition of these retrieval modes.

Text retrieval is a task where the system would respond to a user’s query with relevant documents. It is a preprocessor for text mining.

Database retrieval means obtaining data from a database management system such as ODBMS (wikipedia).

Now, let us compare both of these over several factors:

#text-mining #machine-learning #information-retrieval #data-mining #database

Text Retrieval vs Database Retrieval
3.10 GEEK