Information retrieval (IR) refers to software that analyzes, organizes, and stores information and returns relevant documents based on keywords. It works particularly well with text.
The main aim of information retrieval is to return the documents most relevant to the search query.
IR does not answer questions directly; instead, it points to relevant documents that may contain the answer. The user enters a search query in natural language, and the IR system analyzes the existing documents and shows those that may satisfy the user’s needs.
An IR system typically has the following features:
- inverted index (for each keyword, it lists all documents that contain the keyword and the number of times the keyword appears in each. The more often the keyword appears in a document, the higher that document may rank in the search results)
- stop word elimination (it ignores words that carry little semantic meaning, for example, a, the, in, for)
- stemming (it strips word endings and searches by stem, so that a search for “moves” does not exclude results containing the word “move”)
- query expansion (the system keeps clusters of related terms. If a term used in the search has the same meaning as a term stored in the system, it shows results for both terms)
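
The four features above can be sketched together in a few lines of Python. This is a minimal illustration, not how any particular search engine works: the stop word list, the `SYNONYMS` term clusters, the sample documents, and the function names are all assumptions made up for the example, and `stem` is a deliberately crude stand-in for a real stemmer (it only strips a trailing "s").

```python
import re
from collections import Counter, defaultdict

# Illustrative stop word list (assumption; real systems use longer lists).
STOP_WORDS = {"a", "an", "the", "in", "for", "and", "of", "to", "is"}

# Hypothetical term clusters for query expansion, keyed by stem.
SYNONYMS = {"car": {"automobile"}, "automobile": {"car"}}


def stem(word):
    """Crude stemmer: strip one trailing 's' (so 'moves' becomes 'move')."""
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


def analyze(text):
    """Lowercase, tokenize, drop stop words, and stem the rest."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [stem(t) for t in tokens if t not in STOP_WORDS]


def build_index(docs):
    """Inverted index: term -> Counter mapping doc id to term count."""
    index = defaultdict(Counter)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term][doc_id] += 1
    return index


def search(index, query):
    """Return doc ids ranked by how often the (expanded) query terms occur."""
    terms = set(analyze(query))
    for term in list(terms):
        terms |= SYNONYMS.get(term, set())  # query expansion
    scores = Counter()
    for term in terms:
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    # Highest score first; ties broken by doc id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


docs = {
    "d1": "The car moves in the city",
    "d2": "Moves and more moves for the automobile",
}
index = build_index(docs)
print(search(index, "move"))  # ['d2', 'd1']: d2 contains the stem twice
print(search(index, "car"))   # ['d1', 'd2']: d2 matches via 'automobile'
print(search(index, "the"))   # []: the query contains only a stop word
```

Ranking by raw keyword count keeps the sketch close to the description above; production systems usually use weighted schemes instead.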
Some well-known examples of information retrieval are search engines like Google or Bing, internal email search, and library catalog search.