The Internet is growing at an increasing rate, and searching this gigantic digital library for information is becoming correspondingly difficult. An estimate from February 1999 indicates that there are about 800 million pages on the World-Wide Web, on about 3 million servers.
Retrieval of text information is a difficult task. The problem may be either that the information is misinterpreted because of natural-language ambiguities or that the user defines the information need imprecisely or vaguely. This calls for improved automatic methods for searching and organizing text documents, so that information of interest can be accessed quickly and accurately.
This introduction is a short overview of methods in Information Retrieval (IR). We start with the widely used Boolean retrieval method. We then discuss the vector space model, followed by its extension, Latent Semantic Indexing. Finally, we discuss the clustering and classification of documents.