Modern Information Retrieval
Chapter 1: Introduction
Many other books have been written on information retrieval, and due to the current widespread interest in the subject, new books have appeared recently. In the following, we briefly compare our book with these previously published works.
Classic references in the field of information retrieval are the books by van Rijsbergen [v79] and Salton and McGill [sm83]. Our distinction between data and information retrieval is borrowed from the former. Our definition of the information retrieval process is influenced by the latter. However, almost 20 years later, both books are now outdated and do not cover many of the new developments in information retrieval.
Three more recent and also well-known references in information retrieval are the book edited by Frakes and Baeza-Yates [fby92], the book by Witten, Moffat, and Bell [WMB94], and the book by Lesk [lesk97]. All three books are complementary to this book. The first is focused on data structures and algorithms for information retrieval and is useful whenever quick prototyping of a known algorithm is desired. The second is focused on indexing and compression, and covers images besides text. For instance, our definition of a textual image is borrowed from it. The third is focused on digital libraries and practical issues such as history, distribution, usability, economics, and property rights. On the issue of computer-centered and user-centered retrieval, a generic book on information systems that takes the latter view is due to Allen [alle].
There are other complementary books for specific chapters. For example, there are many books on IR and hypertext. The same is true for generic or specific multimedia retrieval, such as images, audio, or video. Although not an information retrieval title, the book by Rosenfeld and Morville [rm98] on information architecture of the Web is a good complement to our chapter on searching the Web. The book by Menasce and Almeida [ma98] demonstrates how to use queueing theory for predicting Web server performance. In addition, there are many books that explain how to find information on the Web and how to use search engines.
The long-awaited reference edited by Sparck Jones and Willett [sjw97] is really a collection of papers rather than an edited book. The coherence and breadth of coverage in our book make it more appropriate as a textbook in a formal discipline. Nevertheless, this collection is a valuable research tool. A collection of papers on cross-language information retrieval was recently edited by Grefenstette [g98]. This book is a good complement to ours for people interested in this particular topic. Additionally, a collection focused on intelligent IR was edited recently by Maybury [m97], and another collection on natural language IR, edited by Strzalkowski, will appear soon [strzalkowski99].
The book by Korfhage [k97] covers considerably less material, and its coverage is not as detailed as ours. For instance, it includes no detailed discussion of digital libraries, the Web, multimedia, or parallel processing. Similarly, the books by Kowalski [ko97] and Shapiro et al. [sbf97] do not cover these topics in detail, and have a different orientation. Finally, the recent book by Grossman and Frieder [gf98] does not discuss the Web, digital libraries, or visual interfaces.
For people interested in research results, the main journals on IR are: Journal of the American Society for Information Science (JASIS, Wiley and Sons), ACM Transactions on Information Systems, Information Processing & Management (IP&M, Elsevier), Information Systems (Elsevier), Information Retrieval (Kluwer), and Knowledge and Information Systems (Springer). The main conferences are: ACM SIGIR International Conference on Information Retrieval, ACM International Conference on Digital Libraries (ACM DL), ACM Conference on Information and Knowledge Management (CIKM), and Text REtrieval Conference (TREC). Regarding events of regional influence, we would like to acknowledge SPIRE (South American Symposium on String Processing and Information Retrieval).