Modern Information Retrieval
Chapter 10: User Interfaces and Visualization



    
4. Co-citation Clustering for Overviews


Citation analysis has long been recognized as a way to show an overview of the contents of a collection [white89]. The main idea is to determine 'centrally located' documents based on co-citation patterns. There are different ways to determine citation patterns. One method is to measure how often two articles are cited together by a third. An alternative is to pair articles that cite the same third article. In both cases the assumption is that the paired articles have something in common. After a matrix of co-citations is built, documents are clustered based on the similarity of their co-citation patterns. The resulting clusters are interpreted as indicating dominant themes within the collection.

Clustering can also focus on the authors of the documents rather than on their contents, in an attempt to identify central authors within a field. This author-centered approach has recently been implemented using Web-based documents in the Referral Web project [kautz97]. Co-citation analysis has also been applied to Web pages, using Web link structure to identify major topical themes among the pages [larson96b, pirolli96b]. A similar idea, computed in a different way, is used to explicitly identify pages that act as good starting points for particular topics (called 'authority pages' by Kleinberg [kleinberg98]).
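The two pairing schemes and the clustering step can be illustrated with a small sketch. This is not the procedure used in any of the cited systems: the function names, the toy citation data, and the threshold are all hypothetical, and the clustering is a deliberate simplification that links two documents whenever their raw pair count reaches a threshold, rather than comparing full rows of the co-citation matrix. The second pairing scheme, in which two articles cite the same third article, is often called bibliographic coupling.

```python
from collections import defaultdict
from itertools import combinations


def cocitation_counts(citations):
    """Count how often two documents are cited together by a third.

    citations maps each citing document to the set of documents it cites
    (a hypothetical input format chosen for this sketch).
    """
    counts = defaultdict(int)
    for cited in citations.values():
        for a, b in combinations(sorted(cited), 2):
            counts[(a, b)] += 1
    return dict(counts)


def coupling_counts(citations):
    """Count how many references two citing documents share."""
    counts = {}
    for a, b in combinations(sorted(citations), 2):
        shared = len(citations[a] & citations[b])
        if shared:
            counts[(a, b)] = shared
    return counts


def cluster(pair_counts, threshold):
    """Group documents whose pair count is at least `threshold`.

    Simplification (an assumption of this sketch): clusters are the
    connected components of the thresholded pair graph, found with a
    small union-find structure.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), n in pair_counts.items():
        root_a, root_b = find(a), find(b)
        if n >= threshold and root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for doc in list(parent):
        groups[find(doc)].add(doc)
    return sorted(sorted(g) for g in groups.values())


# Toy data: four citing papers and the documents each one cites.
toy = {
    "p1": {"A", "B"},
    "p2": {"A", "B", "C"},
    "p3": {"C", "D"},
    "p4": {"C", "D"},
}

co = cocitation_counts(toy)
print(co[("A", "B")])           # 2: A and B are cited together by p1 and p2
print(cluster(co, threshold=2))  # [['A', 'B'], ['C', 'D']]
```

In the toy data, A and B are cited together twice, as are C and D, so a threshold of 2 yields two clusters that would be read as two themes. A fuller treatment would compare whole co-citation profiles, for example with a vector similarity measure, before clustering.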





Modern Information Retrieval © Addison-Wesley-Longman Publishing co.
1999 Ricardo Baeza-Yates, Berthier Ribeiro-Neto