Marti A. Hearst

Professor

University of California, Berkeley

Research: Text Segmentation

My research in text segmentation and structure covers discourse segmentation (in particular, the multi-paragraph discourse segmentation of the TextTiling work), sentence segmentation, and the recognition of abbreviation definitions.

Publications
  • Schwartz, A., and Hearst, M., A Simple Algorithm for Identifying Abbreviation Definitions in Biomedical Text, Proceedings of the Pacific Symposium on Biocomputing (PSB 2003), Kauai, Jan 2003. pdf
  • Palmer, D., and Hearst, M., Adaptive Multilingual Sentence Boundary Disambiguation, Computational Linguistics, 23 (2), pp. 241-267, June 1997. pdf
  • Palmer, D., and Hearst, M., Adaptive Sentence Boundary Disambiguation, Proceedings of the Conference on Applied Natural Language Processing, Stuttgart, Germany, Oct 1994. pdf
  • Hearst, M., TextTiling: Segmenting Text into Multi-Paragraph Subtopic Passages, Computational Linguistics, 23 (1), pp. 33-64, March 1997. pdf
  • Hearst, M., Multi-Paragraph Segmentation of Expository Text, Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, Las Cruces, NM, June 1994. pdf
  • Pevzner, L., and Hearst, M., A Critique and Improvement of an Evaluation Metric for Text Segmentation, Computational Linguistics, 28 (1), pp. 19-36, March 2002. pdf