Research: Text Segmentation
My research in text segmentation and structure includes discourse
segmentation (in particular, multi-paragraph discourse segmentation, as
seen in the TextTiling work), sentence segmentation, and recognition of
abbreviation definitions.
Publications
- Schwartz, A., and Hearst, M., A Simple Algorithm for Identifying
Abbreviation Definitions in Biomedical Text, in the Proceedings of the
Pacific Symposium on Biocomputing (PSB 2003), Kauai, Jan 2003.
- Palmer, D., and Hearst, M., Adaptive Multilingual Sentence Boundary
Disambiguation, Computational Linguistics, 23 (2), pp. 241-267,
June 1997.
- Palmer, D., and Hearst, M., Adaptive Sentence Boundary
Disambiguation, Proceedings of the Conference on Applied Natural
Language Processing, Stuttgart, Germany, Oct 1994.
- Hearst, M., TextTiling: Segmenting Text into Multi-Paragraph
Subtopic Passages, Computational Linguistics, 23 (1),
pp. 33-64, March 1997.
- Hearst, M., Multi-Paragraph Segmentation of Expository Text,
Proceedings of the 32nd Annual Meeting of the
Association for Computational Linguistics, Las Cruces, NM, June 1994.
- Pevzner, L., and Hearst, M., A Critique and Improvement of an
Evaluation Metric for Text Segmentation,
Computational Linguistics, 28 (1), pp. 19-36, March 2002.