Information Extraction: Learning Lexico-Syntactic Patterns
AutoSlog-TS, Riloff, AAAI '96
Start with a corpus of texts pre-classified as relevant or irrelevant
Create a huge set of patterns of the forms:
- <subj> passive-verb, e.g. <victim> was murdered
- active-verb <dobj>, e.g. bombed <target>
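As a rough illustration of the two pattern forms above, the sketch below instantiates them from clauses that are assumed to be already parsed. The `Clause` type, its fields, and `generate_patterns` are hypothetical names invented for this example; the real system relies on a syntactic analyzer and a fuller set of pattern forms.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Clause:
    """A pre-parsed clause (hypothetical structure, assumed for this sketch)."""
    subject: Optional[str]
    verb: str                      # surface verb form, e.g. "murdered"
    passive: bool                  # True if the verb is in the passive voice
    direct_object: Optional[str]


def generate_patterns(clause: Clause) -> List[str]:
    """Instantiate the two pattern forms shown above for one clause."""
    patterns = []
    if clause.passive and clause.subject is not None:
        # Form: <subj> passive-verb
        patterns.append(f"<subj> was {clause.verb}")
    if not clause.passive and clause.direct_object is not None:
        # Form: active-verb <dobj>
        patterns.append(f"{clause.verb} <dobj>")
    return patterns


# Toy input, made up for illustration only.
clauses = [
    Clause(subject="the mayor", verb="murdered", passive=True, direct_object=None),
    Clause(subject="guerrillas", verb="bombed", passive=False, direct_object="the embassy"),
]
all_patterns = [p for c in clauses for p in generate_patterns(c)]
print(all_patterns)
```

Every clause that fits a form yields a pattern, whether or not it is useful for the domain; the ranking step is what separates the good patterns from the noise.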
Compute relevance rate:
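- Pr(relevant text | text contains pattern_i) = rel-freq_i / total-freq_i
- rel-freq_i: occurrences of pattern_i in relevant texts; total-freq_i: occurrences of pattern_i in all texts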
Rank patterns in order of importance
- score_i = relevance rate_i * log2(freq_i), keeping only patterns with relevance rate > 0.5
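The relevance rate and ranking score can be sketched in a few lines. This is a minimal example under stated assumptions: the counts are invented, the function names are hypothetical, and freq_i in the log term is read as the pattern's total frequency, which the slide does not specify.

```python
import math
from typing import Dict, List, Tuple


def relevance_rate(rel_freq: int, total_freq: int) -> float:
    """Pr(relevant text | text contains pattern) = rel-freq / total-freq."""
    return rel_freq / total_freq


def rank_patterns(counts: Dict[str, Tuple[int, int]]) -> List[Tuple[str, float]]:
    """Score patterns by relevance rate * log2(freq) and sort best-first.

    counts maps pattern -> (rel_freq, total_freq).
    Patterns with relevance rate <= 0.5 are dropped.
    Assumption: freq in the log term is total_freq.
    """
    scored = []
    for pattern, (rel_freq, total_freq) in counts.items():
        rr = relevance_rate(rel_freq, total_freq)
        if rr > 0.5:
            scored.append((pattern, rr * math.log2(total_freq)))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Invented counts, for illustration only: (rel_freq, total_freq).
counts = {
    "<subj> was murdered": (48, 50),
    "bombed <dobj>": (30, 40),
    "<subj> said": (200, 500),       # frequent but not domain-specific
    "<subj> was kidnapped": (8, 8),
}
ranked = rank_patterns(counts)
for pattern, score in ranked:
    print(f"{score:6.3f}  {pattern}")
```

The log term rewards patterns that occur often, while the 0.5 threshold removes patterns such as "<subj> said" that are frequent but appear mostly in irrelevant texts.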
A human judge reviews the top-ranked patterns
Apply all patterns to each text