Faculty Publications

Examining temporality in document classification

Xiaolei Huang, University of Colorado Boulder
Michael J. Paul, University of Colorado Boulder

Abstract

Many corpora span broad periods of time. Language processing models trained during one time period may not work well in future time periods, and the best model may depend on specific times of year (e.g., people might describe hotels differently in reviews during the winter versus the summer). This study investigates how document classifiers trained on documents from certain time intervals perform on documents from other time intervals, considering both seasonal intervals (intervals that repeat across years, e.g., winter) and non-seasonal intervals (e.g., specific years). We show experimentally that classification performance varies over time, and that performance can be improved by using a standard domain adaptation approach to adjust for changes in time.
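
The abstract does not name the domain adaptation approach it uses. As a minimal sketch, assuming feature augmentation (each feature is duplicated into a shared copy and an interval-specific copy) and treating seasons as the domains, the idea might look like the following. The function names `season_of` and `augment`, the month-to-season mapping, and the example tokens are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter


def season_of(month: int) -> str:
    """Map a month number (1-12) to a season label.

    Assumption: northern-hemisphere meteorological seasons.
    """
    if not 1 <= month <= 12:
        raise ValueError(f"month must be in 1..12, got {month}")
    if month in (12, 1, 2):
        return "winter"
    if month in (3, 4, 5):
        return "spring"
    if month in (6, 7, 8):
        return "summer"
    return "fall"


def augment(tokens, domain):
    """Build bag-of-words features with a shared and a domain-specific copy.

    Every token count is stored twice: once under a "general:" prefix that
    all time intervals share, and once under a prefix naming the interval.
    """
    counts = Counter(tokens)
    features = {}
    for token, count in counts.items():
        features[f"general:{token}"] = count
        features[f"{domain}:{token}"] = count
    return features


# Hypothetical review written in January, so its domain is "winter".
example = augment("the pool was cold".split(), season_of(1))
```

A linear classifier trained on such features can, in principle, put weight on the shared copy for words whose meaning is stable over time and on the interval-specific copy for words whose association with the label shifts between intervals.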

Publication Title

ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)

Recommended Citation

Huang, X., & Paul, M. (2018). Examining temporality in document classification. ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers), 2, 694-699. https://doi.org/10.18653/v1/p18-2110
